Over the weekend, press reports exposed a significant data leakage containing the records of 533 million Facebook users. The records were posted for free on multiple cybercriminal forums. The incident exposed users’ personal information, including phone numbers, email addresses, full names, occupations, and birth dates. Much of this information was likely scraped from public Facebook profiles, as Facebook alluded to in a statement made on 06 Apr 2021. However, the leakage also included data that users had not made public, such as their phone numbers. In this blog, we dive into what happened, how the information was exposed, who has claimed responsibility for the attack, and the risks to affected users.

What is the Facebook Data Leak?

The initial incident started in mid-to-late 2019. It is believed that threat actors scraped Facebook’s website to acquire the information of millions of users. Web scraping refers to the process of using automated scripts or bots to harvest public information from sites, such as any information users make publicly available on their profiles (Names, City, Education, etc.).

Scraping is not a new technique, and it occurs daily. Cybercriminals frequently scrape sites such as Facebook, Twitter, Reddit, and many others. They can use the extracted data for a variety of purposes, including spamming, information gathering, and social engineering attacks. They can also sell scraped data for a profit to other cybercriminals, marketing companies, or call centers.

Figure 1: Raidforums user advertising scraped Instagram database

As previously mentioned, data scraped from sites is usually public data. If users set their emails, names, and locations to be public, then that data can be viewed and harvested by virtually anyone. However, the Facebook exposure was not a typical scraping incident. Threat actors were able to harvest users’ phone numbers even when the users had set their number to private on their Facebook profiles. Facebook stated that it believed cybercriminals accomplished this by exploiting Facebook’s “contact importer” feature, which allows users to find other users by their phone numbers.

This feature could have been exploited by uploading large sets of phone numbers and identifying which Facebook profiles matched the numbers. Facebook stated that this feature was fixed in September 2019, following the discovery that threat actors were abusing the feature. However, while Facebook fixed the feature in 2019, the phone numbers of 533 million users had already been harvested by malicious individuals, along with other identifying information on users.
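To make the mechanism concrete, here is a minimal toy sketch of why a phone-number lookup can leak numbers that are hidden on profiles. It is not Facebook's implementation: the `Profile` class, the `DIRECTORY` list, and the `import_contacts` function are all hypothetical, the data is made up, and the assumption that the lookup ignores the profile's visibility setting is ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    user_id: int
    name: str
    phone: str
    phone_is_public: bool


# Hypothetical in-memory directory; every value here is made up.
DIRECTORY = [
    Profile(1, "Alice Example", "+15550100", phone_is_public=True),
    Profile(2, "Bob Example", "+15550101", phone_is_public=False),
    Profile(3, "Carol Example", "+15550103", phone_is_public=False),
]


def import_contacts(uploaded_numbers):
    """Toy lookup: match uploaded numbers to profiles.

    Assumption for illustration: the match ignores phone_is_public,
    so hidden numbers are still linked to a name.
    """
    by_phone = {p.phone: p for p in DIRECTORY}
    return {n: by_phone[n].name for n in uploaded_numbers if n in by_phone}


# Uploading a whole block of candidate numbers reveals which ones have
# profiles, including profiles whose number is set to private.
candidates = [f"+1555010{i}" for i in range(10)]
matches = import_contacts(candidates)
```

In this toy model, `matches` links Bob's and Carol's numbers to their names even though neither is public on the profile. That is the kind of outcome the paragraph above describes at a much larger scale.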

How Was the Facebook Data Distributed in the Cybercriminal World?

Initially, attackers offered the data at quite a steep price. As the data began circulating in open and gated cybercriminal forums in 2020, a listing on Russian-speaking cybercriminal forum XSS in August 2020 advertised the sale of this data for “only” USD 25,000 (see Figure 2). Listings were identified across several other forums, such as Raidforums. The sheer size of the data leakage and the wide geography it covered (106 countries) made the data a gold mine for cybercriminals. Therefore, these listings often caught the interest of multiple threat actors.

Figure 2: XSS user advertises Facebook leak in August 2020

The XSS user who initially shared the data was allegedly responsible for the attack. When other forum members questioned the origin of the breached data, the original poster claimed that they had exploited a zero-day vulnerability on Facebook’s website. This vulnerability allegedly allowed the threat actor to grab users’ data from their Facebook ID (see Figures 3-4). The user also stated that the data extracted dated from 01 Jan 2020, as Facebook had patched the vulnerability by then. The user did not provide further information.

Figure 3: XSS user claims that they exploited a vulnerability to crawl Facebook users

Figure 4: XSS user provides more information on how they claim to have acquired the Facebook data leak

Cybercriminals often purchase data to resell it to other cybercriminals for a profit, with the price of the breached data dropping with each transaction. From 2019 to 2021, the data likely changed hands multiple times, an activity frequently observed in cybercriminal forums. Eventually, breached data becomes devalued, and users release it for free to gain reputation or notoriety within a cybercriminal forum. This is most likely what happened with the Facebook breach. On 03 April 2021, a user on the English-speaking cybercriminal forum Raidforums uploaded the entire Facebook dataset for a negligible cost of eight forum tokens (approximately USD 2.52).

Figure 5: Raidforums user exposes the Facebook data leak for free

Within 5 days, more than 4,800 forum members had unlocked the data with their tokens; the thread received over 1,000 replies and 200,000 views, making it one of the most viewed threads on the criminal forum. The data was an instant success within the cybercriminal community. The data leakage and free download links have since been reposted across multiple deep and dark web forums. The data can now be easily acquired by any cybercriminals who wish to use it.

What Data was Included in the Breach?

Virtually every individual included in the data leakage had their phone number exposed, including Mark Zuckerberg himself and other founding members of Facebook. Beyond the phone number, the extent of the exposure likely depended on how much information users had left public on their profiles: any data that was public on the affected Facebook profiles was likely harvested. The dataset typically included the victim’s full name, location, phone number, Facebook ID, employer, and birth date.

Figure 6: Mark Zuckerberg’s data exposed in the Facebook leak (phone number censored)

Email addresses were also a high-value, sensitive piece of personal data exposed in this leakage. However, not all records contained email addresses; security researchers believed that only accounts that had made their email addresses public in 2019 were affected. ReliaQuest identified more than 122 million email addresses listed in the data leak. Most of these were Facebook.com addresses in the format Facebook_ID@Facebook.com, which were likely used for Facebook messages and were not users’ personal email addresses. Removing them gives a more realistic count of the email addresses exposed in the breach, distributed as follows.

| Email category | Addresses exposed |
| --- | --- |
| Total emails exposed (excluding Facebook.com emails) | 3,300,747 |
| .com emails | 2,602,626 |
| .edu emails | 5,997 |
| .org emails | 3,428 |
| .gov emails | 514 |
| Others (.de, .net, .fr, .co.uk, .ru, etc.) | 688,182 |

Table 1: Number of email addresses exposed in the Facebook data leak
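The breakdown in Table 1 can be approximated with a simple tally by top-level domain. The sketch below is illustrative only and is not the tooling ReliaQuest used: the function name `tally_email_tlds`, the sample addresses, and the bucketing rules are all assumptions. It also checks that the per-category figures in Table 1 add up to the stated total.

```python
from collections import Counter

# Buckets mirror the rows of Table 1; every other TLD falls into "other".
TRACKED_TLDS = ("com", "edu", "org", "gov")


def tally_email_tlds(emails):
    """Count addresses per top-level domain, skipping facebook.com addresses."""
    counts = Counter()
    for raw in emails:
        address = raw.strip().lower()
        local, sep, domain = address.rpartition("@")
        if not sep or not local or "." not in domain:
            continue  # not a usable address
        if domain == "facebook.com":
            continue  # platform-issued address, not a personal one
        tld = domain.rsplit(".", 1)[1]
        counts[tld if tld in TRACKED_TLDS else "other"] += 1
    return counts


# Made-up sample addresses for illustration.
sample = [
    "100000000000001@facebook.com",
    "alice@example.com",
    "bob@university.edu",
    "carol@example.org",
    "dave@agency.gov",
    "erin@example.co.uk",
    "not-an-email",
]
counts = tally_email_tlds(sample)

# Consistency check on the published figures in Table 1.
table_1 = {"com": 2_602_626, "edu": 5_997, "org": 3_428, "gov": 514, "other": 688_182}
table_1_total = sum(table_1.values())
```

The five category counts in Table 1 sum to 3,300,747, which matches the total reported in its first row.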

What is Your Risk as a Facebook User?

If you believe that your email address or phone number was affected by the breach, you can check whether your data was exposed with the service HaveIBeenZucked.

Fear not! The leaked data included no passwords, and it is unlikely that cybercriminals can use the information by itself to hack into your accounts. However, users whose data was exposed should be wary of suspicious and unsolicited emails, phone calls, and messages from unknown sources. Considering the high interest that this leakage has attracted within cybercriminal communities, it is highly likely that criminals will attempt to use the data to launch social engineering attacks or spam users with unwanted messages. Call centers may also use this data to launch vishing (voice phishing) attacks on unsuspecting victims.

While this data may be “old,” it is likely that the information has remained unchanged for most users. After all, individuals do not usually change their phone number and email address every year or two. High-profile Facebook users, such as politicians, company executives, and public figures, are most likely to be targeted by attacks, but all affected users should proceed with care.

Data leakages such as this one are common, and if your information wasn’t affected by this leakage, it might have been exposed in other incidents. As security experts often say, it is not a matter of if data has been exposed, but when. Therefore, it is crucial for users to always be cautious and exercise security best practices wherever possible.

Annex A

The list of countries affected, along with the number of records exposed:

| Country | Records exposed |
| --- | --- |
| Egypt | 45,183,147 |
| Italy | 35,677,337 |
| USA | 32,315,291 |
| Saudi Arabia | 28,804,686 |
| France | 19,848,557 |
| Turkey | 19,638,821 |
| Morocco | 19,147,770 |
| Colombia | 17,957,906 |
| Iraq | 17,116,398 |
| South Africa | 14,323,766 |
| Mexico | 13,330,561 |
| Malaysia | 11,675,893 |
| United Kingdom | 11,522,328 |
| Algeria | 11,505,898 |
| Spain | 10,894,206 |
| Russia | 9,996,405 |
| Sudan | 9,464,722 |
| Nigeria | 9,000,127 |
| Peru | 8,075,316 |
| Brazil | 8,064,915 |
| Australia | 7,320,478 |
| UAE | 6,978,927 |
| Syria | 6,939,528 |
| Chile | 6,889,082 |
| Tunisia | 6,247,880 |
| India | 6,162,449 |
| Germany | 6,054,422 |
| Netherlands | 5,430,387 |
| Oman | 5,048,532 |
| Yemen | 4,617,359 |
| Kuwait | 4,502,021 |
| Libya | 4,204,514 |
| Israel | 3,956,428 |
| Bangladesh | 3,816,531 |
| Canada | 3,494,385 |
| Palestine | 3,367,570 |
| Kazakhstan | 3,214,290 |
| Belgium | 3,183,540 |
| Jordan | 3,105,988 |
| Singapore | 3,073,009 |
| Iran | 3,057,522 |
| Bolivia | 2,959,209 |
| Hong Kong | 2,937,841 |
| Qatar | 2,789,724 |
| Poland | 2,669,381 |
| Argentina | 2,339,557 |
| Portugal | 2,227,361 |
| Cameroon | 1,997,658 |
| Lebanon | 1,829,661 |
| Guatemala | 1,645,068 |
| Switzerland | 1,592,039 |
| Uruguay | 1,509,317 |
| Panama | 1,502,310 |
| Costa Rica | 1,464,002 |
| Ireland | 1,449,921 |
| Bahrain | 1,424,219 |
| Finland | 1,381,569 |
| Czech Republic | 1,375,988 |
| Austria | 1,249,388 |
| Sweden | 1,092,140 |
| Ghana | 1,027,969 |
| Philippines | 889,629 |
| Mauritius | 848,558 |
| Taiwan | 734,807 |
| China | 670,334 |
| Croatia | 659,115 |
| Denmark | 639,841 |
| Greece | 617,722 |
| Afghanistan | 558,393 |
| Angola | 508,903 |
| Albania | 506,602 |
| Norway | 475,809 |
| Bulgaria | 432,473 |
| Japan | 428,615 |
| Macao | 414,284 |
| Namibia | 409,356 |
| Jamaica | 385,890 |
| Hungary | 377,045 |
| Ecuador | 318,824 |
| Botswana | 240,632 |
| Slovenia | 229,039 |
| Lithuania | 220,160 |
| Brunei | 213,798 |
| Luxembourg | 188,201 |
| Serbia | 162,898 |
| Puerto Rico | 138,183 |
| Indonesia | 130,321 |
| South Korea | 121,744 |
| Cyprus | 119,022 |
| Malta | 115,367 |
| Azerbaijan | 99,472 |
| Georgia | 95,193 |
| Estonia | 87,533 |
| Maldives | 86,337 |
| Moldova | 46,237 |
| Iceland | 31,343 |
| Honduras | 16,142 |
| Burundi | 15,709 |
| Haiti | 15,407 |
| Djibouti | 14,327 |
| Ethiopia | 12,752 |
| Burkina Faso | 6,413 |
| Fiji | 5,364 |
| El Salvador | 4,479 |
| Cambodia | 2,838 |

Table 2: Records exposed per country (ordered from largest to smallest)