Last month, it was revealed that a vulnerability in Facebook had allowed the data of 530 million users to be scraped and posted on a hacking forum. Although the company acknowledged the issue, it downplayed its severity and refused to notify those affected.
It is important to note that authorized scraping by web crawlers is allowed, but using other automation tools to scrape data in violation of Facebook's terms of service is not. The company says that any platform with a public endpoint, such as Facebook's website, is a potential target for scraping, and that while this threat cannot be removed entirely, it can be mitigated.
To that end, Facebook has an External Data Misuse team of 100 people that detects and blocks malicious scraping patterns on a continuous basis. Billions of suspicious actions are blocked each day across Facebook and Instagram. The company also enforces technical limits that restrict the amount of data a single person can fetch from a service in a given timeframe. Public Facebook datasets containing user data are secured. Although there is no guarantee that unauthorized datasets can be removed entirely or that the offenders can be caught, the firm says that it does everything within its capacity to make both of these things possible. In the past year, Facebook has taken 300 legal enforcement actions, including lawsuits, cease-and-desist notices, and disablement of offending accounts.
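Facebook has not described how the technical limits mentioned above are implemented. Purely as an illustration, a limit on how much a single account can fetch in a given timeframe is commonly sketched as a sliding-window counter. The Python below is a minimal sketch under that assumption; the class name, parameters, account identifier, and thresholds are all hypothetical and are not Facebook's.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `max_requests` per client per window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client_id -> timestamps of allowed requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject without recording the request
        hits.append(now)
        return True


# Illustrative values only: 3 requests per 60 seconds for one account.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
print([limiter.allow("account-123") for _ in range(5)])
# prints [True, True, True, False, False]
```

A limiter like this slows bulk fetching from any one account, but it does not by itself stop scraping spread across many accounts, which is consistent with the company's point that the threat can be mitigated rather than eliminated.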
As an example of its ongoing commitment to preventing data misuse and unauthorized scraping, the firm highlighted one scraping technique that used to be quite problematic, namely "phone number enumeration". With this technique, scrapers used automation tools at scale to fetch information about users through the phone numbers associated with their Facebook accounts. Until Facebook fixed its contact importer feature in September 2019, attackers would use it to extract information about users in a particular geography, then aggregate it and store it in external databases for malicious purposes.
That said, Facebook has indicated that the fight against illegal scraping is an adversarial one in which both sides keep updating their tools, so the landscape changes constantly. Facebook's main priority is to prevent unauthorized scraping at scale while still allowing its users to use the platform's features and connect with others.