Biggest social media data leak | 235 million affected.


A recent database leak exposed the profile data of 235 million TikTok, Instagram and YouTube user accounts. The data appears to have been collected through a method called "web scraping", in which a company visits the web interface of a service and automatically collects and organizes the data it finds there. This is different from hacking, which involves breaking into a system to access data that should not be publicly accessible. Web scraping only accesses public data.
For example, an automated system can visit a number of YouTube channels, collecting the username, photo and follower count of each channel owner. An entire database of such records becomes a privacy issue even though the individual data points are public.
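
As a rough illustration of the idea, the sketch below parses a made-up profile page with Python's standard-library HTML parser and pulls out a username, photo URL and follower count. The HTML snippet, the class names (`username`, `avatar`, `followers`) and the `ProfileParser` class are all hypothetical; real sites use different markup, and their terms of service may prohibit automated collection.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a public channel page.
SAMPLE_PAGE = """
<div class="channel">
  <span class="username">example_channel</span>
  <img class="avatar" src="https://example.com/avatar.png">
  <span class="followers">12345</span>
</div>
"""


class ProfileParser(HTMLParser):
    """Collects the text of selected <span> elements and the avatar URL."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current_field = None  # field whose text we are waiting for

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class")
        if tag == "span" and css_class in ("username", "followers"):
            self._current_field = css_class
        elif tag == "img" and css_class == "avatar":
            self.record["photo"] = attrs.get("src")

    def handle_data(self, data):
        if self._current_field:
            self.record[self._current_field] = data.strip()
            self._current_field = None

    def handle_endtag(self, tag):
        # Stop waiting for text if a <span> closed without containing any.
        if tag == "span":
            self._current_field = None


def parse_profile(html):
    """Return a dict with whichever of the three fields the page contains."""
    parser = ProfileParser()
    parser.feed(html)
    parser.close()
    record = parser.record
    if "followers" in record:
        record["followers"] = int(record["followers"])
    return record


print(parse_profile(SAMPLE_PAGE))
```

Run over millions of pages, the same few lines of extraction logic are what turn scattered public profiles into a single searchable database.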


What happens to leaked data?

Once data has been collected into a database, it would normally be expected to be protected. But TNW reports that a database of 235 million records was found on the web without password protection.
The database comprised four main datasets with details of millions of users from the above-mentioned platforms. It contained information such as profile name, first and last name, profile picture, age, gender and follower statistics.

Now, what’s interesting is that, according to the report, security researcher Bob Diachenko, principal investigator at security firm Comparitech, found three identical copies of the database on Aug. 1. According to Diachenko and his team, the data belonged to the now-defunct company Deep Social.

When they contacted the company, the request was forwarded to Hong Kong-based Social Data, which acknowledged the exposure and closed access to the database. However, Social Data denied having any ties to Deep Social. Social Data made the following statement.

“Please note that the negative meaning of data being hacked means that the information was obtained secretly. This is not the case. Anyone with Internet access can use all data for free.”

Comparitech says that each record contains some or all of the following:

Profile name

Full name

Facial photo

Account description

Whether the profile belongs to a business or has advertisements

Statistics about follower engagement, including:

Number of followers

Engagement rate

Follower growth rate

Audience gender

Audience age

Audience location
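
Because each record holds only "some or all" of these fields, one way to picture a record is as a structure in which every field is optional. The sketch below is purely illustrative: the `ProfileRecord` class and its field names are assumptions made for this example, not the actual schema of the leaked database.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProfileRecord:
    # Hypothetical field names loosely mirroring the list above.
    profile_name: Optional[str] = None
    full_name: Optional[str] = None
    photo_url: Optional[str] = None
    description: Optional[str] = None
    follower_count: Optional[int] = None
    follower_growth_rate: Optional[float] = None
    audience_gender: Optional[str] = None
    audience_age: Optional[str] = None
    audience_location: Optional[str] = None


def present_fields(record):
    """Return the names of the fields that actually hold a value."""
    return [f.name for f in fields(record) if getattr(record, f.name) is not None]


example = ProfileRecord(profile_name="example_channel", follower_count=12345)
print(present_fields(example))
```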




In addition, approximately 20% of the sampled records contain phone numbers or email addresses. As TNW stated, this type of data can be used for spam or phishing attempts. The terms and conditions of these services usually prohibit scraping, but a California court ruled last year that scraping publicly accessible data is legal. In many cases, this may be a good thing.

For example, CityMapper is a very popular application that uses real-time traffic and public transportation data to work out the fastest way to get from A to B in a city. Today, most transit companies provide this data through APIs, but in the early days it was only available on the web. Early pioneers like CityMapper used web scraping to provide a convenient way of making that data more usable.

Nowadays, when companies put useful data on the web but do not offer it through APIs, web scraping is still useful. For example, price comparison services often still rely on it.
However, harvesting personal data is another matter, and the courts may need to distinguish between the two types of use.

