TikTok's excessive data harvesting program

TikTok used in Australia makes connections to Chinese servers, despite assurances to the contrary

By Internet 2.0

  • TikTok used in Australia makes connections to Chinese servers, despite assurances to the contrary.
  • TikTok checks the user's GPS location hourly.
  • TikTok reads the clipboard and can determine all other apps installed on the phone.
  • TikTok reads the user's calendar and continuously asks for access to the user's contacts if initially denied.

Executive Summary

This report is a technical analysis of the TikTok mobile applications for Android (version 25.1.3) and iOS (version 25.1.1). Analysis of the Android application was performed on a Galaxy S9 phone. Internet 2.0 conducted static and dynamic analysis of the source code between 1 and 12 July 2022. The report analyses TikTok's collection of device and user (customer) data and is intended to help policy makers and legislators make evidence-based decisions. TikTok is a dominant social media application: it is the 6th most used application globally, with advertising revenue forecast to reach USD12 billion in 2022. In our analysis, the TikTok mobile application does not prioritise privacy. Its permissions and device information collection are overly intrusive and not necessary for the application to function. The following are examples of excessive data harvesting.

  • Device Mapping: The application retrieves all other running applications on the phone. TikTok also gathers all applications that are installed on the phone. In theory this information can provide a realistic diagram of your phone.
  • Location: TikTok checks the device location at least once per hour.
  • Calendar: TikTok has persistent access to the calendar.
  • Contacts: TikTok requests access to contacts and, if the user denies it, keeps requesting access until the user grants it.
  • Device information: TikTok has code that collects the following detailed device information on Android:
      • Wi-Fi SSID
      • Device build serial number
      • SIM serial number
      • Integrated Circuit Card Identifier (ICCID; a globally unique serial number tied to your SIM card)
      • Device IMEI
      • Device MAC address
      • Device line number
      • Device voicemail number
      • GPS status information (updates on the GPS location)
      • Active subscription information
      • All accounts on the device
      • Complete access to read the clipboard (dangerous as password managers use clipboards)

Also of note, TikTok iOS 25.1.1 has a server connection to mainland China. The connection is run by Guizhou Baishan Cloud Technology Co., Ltd, a top 100 Chinese cyber security and data company.

Introduction

TikTok is currently one of the dominant social media applications in the market. It is the 6th most used application. As of September 2021 TikTok had over 1 billion active users globally, with 142.2 million users in North America.[1] It had been downloaded over 3.5 billion times as of January 2021; 43.7% of users are 18 to 24 years old and 31.9% are 25 to 34 years old. TikTok's annual advertising revenue is projected to reach USD12 billion in 2022, up from USD1.41 billion in 2020.[2]

Figure 1: Projected TikTok advertising revenue (see footnote 2)

Internet 2.0 conducted static and dynamic analysis of the TikTok mobile application Android 25.1.3, as well as static analysis of the TikTok mobile application iOS 25.1.1, to understand user and device data collection.[3] The analysis also sought to determine whether the application contains any malicious code or features. We decompiled the application as distributed on the app stores and analysed the resulting source code through multiple systems (including multiple sandbox services) and manual source code review. This report is divided into the following sections: user permissions and third-party data access; device and user data harvesting; and conclusion.

User Permissions and Third-Party Data Access

The Android documentation classifies certain permissions as “dangerous” because they give an application additional access to restricted data. For example, the ability to read all SMS messages is considered dangerous because an application could send all your texts to a server and save them for future use, as malware might. Unfortunately, TikTok makes use of many of these dangerous permissions. We noted that the Android version requests many more than the iOS version. iOS has a justification system under which a developer must justify why a permission is required before it is granted. We believe this justification system systematically limits a “grab what you can” culture in data harvesting. The fact that TikTok holds far more permissions on Android than on iOS is a good demonstration of its culture when it comes to privacy.

Figure 2a: TikTok Android access permissions rated as dangerous.

Figure 2b: TikTok iOS access permissions rated as dangerous.
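
To make the permission review above more concrete, the following is a minimal Python sketch and not the tooling used for this report. It assumes the manifest has already been decoded to plain XML (for example by a decompiler), it uses a hand-picked and incomplete subset of the permissions Android documents as dangerous, and the sample manifest belongs to a made-up application rather than TikTok.

```python
import xml.etree.ElementTree as ET

# Namespace Android uses for attributes such as android:name.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Illustrative, incomplete subset of permissions Android classifies as dangerous.
DANGEROUS = {
    "android.permission.READ_CONTACTS",
    "android.permission.READ_CALENDAR",
    "android.permission.WRITE_CALENDAR",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.ACCESS_COARSE_LOCATION",
    "android.permission.READ_EXTERNAL_STORAGE",
    "android.permission.WRITE_EXTERNAL_STORAGE",
    "android.permission.READ_PHONE_STATE",
    "android.permission.GET_ACCOUNTS",
    "android.permission.CAMERA",
    "android.permission.RECORD_AUDIO",
    "android.permission.READ_SMS",
}


def dangerous_permissions(manifest_xml: str) -> list[str]:
    """Return the requested permissions that appear in DANGEROUS, sorted."""
    root = ET.fromstring(manifest_xml)
    requested = {
        elem.get(f"{{{ANDROID_NS}}}name")
        for elem in root.iter("uses-permission")
    }
    return sorted(p for p in requested if p in DANGEROUS)


# Hypothetical manifest for illustration only; not taken from TikTok.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.hypothetical">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.READ_CONTACTS"/>
  <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
</manifest>"""

if __name__ == "__main__":
    for name in dangerous_permissions(SAMPLE_MANIFEST):
        print(name)
```

A list like this only shows what an application declares. Whether and how each permission is used still has to be established from the code and from runtime behaviour.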

Device mapping

The Android application collects a list of all other running and installed applications on the phone, which is not necessary for the application to function (see Figure 3). In theory, this information can provide a realistic diagram of your phone.

Figure 3: Get all applications and running tasks on the device (green highlight).

GPS and location requests

The Android application queries the device GPS location at least once per hour while running. The relevant code is shown in Figure 4 and Figure 5.

Figure 4: Get location code.

Figure 5: TikTok get longitude and latitude data requests.
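
One way a claim such as "at least once per hour" could be checked during dynamic analysis is to record when each location request is observed and then measure the longest gap between consecutive requests. The Python sketch below shows that calculation only. The timestamps are hypothetical and are not measurements from this report, and the one-minute tolerance is an assumption added to absorb logging jitter.

```python
from datetime import datetime, timedelta


def max_gap(timestamps: list[datetime]) -> timedelta:
    """Return the longest interval between consecutive observations."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        raise ValueError("need at least two observations")
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def polled_at_least_hourly(
    timestamps: list[datetime],
    tolerance: timedelta = timedelta(minutes=1),
) -> bool:
    """True if no gap between observed requests exceeds one hour plus tolerance."""
    return max_gap(timestamps) <= timedelta(hours=1) + tolerance


# Hypothetical observation times for illustration only.
OBSERVED = [
    datetime(2022, 7, 1, 9, 0),
    datetime(2022, 7, 1, 9, 58),
    datetime(2022, 7, 1, 10, 57),
    datetime(2022, 7, 1, 11, 55),
]

if __name__ == "__main__":
    print(max_gap(OBSERVED), polled_at_least_hourly(OBSERVED))
```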

Contacts

The Android application requests access to user contacts. If the user denies access, the application will continuously ask for it. It does this by running its code in a loop: while a stored Boolean (true or false) value is false, it keeps prompting until it is given a true value (see Figure 6). It is normal for an application to initially request access to contacts, but TikTok’s persistent, endless harassment for access to user contacts is abnormal. It reflects a culture that does not prioritise privacy or a user’s privacy preferences.

Figure 6: The source code for Contacts information.

Figure 7: TikTok Contacts access request prompts while in application.
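
The difference between asking once and prompting in a loop can be illustrated with a small Python simulation. This is a sketch of the general pattern described above, not a reconstruction of TikTok's code. The function names, the scripted user and the `max_prompts` safety cap are all assumptions made for the example.

```python
from typing import Callable


def ask_once(ask: Callable[[], bool]) -> tuple[bool, int]:
    """Ask a single time and respect the answer. Returns (granted, prompts)."""
    return ask(), 1


def ask_until_granted(
    ask: Callable[[], bool], max_prompts: int = 1000
) -> tuple[bool, int]:
    """Keep prompting while the stored flag is False.

    max_prompts exists only so this simulation terminates; the pattern
    described in the report has no such cap.
    """
    granted = False
    prompts = 0
    while not granted and prompts < max_prompts:
        granted = ask()
        prompts += 1
    return granted, prompts


def scripted_user(answers: list[bool]) -> Callable[[], bool]:
    """Hypothetical user who gives the listed answers in order."""
    remaining = iter(answers)
    return lambda: next(remaining)


if __name__ == "__main__":
    print(ask_once(scripted_user([False, False, True])))
    print(ask_until_granted(scripted_user([False, False, True])))
```

In the first call the denial is final after one prompt. In the second, the same user is prompted again until they give in on the third request.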

Calendar

The Android application has persistent access to read and modify the calendar (see Figure 8). Based on our analysis, TikTok only uses the calendar in special circumstances, for example when there is a TikTok LIVE event. In our opinion, persistent access to the calendar is excessive.

Figure 8: Persistent calendar access.

External Storage

The TikTok Android application requests access to external storage. This is standard for a social media application, which needs to store video and images. What we consider excessive is that TikTok does not just obtain the ability to see folders: it retrieves a list of everything in the external storage folder where the application is able to place files (see Figure 9).

Figure 9: List everything in external storage.
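
The distinction drawn above, between having access to a storage location and enumerating everything in it, can be sketched in a few lines of Python. This is an illustration of the two behaviours on an ordinary file system, not Android code and not TikTok's implementation. The folder and file names are invented and are created in a temporary directory.

```python
import os
import tempfile


def can_read(directory: str) -> bool:
    """Minimal check: is the directory readable at all?"""
    return os.access(directory, os.R_OK)


def list_everything(directory: str) -> list[str]:
    """Enumerate every file under the directory, relative to its root."""
    found = []
    for current, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(current, name)
            found.append(os.path.relpath(full, directory))
    return sorted(found)


def demo() -> list[str]:
    """Build a throwaway folder tree with invented files and enumerate it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "DCIM"))
        os.makedirs(os.path.join(root, "Download"))
        for rel in ("DCIM/holiday.jpg", "Download/statement.pdf"):
            with open(os.path.join(root, *rel.split("/")), "w") as handle:
                handle.write("placeholder")
        assert can_read(root)
        # Normalise separators so the result looks the same on every platform.
        return [p.replace(os.sep, "/") for p in list_everything(root)]


if __name__ == "__main__":
    print(demo())
```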

Device and user data harvesting

Device Data

TikTok also has the potential to harvest an excessive amount of data about the device. It is important to note that, due to limitations with dynamic analysis, it is not currently possible to determine whether any of this data is ever taken from the device. However, the Android application has code that can gather the following additional device details (see Figure 10 through Figure 12).

  • Wi-Fi SSID
  • Device build serial number
  • SIM serial number
  • Integrated Circuit Card Identifier (ICCID; a globally unique serial number tied to your SIM card)
  • Device IMEI
  • Device MAC address
  • Device line number
  • Device voicemail number
  • GPS status information (updates on the GPS location)
  • Active subscription information
  • All accounts on the device
  • Complete access to read the clipboard (dangerous as password managers use clipboards)

Figure 10: TikTok Data harvest image.

Figure 11: TikTok Data harvest image.

Figure 12: TikTok Data harvest image.
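
A static review of this kind often starts by searching decompiled source for calls that can return the identifiers listed above. The Python sketch below shows one simple way to do that and is not the method used for this report. The method names are real Android SDK methods chosen here as plausible indicators, not identifiers taken from TikTok's decompiled code, and the sample source lines are invented.

```python
import re

# Android SDK method names mapped to the data category they can expose.
# The selection is an assumption made for this example.
API_CATEGORIES = {
    "getSSID": "Wi-Fi SSID",
    "getSerial": "Device build serial number",
    "getSimSerialNumber": "SIM serial number (ICCID)",
    "getImei": "Device IMEI",
    "getDeviceId": "Device IMEI",
    "getMacAddress": "Device MAC address",
    "getLine1Number": "Device line number",
    "getVoiceMailNumber": "Device voicemail number",
    "getGpsStatus": "GPS status information",
    "getActiveSubscriptionInfoList": "Active subscription information",
    "getAccounts": "All accounts on the device",
    "getPrimaryClip": "Clipboard contents",
}

# Match ".methodName(" so that longer names such as getAccountsByType are skipped.
CALL_PATTERN = re.compile(
    r"\.(" + "|".join(map(re.escape, API_CATEGORIES)) + r")\s*\("
)


def scan(source: str) -> dict[str, list[int]]:
    """Map each data category to the 1-based line numbers where a call appears."""
    hits: dict[str, list[int]] = {}
    for number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            hits.setdefault(API_CATEGORIES[match.group(1)], []).append(number)
    return hits


# Invented source lines for illustration only.
SAMPLE = "\n".join([
    "String ssid = wifiManager.getConnectionInfo().getSSID();",
    "String line = telephonyManager.getLine1Number();",
    "ClipData clip = clipboardManager.getPrimaryClip();",
])

if __name__ == "__main__":
    for category, lines in scan(SAMPLE).items():
        print(category, lines)
```

A hit only shows that the code is present. As noted above, it does not show that the call is ever executed or that the result leaves the device.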

Of note: Joe Sandbox rated the Android application as malicious in the Spyware and Evader categories, as seen in Figure 13, because of the application's device and user data collection and the evasive techniques it uses to block analysis. Many applications now include anti-sandbox commands to inhibit automatic analysis; the sandbox identifies these and places the application in the Evader category.

Figure 13: TikTok rating as per Joe Sandbox (https://www.joesandbox.com/).

iOS connects to mainland China

TikTok is specific in its statement that TikTok user data is stored in Singapore and the US. However, we found many subdomains in the iOS application resolving to locations all around the world, including: Sydney, Adelaide and Melbourne (Australia); New York City, Las Vegas, San Francisco, San Jose, Monrovia, Cambridge, Kansas City, Dallas and Mountain View (USA); Utama and Jakarta (Indonesia); Kuala Lumpur (Malaysia); Paris (France); Singapore (Singapore); and Baishan (China). During analysis we could not determine with high confidence the purpose of the China server connection or where user data is stored.

The China server connection is run by 贵州白山云科技股份有限公司 (Guizhou Baishan Cloud Technology Co., Ltd), a cloud and cyber security company. The subdomain behind this connection resolved to multiple locations around the world, including in China. The IP address resolving to China regularly changed locations; however, connectivity to Baishan Guangxi China was visible across a number of different IP addresses over time. This was confirmed using a number of security products and methods, including VirusTotal, Metasploit, SecurityTrails and sandboxing. Interestingly, this company has been rated a top 100 Chinese cyber security company and in 2021 established a joint big data laboratory with Guizhou University.[4]

Of note, only the iOS version had this direct server connection to mainland China. We could not find any direct server connections with mainland China in the Android version of the application.
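
Because the addresses behind a subdomain can change between lookups, observations of this kind are usually built up by resolving the same names repeatedly and accumulating the results. The Python sketch below shows that idea using only the standard library. It is not the tooling used for this report. The `track` helper and the injectable `resolver` parameter are assumptions made for the example, and mapping an IP address to a location needs an external geolocation source that is not shown.

```python
import socket
from typing import Callable


def resolve(hostname: str) -> set[str]:
    """Return the IP addresses a hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def track(
    hostnames: list[str],
    rounds: int,
    resolver: Callable[[str], set[str]] = resolve,
) -> dict[str, set[str]]:
    """Resolve each hostname `rounds` times and collect every address seen."""
    seen: dict[str, set[str]] = {name: set() for name in hostnames}
    for _ in range(rounds):
        for name in hostnames:
            try:
                seen[name] |= resolver(name)
            except OSError:
                # Resolution failures (socket.gaierror is an OSError) are skipped.
                continue
    return seen


if __name__ == "__main__":
    # example.com is a placeholder; substitute the hostnames under review.
    print(track(["example.com"], rounds=2))
```

In practice the rounds would be spread over hours or days and run from several networks, since the answer a resolver returns can depend on where the query comes from.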

Conclusion

Most of the access and device data collection is not required for the TikTok application to function properly. The application can and will run successfully without any of this data being gathered. This leads us to believe that the only reason this information is gathered is data harvesting. It is also notable that the application only needs to ask the user for permission to perform each of these actions once and then follow the user’s preferences. Instead, the application has a culture of persistent access, or of continuously asking the user to reverse a decision. The hourly checking of location is also unnecessary. Finally, device mapping, external storage access, contacts and third-party application data collection give TikTok the ability to reimage the phone in the likeness of the original device.

References

  1. 15 TikTok Statistics Marketers Need To Know (2022)
  2. TikTok and Douyin will account for more than 5% of global digital ad spend this year
  3. This analysis provides impartial advice for users to evaluate the extent to which their data is collected for privacy reasons. It allows policy advisors and legislators to make evidence-based decisions when discussing privacy concerns with vendors. This report was written for a global audience and does not include any legal or jurisdiction based regional assessments.
  4. https://baike.baidu.com/reference/23443686/44e44NXRi0exRZo-8rbRsVSmZl-hjxLfaZVO4j748emXOcfv_uNtLc1yLac09EyZEBSnmwlHmEjKgrSKyJqfjRJXffvnMrZx3fjyd7KgfZXHQTJqcQiSTTzNcYs12v7vcNN_
    https://baike.baidu.com/reference/23443686/cc63DG_6ZWBsyHhiqR45OVCvsMnuyzIROgdcmvvuXilWB48sb7YhfKhpeWv0ZpsePYpHl2EMcS8LdZe2yWIZPp3rLCUtoQfy96e5-_uuvbQ
