Static Analysis of The DeepSeek Android App

I performed a static analysis of DeepSeek, a Chinese LLM chatbot, using version 1.8.0 from the Google Play Store. The goal was to identify potential security and privacy issues.

I've discussed DeepSeek previously here.

Additional security and privacy concerns about DeepSeek have been raised.

See also this analysis by NowSecure of the iPhone version of DeepSeek.

The findings detailed in this report are based purely on static analysis. This means that while the code exists within the app, there is no conclusive proof that all of it is executed in practice. Nonetheless, the presence of such code warrants scrutiny, especially given the growing concerns around data privacy, security, the potential misuse of AI-driven applications, and cyber-espionage dynamics between global powers.

Key Findings

Suspicious Data Handling & Exfiltration

- Hardcoded URLs direct data to external servers, such as ByteDance "volce.com" endpoints, raising concerns about user activity tracking. NowSecure recently identified these in the iPhone app as well.
- Bespoke encryption and data obfuscation techniques are present, with indications that they could be used to exfiltrate user details.
- The app contains hard-coded public keys, rather than relying on the user device's chain of trust.
- UI interaction tracking captures detailed user behavior without clear consent.
- WebView manipulation is present, which could allow the app to access private external browser data when links are opened. More details about the WebView modifications are here.
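As a rough illustration of how hardcoded endpoints like these can be triaged, the sketch below pulls URL literals out of decompiled source text and groups them by hostname. It is not the tooling used for this report; the regular expression, the function name `hosts_from_text`, and the sample hostnames are all assumptions made for the example.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Loose pattern for URL literals in decompiled source (an assumption, not exhaustive).
URL_RE = re.compile(r"https?://[^\s\"'<>\\]+")

def hosts_from_text(text):
    """Map each hostname to the set of hardcoded URLs that mention it."""
    found = defaultdict(set)
    for match in URL_RE.finditer(text):
        # Drop punctuation that commonly trails a URL inside code or comments.
        url = match.group(0).rstrip(".,;)")
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed literal, skip it
        if host:
            found[host].add(url)
    return dict(found)

if __name__ == "__main__":
    # Hypothetical decompiled snippet; the hostnames are placeholders.
    sample = 'String a = "https://api.example.invalid/v1/log"; // see http://cdn.example.invalid/x.'
    for host, urls in sorted(hosts_from_text(sample).items()):
        print(host, sorted(urls))
```

Grouping by host makes it easier to spot third-party endpoints that deserve a closer look, though a string match alone says nothing about whether the URL is ever contacted.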

Device Fingerprinting & Tracking

A significant portion of the analyzed code appears to focus on gathering device-specific information, which can be used for tracking and fingerprinting.

- The app collects various unique device identifiers, including UDID, Android ID, IMEI, IMSI, and carrier details.
- System properties, installed packages, and root detection mechanisms suggest potential anti-tampering measures. For example, the app probes for the presence of Magisk, a tool that privacy advocates and security researchers use to root their Android devices.
- Geolocation and network profiling are present, indicating potential tracking capabilities and the enabling or disabling of fingerprinting routines by region.
- Hardcoded device model lists suggest the application may behave differently depending on the detected hardware.
- Multiple vendor-specific services are used to extract additional device details. For example, if the app cannot identify the device through the standard Android SIM lookup (because permission was not granted), it tries vendor-specific extensions to access the same information.
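A reviewer can get a first impression of this kind of collection by searching decompiled sources for well-known identifier and root-check strings. The sketch below is one hedged way to do that. The indicator list is my own selection of standard Android API names and common root artifacts (for example `com.topjohnwu.magisk`, the Magisk package name); it is not a list of strings extracted from the DeepSeek app, and `classify` is a hypothetical helper.

```python
# Indicator strings chosen for illustration; not taken from the analyzed app.
INDICATORS = {
    "device_id": ["getDeviceId", "getImei", "android_id"],
    "sim": ["getSubscriberId", "getSimSerialNumber", "getNetworkOperatorName"],
    "root_check": ["com.topjohnwu.magisk", "/system/xbin/su", "/sbin/su"],
}

def classify(source):
    """Return {category: [indicators found]} for one decompiled source string."""
    hits = {}
    for category, needles in INDICATORS.items():
        matched = [needle for needle in needles if needle in source]
        if matched:
            hits[category] = matched
    return hits

if __name__ == "__main__":
    # Hypothetical decompiled fragment.
    sample = 'String id = tm.getImei(); boolean rooted = new File("/sbin/su").exists();'
    print(classify(sample))
```

Plain substring matching is deliberately crude. Obfuscated or reflectively constructed names will be missed, so an empty result does not mean the behavior is absent.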

Potential Malware-Like Behavior

While no definitive conclusions can be drawn without dynamic analysis, several observed behaviors align with known spyware and malware patterns:

- The app uses reflection and UI overlays, which could facilitate unauthorized screen capture or phishing attacks.
- SIM card details, serial numbers, and other device-specific data are aggregated for unknown purposes.
- The app implements country-based access restrictions and "risk-device" detection, suggesting possible security mechanisms.
- The app makes calls to load Dex modules, where additional code is loaded from files with a .so extension at runtime.
- The .so files themselves turn around and make additional calls to dlopen(), which can be used to load additional .so files. This facility is not typically checked by Google Play Protect and other static analysis services.
- The .so files can be implemented in native code, such as C++. The use of native code adds a layer of complexity to the analysis process and obscures the full extent of the app's capabilities. Moreover, native code can be leveraged to more easily escalate privileges, potentially exploiting vulnerabilities within the operating system or device hardware.

Remarks

While data collection is common in modern applications for debugging and improving user experience, aggressive fingerprinting raises significant privacy concerns. The DeepSeek app requires users to log in with a valid email, which should already provide sufficient authentication. There is no valid reason for the app to aggressively collect and transmit unique device identifiers, IMEI numbers, SIM card details, and other non-resettable system properties.

The extent of tracking observed here exceeds typical analytics practices, potentially enabling persistent user tracking and re-identification across devices. These behaviors, combined with obfuscation techniques and network communication with third-party tracking services, warrant a higher level of scrutiny from security researchers and users alike.

The use of runtime code loading, together with the bundling of native code, suggests that the app could allow the deployment and execution of unreviewed, remotely delivered code. This is a serious potential attack vector. This report presents no evidence that remotely deployed code is actually being executed, only that the facility for it appears to be present.

Additionally, the app's approach to detecting rooted devices appears excessive for an AI chatbot. Root detection is typically justified in DRM-protected streaming services, where security and content protection are critical, or in competitive video games to prevent cheating. However, there is no clear rationale for such strict measures in an application of this nature, raising further questions about its intent.

Users and organizations considering installing DeepSeek should be aware of these potential risks. If this application is being used within an enterprise or government environment, additional vetting and security controls should be enforced before allowing its deployment on managed devices.

Disclaimer: The analysis presented in this report is based on static code review and does not imply that all detected functions are actively used. Further investigation is required for definitive conclusions.