Shane Currie's Blog


A Technician's Perspective on the Privacy & Data Tracking Policies of the DeepSeek and ChatGPT GenAI Platforms

29/01/25 by Mr Shane Currie | Charles Sturt Student

Summary

With the recent release of DeepSeek, a Chinese-developed GenAI, there has been some media hype and speculation. I am not someone who is interested in media speculation. Media outlets these days have become too political and tend to forgo the facts. Journalists remind me of politicians: instead of reporting the facts, they report on selective agendas.

So, I took it upon myself to do some apolitical research into one question: what information does GenAI collect from you? In this research I explored the privacy policies of the DeepSeek and ChatGPT GenAI platforms. I have also provided a link to the DeepSeek-ai source code repository on GitHub, and I discuss how this source code can be analysed and built upon. The privacy policies that I looked into are listed below:

Click here for the DeepSeek Privacy Policy, last updated (5/12/24)

Click here for the ChatGPT Privacy Policy, last updated (4/11/24)

How GenAI monitors your IP address, and feeds you cookies

Now, I am a technician, not a lawyer. I am more interested in the inner workings of network communication channels, such as the internet. Upon reading both policies, I can see that both ChatGPT and DeepSeek collect information such as your public IP address and the operating system and browser that you use. I also read both cookie policies and checked whether each platform shares your information with third parties.

In layman's terms, your public IP address is like your computer network's home phone number. Web applications need this number so they know where to send the information that you request. Websites may also log your public IP address for security reasons, for example when an IP address is attempting malicious behaviour towards a website or when a user on a website is committing a criminal activity.
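
To make the "phone number" idea a little more concrete, here is a minimal sketch using Python's built-in ipaddress module. It only shows the difference between a private address (used inside your home network) and a public one (the address a website would see). The describe function and the sample addresses are my own illustration and are not taken from either privacy policy.

```python
import ipaddress

def describe(ip_text):
    """Return a rough, human-readable label for an IP address."""
    ip = ipaddress.ip_address(ip_text)
    if ip.is_private:
        return "private (only meaningful inside a local network)"
    if ip.is_global:
        return "public (reachable from the internet)"
    return "special purpose"

# 192.168.1.10 is a typical home-network address; 8.8.8.8 is a well-known public one.
for address in ["192.168.1.10", "8.8.8.8"]:
    print(address, "->", describe(address))
```

A website never sees the private address of your laptop or phone, only the public address that your router presents to the internet.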

Both GenAI policies include a policy on cookies. Cookies are required for most websites to work, such as websites that require you to log on. In layman's terms, when you log on to a website, that website gives you a special cookie to track your logon session. Without this cookie you will not be able to use a personalised web session. However, cookies also have other uses, such as tracking and advertising. Want to know how online advertisers always seem to know what products you are interested in? Well, that's because of tracking cookies.
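
The logon cookie described above can be sketched in a few lines of Python using the standard http.cookies module. This is a simplified illustration only, not how ChatGPT or DeepSeek actually implement their sessions. The names sessions, log_in and who_is_this are hypothetical, and the session store is just a dictionary in memory.

```python
from http.cookies import SimpleCookie
import secrets

# Hypothetical in-memory session store: session ID -> username.
sessions = {}

def log_in(username):
    """Create a random session ID and return the Set-Cookie header line."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = username
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True  # hide the cookie from page scripts
    cookie["session_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")

def who_is_this(cookie_header):
    """Look up the user from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return sessions.get(morsel.value)

set_cookie_line = log_in("alice")
# The browser keeps only the name=value part and returns it on later requests.
name_value = set_cookie_line.split(": ", 1)[1].split(";", 1)[0]
print(who_is_this(name_value))
```

Without the cookie, or with a session ID the server does not recognise, who_is_this returns None and the website has no way of knowing which logged-on user is making the request.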

Key difference in the privacy policy of ChatGPT and DeepSeek

This led me to notice a difference between the privacy policies of ChatGPT and DeepSeek. The privacy policy for ChatGPT (4/11/24) specifies that they do not sell or share personal data for contextual behavioural advertising and do not process personal data for advertising purposes. The privacy policy for DeepSeek (5/12/24) specifies that DeepSeek may share information collected through your use of their service with their advertising and analytics partners. The DeepSeek policy also states that DeepSeek receives information from advertising partners, and that this information can include hashed email addresses. Hashed email addresses can be used by a web application provider to determine whether its users have accounts on other web platforms.
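
To show how hashed email addresses can be matched across platforms, here is a minimal sketch. It assumes SHA-256 and a simple lower-case normalisation step; the policies do not say which hashing method is used, so treat both as assumptions. The email addresses and variable names are made up for illustration.

```python
import hashlib

def hash_email(email):
    """Normalise an email address and return its SHA-256 hash as hex."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

# Hypothetical user lists held by a web platform and by an advertising partner.
platform_users = {hash_email(e) for e in ["alice@example.com", "bob@example.com"]}
ad_partner_list = {hash_email(e) for e in ["Alice@Example.com ", "carol@example.com"]}

# The same address always produces the same hash, so the two lists can be
# compared without either side handing over the plain email address.
shared = platform_users & ad_partner_list
print(len(shared), "user(s) appear on both lists")
```

The point is that hashing hides the address itself but still allows matching, which is why a hashed email address can still be used to link your accounts.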

The second difference I noticed between the privacy policies of DeepSeek and ChatGPT is more of a legal one than a technical one, so I will only mention it briefly, as I am more interested in talking about technical stuff. ChatGPT has a different privacy policy for users in the European Economic Area, United Kingdom and Switzerland. DeepSeek does not seem to have a different privacy policy for countries outside of China.

To summarise, the key differences I have found between the ChatGPT and DeepSeek GenAI platforms are that ChatGPT shows more of a willingness to follow the privacy acts of countries outside of the United States and does not share user information with third-party advertisers. DeepSeek, on the other hand, shows no indication that it adheres to the privacy acts of countries outside of China and has indicated that it may share user information with third-party advertisers.

How to check the ingredients of the advertising Cookies

It is possible to check what information DeepSeek is sharing with third-party providers. This can be done in any major browser (Edge, Chrome, Firefox, Opera etc.) via the developer tools option. Typically, to access this, all you must do is right-click on the page and select the developer tools or inspect option.

Cookies used for advertisement tracking can typically be found under the storage settings in developer tools. You can also see requests being made to third-party services via the network tab. Advertising cookies are saved on your local computer, in your browser's cookie storage.
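
If you copy the request addresses out of the network tab, a short script can help sort first-party requests from third-party ones. The sketch below is a rough illustration: the URLs are made up, is_third_party is my own helper, and the domain check is a simple suffix comparison rather than the more careful rules that browsers use.

```python
from urllib.parse import urlparse

def is_third_party(request_url, site_domain):
    """Return True if the request goes to a host outside the site's own domain."""
    host = urlparse(request_url).hostname or ""
    return not (host == site_domain or host.endswith("." + site_domain))

# Hypothetical requests, as they might be listed in the network tab.
requests_seen = [
    "https://chat.example.com/api/messages",
    "https://example.com/static/app.js",
    "https://tracker.example-ads.net/pixel.gif",
]

third_party = [url for url in requests_seen if is_third_party(url, "example.com")]
for url in third_party:
    print("Third-party request:", url)
```

Not every third-party request is an advertising tracker (many are fonts, scripts or content delivery networks), so the output is a starting point for a closer look rather than proof of tracking.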

At the end of the day, it is up to the user whether they wish to use websites that make use of tracking cookies. Typically, websites that don't charge for their services make money via other methods, such as advertising. DeepSeek is not the only platform that does this. YouTube and Facebook also make use of tracking cookies for advertising.

Could DeepSeek receive the same treatment as TikTok did in the United States?

With all the hype from the United States regarding the potential ban of TikTok, I would be interested to see if a ban is proposed on DeepSeek for distributing their web application in the United States. GenAI, however, is a completely different service compared to social media. GenAI is used as a service to provide information, whereas social media provides a service for people to socialise with each other.

GenAI can, and does, provide incorrect information. This is partly due to the datasets that the GenAI is trained on. In layman's terms, to train a GenAI, it is first provided with large datasets that include books, websites and scientific journals, and it is trained using this data. These datasets can vary depending on the GenAI service or the country where the GenAI service was trained. For this reason, there is a potential that the United States may be concerned about a GenAI service that was trained in China.

Opportunity for Australia to develop our own GenAI

Regardless of what decision the United States makes regarding DeepSeek, this still opens a new window of opportunity for countries like Australia to develop their own GenAI service. It is unlikely that the United States will ban a GenAI service trained in Australia from being distributed within the United States. This Australian GenAI service can be trained on datasets that include Australian history and culture.

I would like to see a GenAI developed within Australia. However, it seems that most GenAI platforms currently originate from the United States, Germany (Jina) & China. I will create a new article in the future on the potential benefits or risks that a GenAI service could provide Australia. In the meantime, with the United States, Germany and China leading the way in GenAI development, it's up to the user to decide which GenAI service to use.

The DeepSeek framework code is open source on GitHub

It should be noted that the source code for DeepSeek is open source. Now, the DeepSeek website may use different code, but the underlying framework of DeepSeek seems to be open source code developed in Python. In layman's terms, open source code is a program developed by a community of paid workers and/or volunteers. As this code is available online for anyone who wishes to access it, it can be analysed and built upon.

The MIT licence on the GitHub profile of the DeepSeek-ai code repository also gives permission to use, modify, publish or distribute the code. By running this code in a sandbox environment, developers can identify any remote API calls to third parties (in layman's terms, those are network connections to another company) and can modify the code to restrict any potential API calls. They can build on the framework to create a localised GenAI system that can potentially operate without an internet connection.

Click here to view the open source code via the DeepSeek-AI repository on GitHub

Potential future implementations of a localised GenAI

The potential for a GenAI system that can operate without an internet connection is limitless. We are talking about giving machines the ability to have an open dialogue with a person. For example, your car can talk you through diagnosing a fault, children's toys can be given voice conversation abilities, and you can have a conversation with an advertisement billboard or a plaque at a historical site. You will have complete control over the datasets that the GenAI learns from, and the GenAI will be prevented from establishing an internet connection.

I will explore the potential possibilities, both good and bad, that GenAI can provide humanity. I will be looking out for more open source GenAI code on GitHub to serve as a framework for building an Australian GenAI model. Who knows, I might be able to build my own Johnny 5 GenAI voice assistant, or a Major Chip Hazard voice assistant.

Thanks for reading my article. I would be interested to see the next advancements in generative artificial intelligence and how GenAI could become more integrated within our lives.