Microsoft and OpenAI have been sued by sixteen individuals who claim that the companies used personal data without permission to train their Artificial Intelligence (AI) models.
The 157-page lawsuit (via The Register) was filed by the individuals through the Clarkson Law Firm in federal court in San Francisco, California on June 28. The lawsuit alleges that Microsoft and OpenAI trained ChatGPT on personal data without consent, adequate notice, or compensation.
Despite established protocols for the purchase and use of personal information, Defendants took a different approach: theft. They systematically scraped 300 billion words from the internet, ‘books, articles, websites and posts – including personal information obtained without consent.’ OpenAI did so in secret, and without registering as a data broker as it was required to do under applicable law.
The lawsuit also raises individual privacy concerns, noting that the data used by OpenAI contained information about people's beliefs, reading habits, hobbies, transaction and location data, chat logs, and more.
While the reams of personal information that Defendants collect on Users can be used to provide personalized and targeted responses, it can also be used for exceedingly nefarious purposes, such as tracking, surveillance, and crime. For example, if ChatGPT has access to a User’s browsing history, search queries, and geolocation, and combines this information with what Defendant OpenAI has secretly scraped from the internet, Defendants could build a detailed profile of Users’ behavior patterns, including but not limited to where they go, what they do, with whom they interact, and what their interests and habits are. This level of surveillance and monitoring raises vital ethical and legal questions about privacy, consent, and the use of personal data. It is crucial for users to be aware of how their data is being collected and used, and to have control over how their information is shared and used by advertisers and other entities.
The lawsuit also takes aim at OpenAI's approach to protecting Personally Identifiable Information (PII). Earlier this year, The Register published a report shedding light on OpenAI's plan to prevent PII leaks through ChatGPT. According to the report, OpenAI had merely put in place a content filter intended to stop the AI from outputting private information such as phone numbers and credit card details.
With respect to personally identifiable information, Defendants fail sufficiently to filter it out of the training models, putting millions at risk of having that information disclosed on prompt or otherwise to strangers around the world.
Lastly, the lawsuit alleges that Microsoft and OpenAI violated the Electronic Communications Privacy Act by obtaining and using confidential information unlawfully. In addition, the plaintiffs allege that Microsoft violated the Computer Fraud and Abuse Act by intercepting communications between third-party services and ChatGPT integrations.
The lawsuit is replete with citations from researchers, academics, journalists, and others who have raised alarms in the past about the use of neural networks and AI. However, the filing is light on how the alleged misuse of information and the instances of harm it has caused justify the $3 billion in damages sought.
This is not the first time Microsoft has come under fire for misusing data or using it without proper consent. Last month, Twitter sent a notice to Microsoft alleging that the company had used Twitter's data without consent. OpenAI, for its part, has had its own share of problems. In March, the company reported a breach that leaked partial payment information of ChatGPT users. Earlier this month, account data of over 100,000 ChatGPT users was leaked and sold on the dark web.