One week after Google updated its user privacy policy to allow data scraping from millions of users for artificial intelligence (AI) training purposes, the tech giant is now facing a class-action lawsuit.
Google, its parent company Alphabet, and Google’s AI subsidiary DeepMind were accused in a July 11 filing in a federal court in San Francisco of misusing large amounts of personal information and copyrighted material, without the owners’ consent, to train their AI systems.
Eight plaintiffs, represented by Clarkson Law Firm, claimed to be representing “millions of class members,” such as internet users and copyright holders, who said that their privacy and property rights were violated by Google’s recent updates to its privacy policy.
Microsoft-backed OpenAI was hit with a similar lawsuit in the same court in late June for alleged data scraping and copyright theft.
The OpenAI plaintiffs, who are also represented by Clarkson, accused Google of stealing “essentially every piece of data exchanged on the internet it could take” without credit, consent, or compensation.
The law firm has asked the court to allow the plaintiffs to remain anonymous in both cases, citing violent threats reportedly received by individuals filing similar lawsuits.
Earlier this week, comedian Sarah Silverman, along with two colleagues, filed a separate lawsuit against ChatGPT’s creator, OpenAI, and Meta for using their copyrighted work without permission for AI training.
Google Defends New Policy as Essential for Training AI
The move to harvest and harness online public data raised new privacy concerns and was bound to face legal challenges in the wake of the OpenAI lawsuit.
The complaint pointed to a recent update to Google’s privacy policy that explicitly stated the company may use publicly accessible information to train its AI models and tools such as Bard.
On July 1, Google amended its privacy policy to allow it to scrape comments that users publish online to improve its AI technology.
“Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public,” read the new Google policy.
“For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”
Google told The Verge on July 5 that its policy “has long been transparent” about its practice of data scraping and that “this latest update simply clarifies that newer services like Bard are also included.”
Halimah DeLaine Prado, Google’s general counsel, told Reuters in a statement that the plaintiffs’ claims were “baseless.”
“We’ve been clear for years that we use data from public sources—like information published to the open web and public datasets—to train the AI models behind services like Google Translate, responsibly and in line with our AI principles,” Mrs. DeLaine Prado wrote.
“American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims,” she added.
The Epoch Times reached out to Alphabet for comment.
Plaintiffs Accuse Google of Violating Privacy and Stealing Published Works
In their opening statement, the plaintiffs accused Google of “harvesting data in secret” to build its AI products without consent.
“It has very recently come to light that Google has been secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans,” the complaint said, adding that the company uses this data to train its AI products, such as its chatbot, Bard.
The plaintiffs also allege that Google stole “virtually the entirety of our digital footprint,” including “creative and copywritten works” to build its AI products.
The lawsuit pointed out that Google’s decision not only violates use rights, but gives it an “unfair advantage” compared with its competitors, which lawfully obtain or purchase data to train AI.
“Google must understand, once and for all: it does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online,” said Mr. Clarkson in a statement.
The plaintiffs are also particularly concerned about the apparent use of personal and sensitive data posted by children; one of the claimants in the lawsuit is a minor.
Tim Giordano, one of the plaintiffs’ attorneys, told CNN that “publicly available” does not mean the data is “free to use for any purpose.”
“Our personal information and our data is our property, and it’s valuable, and nobody has the right to just take it and use it for any purpose,” he said.
Mr. Giordano said that Google’s search engine could “serve up an attributed link to your work that can actually drive somebody to purchase it or engage with it.”
He said that data scraping to train AI tools could instead create “an alternative version of the work that radically alters the incentives for anybody to need to purchase the work.”
Although most internet users are not bothered by their digital data being collected for search results or targeted advertising purposes, the same may not be true for AI bot training.
“People could not have imagined their information would be used this way,” continued Mr. Giordano.
Lawsuit Demands User Safeguards and Massive Damages
The claimants are seeking injunctive relief in the form of a temporary freeze on commercial access to and commercial development of Google’s generative AI tools like Bard.
Google could potentially owe upward of $5 billion in damages, according to the lawsuit.
The plaintiffs also requested a court order requiring Google to obtain users’ explicit permission before using their data.
The requested measures include allowing users to opt out of its “illicit data collection,” giving them the ability to delete existing data, and providing “fair compensation” to owners of the published data.
Mr. Clarkson told CNN that Google needs to “create an opportunity for folks to opt out” of having their data scraped for AI training, while still allowing them to use the internet for their everyday needs.
Reuters contributed to this report.