
Google's lead privacy regulator in the European Union has opened an investigation into whether the company complied with EU data protection law when it used personal data to train its AI models.
Specifically, it is looking into whether the tech giant should conduct a data protection impact assessment (DPIA) to proactively consider the risks of AI technology to individuals’ rights and freedoms due to the information used to train the models.
Generative AI tools are notorious for producing plausible-sounding falsehoods. That tendency, combined with their ability to surface personal information on demand, creates significant legal risk for their makers. Ireland's Data Protection Commission (DPC), which leads oversight of Google's compliance with the EU's General Data Protection Regulation (GDPR), has the power to fine Alphabet (Google's parent company) up to 4% of its annual global revenue for confirmed violations.
Google has developed several generative AI tools, including a full suite of general-purpose large language models (LLMs) branded as Gemini (formerly Bard). It uses this technology to power AI chatbots, including those that enhance web searches. Underpinning these consumer-facing AI tools is a Google LLM called PaLM 2, which was unveiled at last year's I/O developer conference.
The DPC said it is investigating how Google developed these underlying AI models under Section 110 of Ireland's Data Protection Act 2018, which transposes the GDPR into national law.
Training GenAI models typically requires massive amounts of data, and the types of information LLM developers collect, and how and where they obtain it, are coming under increasing scrutiny over a range of legal issues, including copyright and privacy.
On the privacy front, AI training data containing the personal information of EU citizens is subject to the bloc's data protection rules, whether it was scraped from the public internet or obtained directly from users. That is why several LLM makers have already faced privacy compliance questions, and some GDPR enforcement, including OpenAI, the maker of GPT (and ChatGPT), and Meta, which develops the Llama models.
Elon Musk-owned X has also attracted GDPR complaints to the DPC, and public anger, over its use of people's data for AI training. That led to court proceedings and a commitment from X to limit its data processing, but no sanctions. X could still face GDPR fines if the DPC determines that its processing of user data to train its AI tool Grok violated the regulation.
The DPC’s DPIA investigation into Google’s GenAI is the latest regulatory action in this area.
“The legal investigation relates to the question of whether Google complied with its obligation under Article 35 of the General Data Protection Regulation (Data Protection Impact Assessment) to conduct an assessment prior to engaging in the processing of personal data of EU/EEA data subjects in connection with the development of a foundational AI model, Pathways Language Model 2 (PaLM 2),” the DPC said in a press release.
DPIAs “can be crucial to ensuring that individuals’ fundamental rights and freedoms are appropriately taken into account and protected when the processing of personal data is likely to result in high risks”.
“This legal inquiry forms part of the DPC’s broader efforts to work with EU/EEA (European Economic Area) peer regulators to regulate the processing of personal data of EU/EEA data subjects when developing AI models and systems,” the DPC added, noting that the bloc’s network of GDPR enforcers is still working toward consensus on how best to apply the privacy law to GenAI tools.
Google did not respond to questions about the data sources used to train its GenAI tools, but spokesperson Jay Stoll emailed a statement saying Google “takes its obligations under GDPR seriously and will work constructively with the DPC to answer their questions.”