New Google tool lets publishers opt out of AI training
This new tool aims to tackle concerns about consent and ethical data collection

Sep 29, 2023
10:32 am

What's the story

Google has launched a new control called Google-Extended, which lets website publishers opt out of having their content used to train the company's AI models while remaining accessible through Google Search. It gives publishers more say over whether their sites contribute to the improvement of Bard and the Vertex AI generative APIs. Publishers set it in "robots.txt," the file that tells web crawlers which parts of a site they may access.

Details

Protecting content from AI training without losing search visibility

With Google-Extended, website owners can stop their content from being used as training data for Google's AI models without losing search visibility. All they need to do is add a rule for the "Google-Extended" user agent token to their site's robots.txt file and disallow the paths they want kept out of AI training. The move addresses concerns raised by web publishers who want more choice and control over how their content is used for emerging generative AI applications.
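
To make the opt-out concrete, here is a minimal sketch of what such a robots.txt rule might look like, checked locally with Python's standard-library robots.txt parser. The file contents, the example.com URL, and the helper name is_allowed are illustrative assumptions, not taken from Google's announcement, and Python's parser only approximates how Google itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt the whole site out for Google-Extended,
# while leaving every other crawler (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL used only for this local check.
URL = "https://example.com/articles/some-post"

blocked_for_ai = not is_allowed(ROBOTS_TXT, "Google-Extended", URL)
open_for_search = is_allowed(ROBOTS_TXT, "Googlebot", URL)

print(blocked_for_ai, open_for_search)  # prints: True True
```

A publisher who wanted to keep only part of a site out of AI training could, under the same assumptions, replace "Disallow: /" with a narrower path.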

What Next?

Google's ethical approach to AI development

Google says it is committed to developing its AI models ethically and inclusively, and acknowledges that AI training use cases differ significantly from web indexing. Danielle Romain, the company's VP of Trust, emphasizes the importance of consent in this process and says the control lets web publishers choose whether their sites help improve Bard and the Vertex AI generative APIs. This new tool aims to tackle concerns about consent and ethical data collection.

Insights

Medium blocks crawlers, seeks better solution

In related news, Medium announced that it will block web crawlers like Google-Extended until a more granular solution is found. The platform joins other websites in seeking better control over how their content is used for AI training purposes. As AI applications continue to grow, Google plans to explore additional machine-readable approaches to choice and control for web publishers, with more details expected to be shared soon.