ChatGPT copyright case: OpenAI offers 20M chats, NYT demands 120M

Aug 06, 2025
10:24 am

What's the story

OpenAI has offered to produce 20 million user chats in its high-profile lawsuit with The New York Times (NYT). The AI company proposed the figure as a compromise to limit how many logs are accessed, after a court ruling granted broader access to them. The NYT has rejected the offer and is now demanding access to the individual log files of 120 million ChatGPT consumer conversations.

Legal strategy

OpenAI argues that complying with the request will delay case

In response to the NYT's request, OpenAI has argued that complying would "increase the scope of user privacy concerns" and delay the case by months. The company also noted that each log must be decompressed and scrubbed of identifying information before it can be made searchable. This process is "highly complex," requiring significant time, computational resources, and work from the OpenAI engineers who design, debug, operate, and monitor these systems.

Compromise attempt

'Relevant sample size' is enough to determine usage stats

OpenAI has proposed a compromise, suggesting that news organizations shouldn't have to search all ChatGPT logs. The company cited computer science researcher Taylor Berg-Kirkpatrick as the "only expert" who has weighed in on a statistically relevant, appropriate sample size. Berg-Kirkpatrick suggested that 20 million logs would be enough to determine how often ChatGPT users may be using the chatbot to reproduce articles and bypass paywalls of news sites.
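
For a rough sense of what a sample of this size can resolve, the sketch below computes the textbook 95% margin of error for an estimated proportion. It is an illustration only, not Berg-Kirkpatrick's analysis: it assumes simple random sampling, a normal approximation, and, purely for the sake of the example, that the 20 million logs are spread evenly across the 23-month period at issue in the case. The function name and the even split are hypothetical.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion.

    proportion=0.5 is the worst case (widest interval); z=1.96 corresponds to 95%.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


# Figures from the article; the even per-month split is an assumption.
full_sample = 20_000_000
months = 23
per_month = full_sample // months

print(f"whole sample:  +/- {margin_of_error(full_sample):.6f}")
print(f"one month:     +/- {margin_of_error(per_month):.6f}")
```

Under these assumptions the interval for any single month is wider than for the whole sample, because each month has fewer logs behind it.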

Dispute escalation

NYT rejects OpenAI's compromise, demands access to all logs

The NYT and other plaintiffs have rejected OpenAI's compromise. They argue that it is necessary to search through 120 million ChatGPT user conversations not just to prove that infringing outputs may be happening frequently, but also to document any patterns showing spikes in infringement. The news plaintiffs want a full-scale analysis of every month in the relevant 23-month period so they can evaluate how the product has changed over time.