This OpenAI model 'thinks' in Chinese. We don't know why
The model employs a series of reasoning steps to reach an answer

Jan 15, 2025, 01:13 pm

What's the story

OpenAI's latest artificial intelligence (AI) model, o1, has been seen "thinking" in different languages, including Chinese and Persian. The weird behavior occurs even when the model is given a problem entirely in English. o1 works through a series of reasoning steps to reach an answer, but some of those intermediate steps appear in another language before the final conclusion is delivered in English.

User observations

User experiences and expert theories on o1's behavior

Users have reported cases of o1 "randomly starting to think in Chinese halfway through," even when no part of the conversation involved Chinese. OpenAI has not yet explained the behavior. AI experts have offered a few theories, with some suggesting that reasoning models like o1 are trained on datasets containing large amounts of Chinese text.

Data influence

Influence of Chinese data labeling services

Ted Xiao, a researcher at Google DeepMind, suggested that companies like OpenAI rely on third-party Chinese data labeling services, and that o1's shift to Chinese could be an example of "Chinese linguistic influence on reasoning." Labels, or tags, are applied during training to help models understand and interpret data. However, the theory has been met with skepticism, as o1 has also been seen switching to other languages such as Hindi and Thai.

Efficiency theory

Language selection may be efficiency-driven

Some experts argue that reasoning models like o1 simply use whichever language they find most efficient for a given task. Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, said, "The model doesn't know what language is, or that languages are different." To these models, it's all just text. This view suggests the choice of language may be driven more by computational efficiency than by any bias in the model.

Token usage

AI models use tokens, not words

AI models such as o1 don't work with words directly but with tokens. A token can be a whole word, a syllable, or even an individual character. The tokenization process can itself introduce biases into how a model behaves. Tiezhen Wang of Hugging Face agreed with Guzdial that language inconsistencies in reasoning models could stem from associations the model formed during training.
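The difference between words and tokens is easy to see with OpenAI's open-source tiktoken library. The sketch below is only an illustration: o1's actual tokenizer has not been published, so it assumes the o200k_base encoding (used by recent GPT-4o-family models) as a stand-in, and the sample sentences are invented for the example. It counts how many tokens the same request costs in English versus Chinese, and shows that a single word can split into several tokens.

import tiktoken

# o1's tokenizer isn't public; o200k_base (used by GPT-4o-family models)
# is assumed here purely for illustration.
enc = tiktoken.get_encoding("o200k_base")

samples = {
    "English": "Please add two hundred and thirty-seven to four hundred and five.",
    "Chinese": "请把二百三十七加上四百零五。",
}

# The same request can cost a different number of tokens per language,
# which is one way "efficiency" could steer a model's choice of language.
for language, text in samples.items():
    tokens = enc.encode(text)
    print(f"{language}: {len(tokens)} tokens")

# Tokens are sub-word pieces, not whole words: decoding them one by one
# shows how a single English word gets split up.
for token_id in enc.encode("tokenization"):
    print(repr(enc.decode([token_id])))

Depending on the tokenizer, the two sentences can cost noticeably different numbers of tokens even though they ask for the same thing, which is the kind of disparity the efficiency theory points to.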

Task specificity

Language selection may be task-specific

Wang further explained, "By embracing every linguistic nuance, we expand the model's worldview and allow it to learn from the full spectrum of human knowledge." He gave the example of preferring to do math in Chinese because of its efficiency, but switching to English for topics like unconscious bias. This suggests that an AI model's choice of language may be task-specific and shaped by associations formed during training.