China's Moonshot AI claims its model beats GPT-5, Claude Sonnet
What's the story
Chinese artificial intelligence (AI) start-up Moonshot AI has launched a new version of its open-source Kimi K2 model. The updated variant, dubbed Kimi K2 Thinking, outperformed OpenAI's GPT-5 and Anthropic's Claude Sonnet 4.5 on several benchmarks, according to the company. The Beijing-based firm announced the launch on Thursday and made the model available through Kimi.com and its application programming interface (API).
Model performance
'Turning point in AI'
The researchers behind the Kimi model announced the new version on GitHub, saying it has set "new records across benchmarks that assess reasoning, coding and agent capabilities." The results are being seen as a milestone for the AI industry, a sign that Chinese firms have closed the performance gap between their open-source models and those developed by US counterparts. Deedy Das, a partner at Menlo Ventures, called this a "turning point in AI" and a "seminal moment."
Benchmark results
Kimi K2 Thinking scored 44.9% on Humanity's Last Exam
Kimi K2 Thinking outperformed closed-source models such as GPT-5 and Claude Sonnet 4.5 on Humanity's Last Exam, a large language model (LLM) benchmark of 2,500 questions across various subjects. The new model scored 44.9% on the test, a result that points to strong reasoning capabilities.