China's Moonshot AI unveils Kimi K2, outshining GPT-4.1 on AI benchmarks

Technology Jul 15, 2025

Moonshot AI, a Chinese startup, has just dropped Kimi K2—a massive open-source AI model with one trillion parameters.
It's built to handle everything from coding on its own to running digital assistants.
You can pick between the Base version for research or the Instruct version for chatbots and apps.

Outperforms GPT-4.1 on several benchmarks

Kimi K2 activates only about 32 billion of its one trillion parameters for any given token, balancing power and efficiency.
It scored 65.8% on SWE-bench Verified and outperformed GPT-4.1 on LiveCodeBench (53.7% vs 44.7%).
On the MATH-500 math benchmark, it hit a standout 97.4% accuracy, better than GPT-4.1's 92.4%.
Basically, on these coding and math benchmarks, it comes out ahead of GPT-4.1.
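
To make the "32 billion out of a trillion" idea concrete, here is a minimal Python sketch of how a sparse model can use only a slice of its parameters per token. The parameter counts come from the figures above; the toy router (8 experts, 2 picked per token) and the function names `top_k_route` and `active_fraction` are hypothetical, chosen for illustration, and are not Kimi K2's actual configuration or code.

```python
import math
import random


def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


def active_fraction(active_params, total_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


# Figures quoted in the article: 32 billion active, one trillion total.
fraction = active_fraction(32e9, 1e12)
print(f"{fraction:.1%} of parameters active per token")  # → 3.2% of parameters active per token

# Toy router (hypothetical sizes): 8 experts, 2 selected for this token.
random.seed(0)
scores = [random.gauss(0, 1) for _ in range(8)]
experts, weights = top_k_route(scores, k=2)
print("experts used:", experts, "mixing weights:", [round(w, 3) for w in weights])
```

The takeaway is that each token only pays for the experts it is routed to, so compute per token tracks the active parameters rather than the full trillion.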

Can write and run code

Kimi K2 doesn't just spit out text: it is built for agentic use, so it can actually write and run code by itself. Its "mixture-of-experts" design, which activates only a fraction of the model for each token, keeps costs down too.
Plus, as an open-source model, it shows Moonshot AI is serious about making advanced AI accessible to all.