
OpenAI releases open-weight reasoning models that can run on laptops
What's the story
OpenAI has unveiled two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. The company describes them as "state-of-the-art" on several benchmarks for open models. The release is a major step for OpenAI: it is the company's first 'open' language model release since GPT-2 in 2019.
Model specifications
gpt-oss-20b can work on a consumer laptop with 16GB RAM
The newly launched models come in two variants: the more powerful gpt-oss-120b, which can run on a single NVIDIA GPU, and the lighter gpt-oss-20b, which can run on a consumer laptop with 16GB of RAM. This flexibility makes them accessible to a wider range of developers and users.
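For developers curious about trying the smaller model locally, a minimal sketch using the Hugging Face transformers library might look like the following. The repository ID and generation settings are assumptions for illustration, not an official OpenAI recipe, so check the actual hosting location and hardware requirements before use.

```python
# Minimal sketch: running a local open-weight model with Hugging Face transformers.
# The repo ID "openai/gpt-oss-20b" is an assumption based on OpenAI's naming.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # lighter ~20B-parameter variant for consumer hardware
    device_map="auto",           # place weights on GPU/CPU as available
    torch_dtype="auto",          # use the checkpoint's native precision
)

output = generator(
    "Explain mixture-of-experts in two sentences.",
    max_new_tokens=128,
)
print(output[0]["generated_text"])
```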
Connectivity potential
New models can link to cloud AI for complex queries
OpenAI says its open models can hand off complex queries to AI models in the cloud. If the open model can't perform a specific task on its own, such as image processing, developers can route that request to one of OpenAI's more capable closed models. This handoff makes the new models considerably more versatile.
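A hypothetical illustration of that handoff pattern is sketched below: answer locally when possible, and forward requests the open model can't handle to a hosted OpenAI model. The function names and routing rule are assumptions for illustration, not an official OpenAI integration.

```python
# Hypothetical "local first, cloud for hard cases" routing sketch.
from openai import OpenAI

cloud = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_local(prompt: str) -> str:
    """Placeholder for a call to a locally hosted gpt-oss model."""
    raise NotImplementedError

def answer(prompt: str, too_hard_for_local: bool = False) -> str:
    if too_hard_for_local:
        # Forward the query to a more capable closed model in the cloud.
        response = cloud.chat.completions.create(
            model="gpt-4o",  # example hosted model, not a prescribed choice
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
    return run_local(prompt)
```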
Strategic change
Competing with Chinese AI labs
Despite having open-sourced AI models in its early days, OpenAI has mostly followed a proprietary, closed-source development strategy. However, CEO Sam Altman admitted in January that the company has been "on the wrong side of history" by not open-sourcing its technologies. The move to launch these new models is part of a broader strategy to compete with Chinese AI labs such as DeepSeek and Alibaba's Qwen.
Model performance
OpenAI's new models outperformed other open-weight models on several benchmarks
OpenAI claims its open models outperform other open-weight AI models on various benchmarks. On Codeforces, a competitive coding test, gpt-oss-120b and gpt-oss-20b posted scores of 2,622 and 2,516, respectively, beating DeepSeek's R1 but falling short of OpenAI's own o3 and o4-mini. On Humanity's Last Exam, a challenging test of crowd-sourced questions across subjects, they scored 19% and 17.3%, respectively.
Hallucination issue
Higher hallucination rates than o3 and o4-mini
It's worth noting that OpenAI's open models hallucinate more often than its latest AI reasoning models, o3 and o4-mini. On PersonQA, the company's in-house benchmark for measuring a model's knowledge about people, gpt-oss-120b and gpt-oss-20b hallucinated 49% and 53% of the time, respectively. That is more than three times the rate of OpenAI's o1 model (16%) and well above its o4-mini model (36%).
Training process
How were the new open-weight models trained?
OpenAI says its open models were trained with processes similar to those used for its proprietary ones. Each model uses a mixture-of-experts (MoE) architecture to activate only a fraction of its parameters for any given query, making it more efficient to run. The company also says the models went through high-compute reinforcement learning (RL), a post-training step that teaches models to distinguish correct from incorrect answers in simulated environments, using large clusters of NVIDIA GPUs.
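As a rough illustration of what "activating fewer parameters per query" means, here is a toy mixture-of-experts layer in PyTorch. A router scores every expert for each token, but only the top-k experts actually run; the dimensions, expert count, and routing rule below are illustrative assumptions, not gpt-oss's actual configuration.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy mixture-of-experts layer with top-k routing."""
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.k = k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)     # keep only the top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # only the chosen experts run,
            for e, expert in enumerate(self.experts):  # so most parameters stay idle
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(1) * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(TinyMoE()(x).shape)  # torch.Size([10, 64])
```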