
OpenAI releases open-weight reasoning models that can run on laptops
What's the story
OpenAI has unveiled two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. The company describes them as "state-of-the-art" on several benchmarks for open models. The release is a major step for OpenAI: it is the company's first 'open' language model release since GPT-2 in 2019.
Model specifications
gpt-oss-20b can work on a consumer laptop with 16GB RAM
The newly launched models come in two variants: the more powerful gpt-oss-120b, which can run on a single NVIDIA GPU, and the lighter gpt-oss-20b, which can run on a consumer laptop with 16GB of RAM. This flexibility makes them accessible to a wider range of developers and users.
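For developers curious about trying the smaller model locally, a minimal sketch using the Hugging Face transformers library might look like the following. The repository ID and generation settings are assumptions for illustration, not an official OpenAI recipe, so check the actual hosting location and hardware requirements before use.

```python
# Minimal sketch: running a local open-weight model with Hugging Face transformers.
# The repo ID "openai/gpt-oss-20b" is an assumption based on OpenAI's naming.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # lighter ~20B-parameter variant for consumer hardware
    device_map="auto",           # place weights on GPU/CPU as available
    torch_dtype="auto",          # use the checkpoint's native precision
)

output = generator(
    "Explain mixture-of-experts in two sentences.",
    max_new_tokens=128,
)
print(output[0]["generated_text"])
```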
Connectivity potential
New models can link to cloud AI for complex queries
OpenAI says its open models can hand off complex queries to AI models in the cloud. If the open model can't perform a specific task on its own, such as image processing, developers can route that request to one of OpenAI's more capable closed models. This handoff makes the new models considerably more versatile.
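A hypothetical illustration of that handoff pattern is sketched below: answer locally when possible, and forward requests the open model can't handle to a hosted OpenAI model. The function names and routing rule are assumptions for illustration, not an official OpenAI integration.

```python
# Hypothetical "local first, cloud for hard cases" routing sketch.
from openai import OpenAI

cloud = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_local(prompt: str) -> str:
    """Placeholder for a call to a locally hosted gpt-oss model."""
    raise NotImplementedError

def answer(prompt: str, too_hard_for_local: bool = False) -> str:
    if too_hard_for_local:
        # Forward the query to a more capable closed model in the cloud.
        response = cloud.chat.completions.create(
            model="gpt-4o",  # example hosted model, not a prescribed choice
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
    return run_local(prompt)
```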
Strategic change
Competing with Chinese AI labs
Despite having open-sourced AI models in its early days, OpenAI has mostly followed a proprietary, closed-source development strategy. However, CEO Sam Altman admitted in January that the company has been "on the wrong side of history" by not open-sourcing its technologies. The move to launch these new models is part of a broader strategy to compete with Chinese AI labs such as DeepSeek and Alibaba's Qwen.
Model performance
OpenAI's new models outperformed other open-weight models on several benchmarks
OpenAI claims its open models outperform other open-weight AI models on various benchmarks. On Codeforces, a competitive coding test, gpt-oss-120b and gpt-oss-20b posted scores of 2,622 and 2,516, respectively, beating DeepSeek's R1 but falling short of OpenAI's own o3 and o4-mini. On Humanity's Last Exam, a challenging test of crowd-sourced questions across subjects, they scored 19% and 17.3%, respectively.
Hallucination issue
Higher hallucination rates than o3 and o4-mini
It's worth noting that OpenAI's open models hallucinate more often than its latest AI reasoning models, o3 and o4-mini. On PersonQA, the company's in-house benchmark for measuring a model's knowledge about people, gpt-oss-120b and gpt-oss-20b hallucinated 49% and 53% of the time, respectively. That is more than three times the rate of OpenAI's o1 model (16%) and well above its o4-mini model (36%).
Training process
How were the new open-weight models trained?
OpenAI says its open models were trained with processes similar to those used for its proprietary ones. Each model uses a mixture-of-experts (MoE) architecture to activate only a fraction of its parameters for any given query, making it more efficient to run. The company also says the models went through high-compute reinforcement learning (RL), a post-training step that teaches models to distinguish correct from incorrect answers in simulated environments, using large clusters of NVIDIA GPUs.
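As a rough illustration of what "activating fewer parameters per query" means, here is a toy mixture-of-experts layer in PyTorch. A router scores every expert for each token, but only the top-k experts actually run; the dimensions, expert count, and routing rule below are illustrative assumptions, not gpt-oss's actual configuration.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy mixture-of-experts layer with top-k routing."""
    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.k = k

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)     # keep only the top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # only the chosen experts run,
            for e, expert in enumerate(self.experts):  # so most parameters stay idle
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(1) * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(TinyMoE()(x).shape)  # torch.Size([10, 64])
```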