OpenAI has unveiled two new open-weight AI reasoning models, gpt-oss-120b and gpt-oss-20b. The company describes the models as "state-of-the-art" among open models, based on several benchmarks. The release is a major step for OpenAI: it is the company's first "open" language model since GPT-2, which was released in 2019.

Model specifications: gpt-oss-20b can work on a consumer laptop with 16GB RAM

The newly launched models come in two variants: the more powerful gpt-oss-120b, which can be run on a single NVIDIA GPU, and the lighter gpt-oss-20b, which can work on a consumer laptop with 16GB RAM. This flexibility makes them accessible to a wider range of developers and users.

Connectivity potential: New models can link to cloud AI for complex queries

OpenAI has revealed that its open models can send complex queries to AI models in the cloud. This means that if an open model can't perform a specific task, such as image processing, developers can link it with one of OpenAI's more advanced closed models. This feature extends what the new models can be used for beyond their own capabilities.
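The hand-off described above can be pictured as a simple routing step. The following is only a minimal sketch, not OpenAI's implementation: the `Router` class, the task labels, and the two stand-in model functions are all hypothetical names invented for illustration, and no real API is called.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Router:
    """Hypothetical router: use the local open model when it supports
    the task, otherwise hand the request to a cloud-hosted model."""
    local_model: Callable[[str], str]
    cloud_model: Callable[[str], str]
    local_capabilities: Set[str]

    def handle(self, task_type: str, prompt: str) -> str:
        # Task types the local model supports stay on the device.
        if task_type in self.local_capabilities:
            return self.local_model(prompt)
        # Anything else (for example, image processing) goes to the cloud.
        return self.cloud_model(prompt)


# Stand-in functions; a real setup would call actual models here.
def fake_local(prompt: str) -> str:
    return f"[local] {prompt}"


def fake_cloud(prompt: str) -> str:
    return f"[cloud] {prompt}"


router = Router(fake_local, fake_cloud, local_capabilities={"text"})
print(router.handle("text", "Summarize this paragraph"))
print(router.handle("image", "Describe this picture"))
```

In this sketch the decision rests on a fixed set of supported task types. How a real deployment decides when to escalate a query is not described in the announcement.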

Strategic change: Competing with Chinese AI labs

Despite having open-sourced AI models in its early days, OpenAI has mostly followed a proprietary, closed-source development strategy. However, CEO Sam Altman admitted in January that the company has been "on the wrong side of history" by not open-sourcing its technologies. The move to launch these new models is part of a broader strategy to compete with Chinese AI labs such as DeepSeek and Alibaba's Qwen.

Model performance: OpenAI's new models outperformed other open-weight models on several benchmarks

OpenAI has claimed that its open models outperform other open-weight AI models on various benchmarks. On Codeforces, a competitive coding test, gpt-oss-120b and gpt-oss-20b scored 2,622 and 2,516, respectively, beating DeepSeek's R1 but falling short of o3 and o4-mini. On Humanity's Last Exam, a challenging test of crowd-sourced questions across subjects, they scored 19% and 17.3%, respectively.

Hallucination issue: Higher hallucination rates than o3 and o4-mini

It's worth noting that OpenAI's open models have a higher hallucination rate than its latest AI reasoning models, o3 and o4-mini. On PersonQA, an in-house benchmark for measuring a model's knowledge about people, gpt-oss-120b and gpt-oss-20b hallucinated 49% and 53% of the time, respectively. This is more than three times the hallucination rate of OpenAI's o1 model (16%) and higher than that of its o4-mini model (36%).
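The "more than three times" comparison can be checked with simple arithmetic. The short sketch below uses only the PersonQA percentages quoted above; the dictionary and the `ratio` helper are illustrative names, not part of any benchmark tooling.

```python
# PersonQA hallucination rates (percent) as quoted in the article.
rates = {
    "gpt-oss-120b": 49,
    "gpt-oss-20b": 53,
    "o1": 16,
    "o4-mini": 36,
}


def ratio(model: str, baseline: str) -> float:
    """Return how many times higher `model`'s rate is than `baseline`'s."""
    return rates[model] / rates[baseline]


for model in ("gpt-oss-120b", "gpt-oss-20b"):
    for baseline in ("o1", "o4-mini"):
        print(f"{model} vs {baseline}: {ratio(model, baseline):.2f}x")
```

Both open models come out above 3x relative to o1 and between 1x and 2x relative to o4-mini, which matches the wording of the comparison.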