GPT‑5.1-Codex-Max: How OpenAI's new model changes the way developers code
GPT‑5.1-Codex-Max is OpenAI's most advanced coding model yet


Nov 23, 2025
05:27 pm

What's the story

OpenAI's new agentic coding model, GPT-5.1-Codex-Max, is designed for long-running and complex software development tasks. OpenAI describes the model as "faster, more intelligent, and more token-efficient at every stage of the development cycle." The launch comes shortly after Google's introduction of Antigravity, an AI platform focused on developers.

Enhanced features

Capabilities and performance

The new model, which is based on the GPT-5.1 architecture, was trained on real-world software engineering tasks such as creating pull requests, reviewing code, building websites, and answering technical questions. According to OpenAI, this training makes it more effective than the company's previous coding models in frontier coding evaluations. The company has also said that this is the first version of Codex that works seamlessly on Windows.

Autonomous operation

Autonomous working and accessibility

OpenAI has claimed that the new model can work independently for hours at a time. In internal tests, Codex-Max kept improving its code, fixing errors, and ultimately delivered a successful result even after more than 24 hours of continuous work. The model is available to users on ChatGPT Plus, Pro, Business, Edu, and Enterprise plans. Developers who use Codex CLI with an API key will get access once API support rolls out.

Performance metrics

Accuracy and efficiency

OpenAI has also claimed that the new model is significantly better than its predecessors. On one coding benchmark (SWE-Lancer), it scored 79.9%, compared to 66.3% for the older GPT-5.1-Codex. On another benchmark (SWE-bench Verified), the newer model achieved higher accuracy at the same reasoning level while using about 30% fewer "thinking tokens." Using fewer tokens for the same task means faster results and lower costs for developers.
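
To put those figures in perspective, here is a minimal Python sketch of the arithmetic. The two SWE-Lancer scores and the 30% reduction come from the reported claims above; the baseline token count and the per-token price are hypothetical values chosen only for illustration and do not reflect OpenAI's actual pricing or usage.

```python
# Illustrative arithmetic only. The scores and the 30% figure are from the
# article; the token count and price below are hypothetical assumptions.

def percentage_point_gain(new_score: float, old_score: float) -> float:
    """Absolute difference between two benchmark scores, in percentage points."""
    return new_score - old_score


def relative_gain_percent(new_score: float, old_score: float) -> float:
    """Improvement expressed as a percentage of the old score."""
    return (new_score - old_score) / old_score * 100


def tokens_after_reduction(tokens: int, reduction_percent: int) -> int:
    """Token count after cutting usage by a whole-number percentage."""
    return tokens * (100 - reduction_percent) // 100


def token_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in USD for a given number of tokens at a flat per-million rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


SWE_LANCER_NEW = 79.9  # GPT-5.1-Codex-Max, as reported
SWE_LANCER_OLD = 66.3  # GPT-5.1-Codex, as reported

BASELINE_THINKING_TOKENS = 100_000  # hypothetical task size
USD_PER_MILLION_TOKENS = 10.0       # hypothetical price, not a real rate

reduced_tokens = tokens_after_reduction(BASELINE_THINKING_TOKENS, 30)
baseline_cost = token_cost(BASELINE_THINKING_TOKENS, USD_PER_MILLION_TOKENS)
reduced_cost = token_cost(reduced_tokens, USD_PER_MILLION_TOKENS)

if __name__ == "__main__":
    gain = percentage_point_gain(SWE_LANCER_NEW, SWE_LANCER_OLD)
    rel = relative_gain_percent(SWE_LANCER_NEW, SWE_LANCER_OLD)
    print(f"SWE-Lancer gain: {gain:.1f} points ({rel:.1f}% relative)")
    print(f"Thinking tokens: {BASELINE_THINKING_TOKENS} -> {reduced_tokens}")
    print(f"Reasoning cost: ${baseline_cost:.2f} -> ${reduced_cost:.2f}")
```

Under these assumptions, the SWE-Lancer improvement works out to roughly 13.6 percentage points, or about a 20.5% relative gain, and a task that previously consumed 100,000 thinking tokens would need about 70,000. Actual savings depend on real pricing and on how much of a task's total token usage is reasoning.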