GPT-5 and Claude Opus 4.1 ace job-level exam

Technology Sep 25, 2025

OpenAI just dropped a fresh benchmark called GDPval, which tests how well AI models stack up against real professionals across nine industries and 44 jobs—from engineers to nurses.
The results? GPT-5 matched or beat expert-level work 40.6% of the time, while Anthropic's Claude Opus 4.1 scored even higher at 49%, partly because it tends to make pleasing graphics, according to OpenAI.

OpenAI's vision for AI in the workplace

OpenAI plans to expand GDPval beyond just report writing.
Dr. Aaron Chatterji, OpenAI's Chief Economist, said these advances mean pros can hand off routine tasks to AI and focus on more meaningful work.
Tejal Patwardhan from OpenAI is also upbeat about how quickly AI is improving: big leaps in a short time compared with earlier years.