NewsBytes
    Apple study reveals AI's struggle with complex reasoning
    Thinking AI models crumble under complex problem pressure

    By Akash Pandey
    Jun 08, 2025
    11:43 am

    What's the story

    Ahead of the much-anticipated Worldwide Developers Conference (WWDC), Apple has published a study titled "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity."

    The research tested several 'reasoning' artificial intelligence (AI) models, including Anthropic's Claude, OpenAI's models, DeepSeek R1, and Google's Thinking models. The goal was to see how well these systems could replicate human reasoning in complex problem-solving scenarios.

    Evaluation critique

    The study's focus and methodology

    The study takes issue with the standard practice of evaluating Large Reasoning Models (LRMs) using established mathematical and coding benchmarks.

    It argues that these benchmarks suffer from data contamination and offer little insight into the structure and quality of the models' reasoning traces.

    Instead, it proposes a controlled experimental testbed using algorithmic puzzle environments as a more effective way to assess AI reasoning capabilities.
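
    To make the idea of a controlled puzzle testbed concrete, here is a minimal sketch in Python. It is an illustration, not the study's code: it assumes Tower of Hanoi as the puzzle, uses the number of disks as the complexity knob, and the names is_valid_solution and reference_solution are hypothetical. The point it shows is that a puzzle environment lets an evaluator dial difficulty up or down and check every proposed answer by simulation rather than by comparison with a memorized benchmark key.

```python
from typing import List, Tuple

# A move is (source peg, destination peg); pegs are numbered 0, 1, 2.
Move = Tuple[int, int]


def is_valid_solution(num_disks: int, moves: List[Move]) -> bool:
    """Simulate a proposed move list and report whether it solves the puzzle.

    Hypothetical checker for illustration: complexity is set by num_disks.
    """
    # Peg 0 starts with all disks, largest (num_disks) at the bottom.
    pegs = [list(range(num_disks, 0, -1)), [], []]
    for src, dst in moves:
        if src not in (0, 1, 2) or dst not in (0, 1, 2) or not pegs[src]:
            return False  # bad peg index, or nothing to move
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False  # a larger disk may not sit on a smaller one
        pegs[dst].append(pegs[src].pop())
    # Solved only when every disk has reached the last peg.
    return len(pegs[2]) == num_disks


def reference_solution(
    num_disks: int, src: int = 0, aux: int = 1, dst: int = 2
) -> List[Move]:
    """Classic recursive solution, usable as a ground-truth answer."""
    if num_disks == 0:
        return []
    return (
        reference_solution(num_disks - 1, src, dst, aux)  # clear the way
        + [(src, dst)]                                    # move largest disk
        + reference_solution(num_disks - 1, aux, src, dst)  # restack on top
    )


if __name__ == "__main__":
    # The shortest solution doubles in length (plus one) with each extra disk.
    for n in range(1, 6):
        moves = reference_solution(n)
        print(n, len(moves), is_valid_solution(n, moves))
```

    In a setup like this, a model's answer at each difficulty level would be passed to the checker, so accuracy can be plotted against the number of disks without relying on problems the model may have seen during training.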

    Research results

    AI models struggle with complex problem-solving scenarios

    The study found that state-of-the-art LRMs like o3-mini, DeepSeek-R1, and Claude-3.7-Sonnet-Thinking still struggle to develop generalizable problem-solving capabilities.

    Beyond a certain level of complexity, their accuracy collapses to zero across the different puzzle environments.

    This stark finding highlights a major limitation of current AI systems: the inability to handle complex problems consistently.

    Future prospects

    Need to evolve AI evaluation methods, says study

    The research highlights the need to evolve AI evaluation methods and understand the fundamental benefits and limitations of LRMs.

    It raises critical questions about these models' capabilities for generalizable reasoning and their performance scaling with increasing problem complexity.

    These findings could inform Apple's future AI strategies, possibly focusing on specific use cases where current AI methods are not reliable.

    Scaling insights

    Models actually reduce their reasoning effort as complexity increases

    The study also found a counter-intuitive scaling limit in the reasoning effort of these models, which is measured by inference token usage during the "thinking" phase.

    Initially, these models spend more tokens as problems get harder, but as complexity approaches the point where accuracy collapses, they actually reduce their reasoning effort.

    This observation provides further insight into the limitations of current AI systems in handling complex problems.
