For nearly three years, ChatGPT has dominated the AI conversation, powering everything from homework help and marketing copy to debugging code and brainstorming business ideas. It’s become the face of generative AI, with hundreds of millions of monthly users and integrations across education, software, and customer service.
But the AI race has accelerated. Rivals like Claude, Gemini, and Grok are closing the gap, pushing OpenAI to deliver something more ambitious.
Enter GPT-5, a model built not just to answer questions, but to act like a true digital assistant. Its built-in “router” dynamically decides whether to fire off a quick reply or engage in deeper, multi-step reasoning, making it equally at home in casual conversation, complex coding tasks, and long-form research.
It’s a step toward AI that can think ahead, plan, and execute, not just respond.
So, how powerful is GPT-5 really, and what do independent tests say? Read on to see how it stacks up in coding, medical accuracy, and real-world problem-solving.
Can GPT-5 really think like a digital assistant?
"I've had access to GPT-5 since July 21st. Since then, I've used it as my daily-driver, pushing it to its limits.
Here's my review of GPT-5 (note: full, interactive review w/ artifacts is linked in the next tweet):
TL;DR:
- GPT-5 is clearly a big leap from previous models.…"
Matt Shumer (@mattshumer_), August 7, 2025
According to the GPT-5 System Card from OpenAI, the model uses an internal real-time “router” that decides on the fly whether to respond with a quick, lightweight answer or to engage in more agent-like, multi-step reasoning. This design removes the need for users to manually choose between “fast” and “deep” modes; the system evaluates the complexity, context, and intent of each query automatically.
CEO Sam Altman has called GPT-5 “the best model in the world” and compared its leap over GPT-4 to the shift from standard screens to Apple’s Retina displays, according to an interview with Wired. However, Altman stresses that GPT-5 is not artificial general intelligence.
The system card also explains that the router was trained using real-world usage data, including how often users switched models, which responses they preferred, and patterns of successful problem-solving. By leveraging this feedback loop, GPT-5 can better match the reasoning depth to the task at hand.
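To make the routing idea concrete, here is a minimal toy sketch in Python. It is not OpenAI's router, which is a trained component whose internals are not public. The cue words, the scoring formula, and the threshold below are all hypothetical, chosen only to show the shape of a "fast vs. deep" decision.

```python
from dataclasses import dataclass

# Hypothetical cue phrases hinting at multi-step work (an assumption, not from OpenAI).
# Matching is by plain substring, so this is deliberately crude.
DEEP_CUES = ("prove", "debug", "step by step", "plan", "compare", "analyze")


@dataclass
class RouteDecision:
    mode: str     # "fast" or "deep"
    score: float  # heuristic complexity score in [0, 1]


def complexity_score(prompt: str) -> float:
    """Blend prompt length and cue-phrase hits into a rough complexity score."""
    text = prompt.lower()
    length_term = min(len(text.split()) / 50.0, 1.0)
    cue_hits = sum(cue in text for cue in DEEP_CUES)
    cue_term = min(cue_hits / 2.0, 1.0)  # two or more cues saturate the term
    return 0.5 * length_term + 0.5 * cue_term


def route(prompt: str, threshold: float = 0.3) -> RouteDecision:
    """Pick the deep path when the score reaches the (hypothetical) threshold."""
    score = complexity_score(prompt)
    return RouteDecision("deep" if score >= threshold else "fast", score)


if __name__ == "__main__":
    for p in ("What is the capital of France?",
              "Debug this function and plan the fix step by step"):
        print(route(p).mode, "<-", p)
```

A real router would replace the hand-written score with a model trained on signals like the ones the system card describes, such as model switches and response preferences.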
As detailed in both the OpenAI System Card and reporting from The Guardian, the model cannot learn or improve continuously after deployment. Many researchers consider that kind of continuous learning a critical milestone for true AGI.
How does GPT-5 perform in independent benchmarks?
Coding performance: SWE-bench and Aider Polyglot
In the SWE-bench Verified benchmark, a real-world coding test that measures how effectively an AI can resolve actual GitHub issues by executing the code and confirming the fix, GPT-5 achieved a 74.9% success rate, compared to GPT-4’s 52%, according to the FinalRound AI performance analysis.
In the Aider Polyglot coding task suite, which measures multi-language coding proficiency and bug fixing, GPT-5 achieved 88%, demonstrating its marked improvement in automated debugging and application generation.
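When comparing benchmark scores, it helps to separate the percentage-point gain from the relative gain. The short sketch below does that arithmetic using only the two SWE-bench Verified figures quoted above (74.9% and 52%). The function name is illustrative.

```python
def gain(new: float, old: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    points = new - old
    relative = points / old * 100.0
    return points, relative


# Figures as quoted in the article for SWE-bench Verified.
points, relative = gain(74.9, 52.0)
print(f"{points:.1f} points, {relative:.1f}% relative")
# prints: 22.9 points, 44.0% relative
```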
Medical accuracy: HealthBench
When tested on the HealthBench Hard Hallucinations benchmark, which measures factual accuracy in healthcare responses, GPT-5 running in “thinking mode” hallucinated only 1.6% of the time, compared to GPT-4o’s 12.9%. This reduction represents a dramatic cut in potentially harmful misinformation, especially in high-stakes medical scenarios.
Independent verification and criticism
An analysis by Vellum AI confirmed GPT-5’s strong real-world coding performance and its tendency toward fewer hallucinations than earlier models. However, researchers in a Reddit AI benchmarking discussion have warned that the low hallucination rate may partly reflect increased caution: GPT-5 sometimes declines to answer ambiguous or difficult prompts, which lowers the error count without showing that the model actually knows more.
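The caution concern can be illustrated with a small calculation. The counts below are entirely hypothetical and are not benchmark results. They only show how a model that refuses often can post a low hallucination rate per prompt while its error rate among the answers it actually gives is noticeably higher.

```python
def hallucination_rates(total: int, refused: int, wrong: int) -> dict[str, float]:
    """Compute error rates over all prompts and over answered prompts only."""
    answered = total - refused
    return {
        "per_prompt": wrong / total,
        "per_answer": wrong / answered if answered else 0.0,
        "refusal_rate": refused / total,
    }


# Hypothetical counts for two imaginary models on 1,000 prompts.
bold = hallucination_rates(total=1000, refused=0, wrong=100)
cautious = hallucination_rates(total=1000, refused=500, wrong=20)

for name, r in (("bold", bold), ("cautious", cautious)):
    print(f"{name}: {r['per_prompt']:.1%} per prompt, "
          f"{r['per_answer']:.1%} per answer, "
          f"{r['refusal_rate']:.0%} refused")
```

In this made-up case the cautious model looks five times better per prompt (2% vs. 10%) but only 2.5 times better per answer (4% vs. 10%), which is why reporting the refusal rate alongside the hallucination rate gives a fairer picture.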
What’s new for users and developers?

GPT-5 isn’t just faster and smarter; it’s also easier to use, more customizable, and more flexible for developers. Here’s what’s changed:
Unified Model Access
With GPT-5’s intelligent real-time router, you no longer have to decide between GPT-4, GPT-4o, or other variants. The model automatically selects the most suitable reasoning mode for your request, based on its complexity, topic, and intent.
Whether you need a quick answer or in-depth, multi-step reasoning, GPT-5 selects the right path without requiring extra clicks.
Built-In Personalities
Want your assistant to be witty, robotic, empathetic, or geeky? GPT-5 now offers selectable conversational styles.
Expanded Subscription Tiers
OpenAI has overhauled its pricing and access levels:
- Free tier: GPT-5 Basic with standard speed and daily limits
- Plus tier: higher usage caps and faster responses
- Pro tier: $200/month for GPT-5 Pro with unlimited “thinking mode” for heavy workloads
The Pro plan targets power users, researchers, and businesses that need sustained, high-capacity reasoning.
Developer Controls
The API now includes GPT-5, GPT-5 Thinking, and GPT-5 Pro options, letting developers balance speed, cost, and capability. New parameters, such as reasoning effort and verbosity, detailed in the OpenAI developer docs, provide fine-grained control over how deeply the model thinks and how much detail it returns.
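As a rough illustration of how those two controls might be set, the sketch below builds a request payload without sending it. The field layout (`reasoning.effort`, `text.verbosity`), the allowed values, and the `gpt-5` model name are assumptions based on a reading of OpenAI's Responses API and should be checked against the current developer docs. `build_request` is a hypothetical helper, not part of any SDK.

```python
import json

# Assumed allowed values; verify against the OpenAI developer docs.
REASONING_EFFORTS = {"minimal", "low", "medium", "high"}
VERBOSITY_LEVELS = {"low", "medium", "high"}


def build_request(prompt: str, effort: str = "medium",
                  verbosity: str = "medium", model: str = "gpt-5") -> dict:
    """Build (but do not send) a request body with reasoning and verbosity set."""
    if effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    return {
        "model": model,                      # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},     # how deeply the model thinks
        "text": {"verbosity": verbosity},    # how much detail it returns
    }


if __name__ == "__main__":
    payload = build_request("Summarize this changelog in two sentences.",
                            effort="low", verbosity="low")
    print(json.dumps(payload, indent=2))
```

Lower effort and verbosity would typically suit quick lookups where latency and cost matter, while higher settings suit multi-step debugging or research tasks.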
How is the AI community reacting?
The AI community’s reaction to GPT-5 has been a mix of enthusiasm and pushback. Developers in particular have praised the model’s upgrades, noting that it feels more context-aware, responsive, and easier to guide than its predecessors.
Many also highlighted its improved ability to maintain nuanced conversations, handle longer reasoning chains, and produce cleaner, more reliable code. But not everyone is happy with the shift.
As reported by The Verge, a number of longtime ChatGPT users were dismayed by the removal of GPT-4o, a version they described as warmer, more creative, and more “human” in tone. This change sparked widespread debate across forums and social media, with some users canceling their subscriptions in protest.
The backlash was strong enough that OpenAI reinstated GPT-4o for Plus subscribers just a day after GPT-5’s debut, underscoring how much personality and feel still matter in AI adoption, not just raw performance gains.
Why GPT-5 isn’t AGI but is still a big step

Despite its advances, GPT-5 is still a narrow AI: it cannot adapt continuously, form new concepts beyond its training data, or surpass humans in all cognitive tasks. The key takeaways:
- GPT-5 represents a major leap in generative AI, offering improved coding performance, factual accuracy, and reduced hallucinations compared to GPT-4.
- A new “router” architecture lets the model dynamically choose between quick answers and deeper, multi-step reasoning without user intervention.
- Independent benchmarks, such as SWE-bench Verified and HealthBench, show significant gains in reliability, especially in high-stakes domains like healthcare and software development.
- Developer and user features like unified model access, built-in personalities, and fine-grained reasoning controls make GPT-5 more versatile across use cases.
- Community reactions are divided: while many welcome the improved intelligence, others miss the personality and warmth of GPT-4o, which prompted OpenAI to restore it alongside GPT-5.
- Not yet AGI: GPT-5 still cannot learn after deployment or match human general intelligence, but it marks an important step toward more autonomous, agent-like AI.
GPT-5 is the most capable public AI model to date, with benchmarked gains in coding, factual accuracy, and reliability. It’s not AGI, but it’s the clearest step yet toward agent-like AI that can handle complex, end-to-end tasks without hand-holding.
This story is made with AI assistance and human editing.