What actually changed with GPT-5

GPT-5 just landed with big promises... but is there real progress?

Here's all you need to know about GPT-5...

By the way, we (Towards AI) are super excited to share that we are now creating DAILY short-form videos covering real AI progress in under a minute, including GPT-5, obviously.

If you want a quick, no-fluff "anti-hype" way to stay in the loop, you can now follow along on your platform of choice: Instagram, YouTube, or TikTok. :)

The latest few are all OpenAI-related: I covered GPT-5, GPT-5 mini, GPT-5 nano, and their two recent open-weight models, gpt-oss-20b and gpt-oss-120b! I've attached the videos at the end of this issue!

p.s. If you'd like to support my work, please do share these short videos with your friends!

Alright, let's get to it.

It’s rolling out to everyone in ChatGPT, including the free tier, so you need to know about it.

There are three variants: GPT-5, GPT-5 mini, and GPT-5 nano.

What’s different when you use it? The model feels steadier. Fewer wild answers. Better code. Stronger reasoning. And yes, OpenAI leaned into the “PhD-level expert” vibe; Sam Altman compared chatting with GPT-5 to talking with a PhD-level expert in any field. But that remains to be confirmed with time...

Under the hood, the context window is larger: up to ~256k tokens today (they even mention 400k), so it holds long threads, docs, and repos in working memory.

There’s a “GPT-5 Thinking” mode that lets it reason longer when a task needs it, instead of burning cycles on every reply. The model was optimized through reinforcement learning from human feedback (RLHF), including on real use cases, to spend more tokens when the task is complex and fewer when it’s simple. Thanks to this “think-on-demand” behavior, GPT-5 trims wasted reasoning: OpenAI reports up to 50-80% fewer tokens on easier tasks.

Like other recent models we covered in the short videos last week, it was trained to use tools directly, so it's a better "agentic" model than previous ones.

You can send in text, images, and even video. It reasons better about what it sees, then replies in text (or uses tools). Note that, unlike GPT-4o, it does not generate images itself!

Coding is where the leap is clearest. On public benchmarks like SWE-bench Verified, GPT-5 leads prior OpenAI models and edges out recent rivals. It even beats o3 and o4-mini!

The knowledge cutoff is September 30, 2024, so keep that in mind, but live web browsing is there to bridge the gap.

They've also optimized part of the RLHF training to reduce hallucinations (i.e., stick to the facts), which should make it more reliable than previous models, but it's still not perfect.

And most important of all is this sentence OpenAI shared: GPT-5 is less effusively agreeable and uses fewer unnecessary emojis.

Finally!

Now, what do mini and nano change? Cost and speed.

Prices (API, per 1M tokens): GPT-5 ≈ $1.25 in / $10 out. Mini ≈ $0.25 in / $2 out. Nano ≈ $0.05 in / $0.40 out. That’s roughly 5× cheaper with mini and about 25× cheaper with nano, on both input and output, versus standard GPT-5. Latency also drops as you go smaller.
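
To make those numbers concrete, here's a quick back-of-the-envelope cost estimate in Python, using the list prices above (a sketch only; your real bill depends on your actual input/output mix and any cached-token discounts):

```python
# Rough API cost estimate based on the list prices above (USD per 1M tokens).
PRICES = {
    "gpt-5":      {"in": 1.25, "out": 10.00},
    "gpt-5-mini": {"in": 0.25, "out": 2.00},
    "gpt-5-nano": {"in": 0.05, "out": 0.40},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one call, ignoring caching discounts."""
    p = PRICES[model]
    return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

# Example: a 10k-token prompt that produces a 2k-token answer.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000, 2_000):.4f}")
# gpt-5 ≈ $0.0325, gpt-5-mini ≈ $0.0065, gpt-5-nano ≈ $0.0013 (the ~5x / ~25x gap).
```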

So, which should you pick?

Use GPT-5 when quality beats cost. Deep reasoning. Multi-step agent flows. High-stakes coding or analysis. If the answer must be right and well-explained, this is the one.

Use GPT-5 mini for most day-to-day work. Everyday dev tasks. RAG and retrieval workflows. Product copy. Analyst notebooks. You keep most of the brains for a fraction of the price, and you’ll probably ship faster. It's the 80:20 of the model series.

Use GPT-5 nano for scale (when you want to spam the heck out of it!). Bulk summarization. Tagging and labeling. FAQ replies. Quick triage. When throughput matters more than nuance, nano pays for itself.
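
If you want to turn that tiering into something mechanical, here's one illustrative way to route requests in code; the task labels and rules below are made up for the example, and the point is simply to default to mini and escalate or downgrade deliberately:

```python
# Illustrative model router: default to mini, escalate for high-stakes work,
# downgrade for bulk throughput. Task labels here are hypothetical.
def pick_model(task_type: str, high_stakes: bool = False, bulk: bool = False) -> str:
    if high_stakes or task_type in {"agent-flow", "complex-coding", "deep-analysis"}:
        return "gpt-5"        # quality beats cost
    if bulk or task_type in {"summarization", "tagging", "faq", "triage"}:
        return "gpt-5-nano"   # throughput beats nuance
    return "gpt-5-mini"       # the 80/20 default for day-to-day work

print(pick_model("rag-query"))                         # -> gpt-5-mini
print(pick_model("complex-coding", high_stakes=True))  # -> gpt-5
print(pick_model("tagging", bulk=True))                # -> gpt-5-nano
```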

A few practical tips:

With GPT-5 thinking, let the agent work. Instead of “tell me how,” try “book the venue,” “draft the deck,” or “compare these three vendors and paste a table.” It will browse, log in when you approve, and hand you artifacts to edit.

In the API, adjust the new verbosity and reasoning-effort parameters to align with your goals!
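
For example, with the OpenAI Python SDK's Responses API you can dial reasoning effort down for simple queries and keep answers short. This is a minimal sketch; the parameter names (reasoning.effort, text.verbosity) are the ones documented at launch, so double-check the current docs:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

response = client.responses.create(
    model="gpt-5-mini",
    input="Summarize this changelog in three bullet points: ...",
    reasoning={"effort": "minimal"},  # spend fewer thinking tokens on easy tasks
    text={"verbosity": "low"},        # keep the final answer short
)
print(response.output_text)
```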

There are two ways to tackle a new model suite release:

  1. Test the best model. If it works, try the smaller one, and so on.
  2. Test with the smallest; if it breaks, use a bigger one, etc.

Both are good approaches. Just don't use GPT-5 Thinking (the most expensive option) for every type of query and task!

A quick recap: Start with mini as your new baseline. Reach for GPT-5 when the stakes rise. Keep nano in your pocket for bulk.

Don't forget to follow our content for similar daily updates with quick entertaining videos wherever you prefer: Instagram, YouTube, or TikTok!

Here are some examples as promised:

GPT-5 release:

gpt-oss-20b: