Coding Changed Forever

Key takeaways

AI can write a lot of code, but that does not remove the need for engineering judgment.
Good agentic coding depends on context files, tests, skills, code review, and clear task boundaries.
The risky version of vibe coding is shipping whatever the agent produced because it looked like it worked.

A lot of people come to me saying that they vibe code their whole products now. Even our students. And I’m like, yes, I know, it f***ing sucks!

But that’s because they do it wrong. You can actually use agents to code much more efficiently without breaking stuff.

The creator of Claude Code just admitted that for the last few months, 100% of his own code was written by Claude Code itself.

And on the team that builds the tool, roughly 95% of everything they ship is written using Claude Code.

The tool builds itself. Humans review at the end. That is the state of coding in 2026.

The guy saying that is Boris Cherny. He runs Claude Code at Anthropic. And the first time I read that quote, I had to read it twice. Three years ago, a sentence like that would have sounded insane. Even a year ago, it would have still sounded insane. Today, nobody even flinches.

And if you’re an engineer, you already feel it. The thing we used to call coding is now popularly known as vibe coding.

So how did we get here? How did we go from opening Stack Overflow to solve one bug, to agents writing entire features on their own? That didn’t happen overnight. It took about five years of steady progress. And today, I want to walk you through the full story. The timeline, the tools, the tweets, the numbers, and the people, or even agents, who actually built this… Here’s the story behind “vibe coding”.

I’m Louis-François, CTO and co-founder at Towards AI, where we turn engineers into AI engineers who build and ship AI products. Let’s get into it. (You can also watch the video version here if interested! https://youtu.be/ShFn3MG0h8s)

One thing before we start. Keep Boris’ quote in your head, because I’m going to end this article with a controlled study that says developers using AI are actually slower, even though they feel faster. Same world. Same month even, and both are true. How you use it is the game changer, and to do so properly, you might want to consider subscribing to the channel to see the upcoming AI engineering videos ;) I’m going all in on YouTube offering tons of free videos this year with one goal: finally reach 100K subscribers after 6 years of work. Help me reach it this year and click that subscribe button!

Okay, let’s quickly rewind to 2015 through 2021, when I learned programming. You had a bug. You copied the error, pasted it into Google, landed on Stack Overflow, read the top answer, scrolled to the deleted comment that actually fixed it, pasted the snippet into your editor, quickly edited the variable names to fit your script, and prayed. That was the workflow. Search, read, adapt, paste, iterate. Stack Overflow was the trust layer of the internet for programmers. Your IDE was the feedback loop. And for most of us, that IDE became VS Code pretty quickly after it went open-source. That’s an important detail, since every AI coding tool you’re going to hear about in the rest of this article plugs into that editor, forks it, or competes against it.

Before LLMs, “AI in the editor” was mostly local autocomplete. Nothing really useful and definitely nothing near intelligent. Then in 2017 the Transformer paper dropped. Every AI coding assistant you see today is a descendant of that architecture, but it took a few years to make that happen, as we’ll see. By 2020, we had CodeBERT showing that you can learn natural language and programming language together, the first of its kind. But we didn’t have any interfaces to use it effectively.

The interface showed up on June 29, 2021, when GitHub announced Copilot, powered by OpenAI’s Codex. Not the Codex we know today, but a coding-specific language model trained for GitHub Copilot specifically. A week later, on July 7, OpenAI published the Codex paper. Copilot shipped as a technical preview, pitched as your “AI pair programmer.” And on June 21, 2022, it went generally available for individual developers as a paid subscription. This was the first step towards a true coding assistant and the whole industry that spawned around it. But don’t get me wrong, nobody really liked it.

The unit of assistance shifted from “a snippet you searched for” to “a continuation you (sometimes) accept.” But there was something interesting here. You didn’t need to evaluate five Stack Overflow answers anymore. You just glanced at a ghost-text suggestion and hit Tab. GitHub’s own study said Copilot users finished a JavaScript task around 55% faster. Microsoft Research published a nearly identical result. 55.8% faster on the same kind of task. I remember that, because that 55% number was in basically every dev talk in 2022. Could we make coders more productive? That’s pretty big news given how expensive software engineers are.

But, around the same time, a paper called “Asleep at the Keyboard?” systematically tested Copilot across security-critical categories and found the model was completely willing to recommend insecure code.

So, it made us more productive, but we couldn’t trust it. Interesting, maybe even promising, but nowhere near truly valuable yet.

It was still about having humans in the driver’s seat and improving code suggestions. Basically, it was about saving time on searching for the right code and not replacing coders. This was the beginning of the end for Stack Overflow.

Then November 30, 2022 happened. You know what this means… ChatGPT. And within a week, conversational coding became normal. You could describe a bug, ask for a refactor, paste an error, and iterate in plain English. Stack Overflow’s reaction said it all. The first shift we saw was people trying to grind the Stack Overflow leaderboards by answering more questions, while just copy-pasting ChatGPT’s answers. Then, in December 2022, the moderators banned ChatGPT-generated answers because the responses were confidently wrong at scale and were crushing moderation capacity. The core product of Stack Overflow was trust, and early LLMs walked in and undermined it.

This is where most people shifted from Stack Overflow to ChatGPT for debugging.

Similarweb reported Stack Overflow traffic declining alongside ChatGPT’s growth, including a widely-cited drop of about 14% month over month in 2023.

And in the middle of all that, in January 2023, Karpathy, a founding member of OpenAI and one of the most important people in the AI space, posted a line that became the cultural prequel to everything that comes next. “The hottest new programming language is English.” It instantly went viral.

Still, this was only ChatGPT. It was mostly used to debug individual functions. It hardly worked on a single big file, let alone across multiple files.

Two months later, in March 2023, two things happened almost at the same time. OpenAI released GPT-4, which was a massive leap in reasoning and code generation quality. And a small startup called Anysphere, out of OpenAI’s own accelerator, publicly launched Cursor. A fork of VS Code, but rebuilt around one idea: the AI should see your whole codebase, not just the file you give ChatGPT. That sounds obvious now, but in early 2023, nobody was doing it. We were still copy-pasting code into ChatGPT to get new code to paste back into our code editor.

ChatGPT was basically a glorified Stack Overflow that gave us the same kind of code, but already adapted to our script, instead of having to adapt it for each variable name.

Cursor added deep codebase indexing and, over the following months, a feature called Composer that could do multi-file edits through a single prompt. It finally wasn’t just about autocompleting anymore. Nor just about chat. Actual coordinated changes across your project. Across files. And even cooler, we finally didn’t have to alt-tab constantly and copy-paste lines around. It was all within the same code editor.

Throughout 2023 and 2024, the competitive landscape started heating up. Replit shipped an AI assistant. Amazon launched CodeWhisperer. Google released Gemini Code Assist. But the two main tools that really pushed the SOTA were Cursor and a newcomer called Windsurf. Codeium, the company behind it, launched the Windsurf Editor on November 13, 2024, marketing it as “the first agentic IDE.” Also a VS Code fork, but with a different philosophy. Their core feature, Cascade, proactively watches your terminal output and your actions in the editor and suggests repo-wide changes before you even ask. Basically, a precursor to Claude Code. It was faster, more affordable, and by early 2026 it had over 700,000 developers. Some people picked Cursor for depth. Some picked Windsurf for speed. But both were doing the same thing. They were turning the IDE from a text editor into an orchestration layer managed by the developer.

And then, twelve days later, on November 25, 2024, Anthropic open-sourced something that changed the entire infrastructure layer underneath these tools. The Model Context Protocol, MCP. An open standard that connects AI applications to external systems. I covered MCPs in a video if you haven’t yet looked into these. You can use MCPs as connectors to databases, file systems, Slack, Jira, Drive, anything with an API. Anthropic called it “a USB-C port for AI.” By early 2026, the MCP SDK was getting around 97 million downloads per month, making it one of the fastest-adopted open-source projects in AI history. MCP is one of the reasons agents can actually do things beyond reading and writing code files. Function calling gave LLMs hands, and MCP democratized it.
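To make the “connector” idea a bit more concrete, here is a minimal sketch of what a tool call looks like on the wire. As far as I understand the spec, MCP messages are JSON-RPC 2.0 and a tool is invoked with a `tools/call` method, so treat the exact field names as my reading of the spec rather than a reference. The tool name `search_issues` and its arguments are hypothetical, and a real client would use an MCP SDK instead of building messages by hand.

```python
import json
from itertools import count

# Simple incrementing request ids, as JSON-RPC expects one id per request.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message asking a server to run one tool.

    Assumption: the method name "tools/call" and the "name"/"arguments"
    params follow my reading of the MCP spec. This only builds the
    message; it does not send it anywhere.
    """
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool exposed by an issue-tracker connector.
example = tool_call_request("search_issues", {"query": "login bug"})
```

The point is that the agent does not need a custom integration per service. It sends the same kind of message whether the server behind it wraps Slack, Jira, or a database.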

And, finally, 2025. This is the year everything truly changed.

By now, most developers were still using ChatGPT over Cursor and Windsurf, mostly because they didn’t trust these tools to control the code going into their projects, even though the new tools were quickly gaining traction. But this was about to change…

On February 6, 2025, Karpathy posted a 185-word thread on X that got over 4.5 million views. He described what he was actually doing with Cursor and SuperWhisper. Talking to the AI, barely touching the keyboard, hitting “Accept All” without reading the diffs, pasting error messages back in until things worked. He called it “vibe coding.” He gave a name to this new way of coding. Collins Dictionary even named it its Word of the Year for 2025.

And yes, even Karpathy said it: this was nowhere near replacing developers, especially experienced ones. We had to approve all of its work, otherwise it would get stuck in endless loops and break everything. It wasn’t autonomous, but it was the start of actually being able to code without… coding.

A few weeks after that tweet, on February 24, 2025, Anthropic released Claude Code as a research preview. This is where the story shifts from tools that help you write code to tools that write code for you. Claude Code was only a terminal agent. No IDE required. It reads your codebase, modifies files, runs commands, and works inside the terminal. But the real innovation was not the agent itself. It was the conventions it introduced.

First, CLAUDE.md. A markdown file at the root of your project that the agent reads at session start. You put your coding standards in there, your architectural decisions, your preferred libraries. Quickly, people started using it as a feedback loop. Every time the agent makes a mistake, you tell it to update its own CLAUDE.md so it never makes that mistake again. The agent writes its own governance over time. It started to be able to self-improve. But that led to a problem. The CLAUDE.md file quickly became way too big for the model to process.
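Here is a minimal sketch of that feedback loop, written as a plain script so you can see the mechanics. In practice you would just ask the agent to update the file itself. The helper name `record_lesson` and the `MAX_CHARS` budget are my own assumptions for illustration, not anything Claude Code defines.

```python
from pathlib import Path

# Hypothetical size budget. Claude Code does not document this number;
# it only stands in for "the file got too big to be useful".
MAX_CHARS = 8_000


def record_lesson(context_file: Path, lesson: str, max_chars: int = MAX_CHARS) -> bool:
    """Append a lesson as a bullet, skipping exact duplicates.

    Returns True while the file is still within the size budget,
    False once it has outgrown it and needs pruning or splitting.
    """
    if context_file.exists():
        existing = context_file.read_text(encoding="utf-8")
    else:
        existing = "# Project rules\n"

    bullet = f"- {lesson.strip()}"
    if bullet not in existing.splitlines():
        existing = existing.rstrip("\n") + "\n" + bullet + "\n"
        context_file.write_text(existing, encoding="utf-8")

    return len(existing) <= max_chars
```

Calling `record_lesson(Path("CLAUDE.md"), "Use pnpm, not npm")` after each mistake is the whole loop. The `False` return value is the problem described above: the file keeps growing until it stops helping.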

This is what led to the creation of what we now call skills. Simple folders with a markdown file inside. Think standard operating procedures, but for agents. How to generate a migration. How to write release notes. How to prep a PR. Reusable capabilities you can share across projects and teams. A skill is basically just a recipe for how to do a task, along with everything the agent needs to know to do it.
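A rough sketch of why this helps with the oversized-context problem: you only keep a one-line summary of each skill in view, and load the full recipe when a task actually needs it. The file name `SKILL.md`, the folder layout, and both helper functions are assumptions I am making for illustration. Check the Claude Code skills docs for the real conventions.

```python
from pathlib import Path

# Assumed file name for the markdown file inside each skill folder.
SKILL_FILE = "SKILL.md"


def list_skills(skills_dir: Path) -> dict[str, str]:
    """Map each skill folder name to the first non-empty line of its markdown file."""
    summaries: dict[str, str] = {}
    for folder in sorted(p for p in skills_dir.iterdir() if p.is_dir()):
        skill_md = folder / SKILL_FILE
        if not skill_md.is_file():
            continue  # not a skill folder, skip it
        lines = [
            line.strip()
            for line in skill_md.read_text(encoding="utf-8").splitlines()
            if line.strip()
        ]
        # Drop leading markdown heading markers from the summary line.
        summaries[folder.name] = lines[0].lstrip("# ") if lines else ""
    return summaries


def load_skill(skills_dir: Path, name: str) -> str:
    """Read the full recipe for one skill, only when it is needed."""
    return (skills_dir / name / SKILL_FILE).read_text(encoding="utf-8")
```

With a `skills/release-notes/SKILL.md` on disk, `list_skills(Path("skills"))` gives you a cheap index, and `load_skill(Path("skills"), "release-notes")` pulls in the details for that one task only.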

And this ties back to MCP. The model is the brain. CLAUDE.md is the memory. Skills are the capabilities. MCP is the hands. And subagents, running in parallel terminal sessions, are the team, each with their own memory, skills, and connections to external systems.

In May 2025, Claude Code went generally available alongside Claude 4. And the growth was immediate. It hit a billion-dollar annualized run rate within six months.

The competition between ChatGPT, Claude Code, and Cursor-like systems (chat vs. terminal vs. IDE) was at its peak.

On October 29, 2025, Cursor shipped version 2.0. Multi-agent architecture. Up to eight agents running in parallel using git worktrees, each in an isolated workspace so they don’t step on each other’s changes. A proprietary model called Composer, roughly four times faster than comparable models. Agents that can read the DOM and run their own end-to-end tests. This was not an IDE update. This was a third paradigm shift. Cursor crossed one million daily active users and was valued at 9.2 billion dollars after a 400 million Series B.
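If you have never used git worktrees, here is a small sketch of the isolation idea: one branch and one working directory per agent, all backed by the same repository. This is not how Cursor implements it. The branch naming, the directory layout, and the function name are my own assumptions. The function only builds the `git worktree add` commands so you can inspect them before running anything.

```python
import re
from pathlib import Path


def worktree_commands(repo: Path, tasks: list[str], max_agents: int = 8) -> list[list[str]]:
    """Build one `git worktree add` command per task.

    Each agent gets its own branch and its own directory next to the
    main checkout, so parallel edits never touch the same files on disk.
    """
    if len(tasks) > max_agents:
        raise ValueError(f"at most {max_agents} parallel agents are supported")

    commands = []
    for index, task in enumerate(tasks, start=1):
        # Turn "Fix login bug" into "fix-login-bug" for a readable branch name.
        slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
        branch = f"agent/{index}-{slug}"
        workdir = repo.parent / f"{repo.name}-agent-{index}"
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(workdir)]
        )
    return commands
```

To actually create the workspaces, you could pass each command to `subprocess.run(cmd, check=True)`. When an agent is done, its branch gets reviewed and merged like any other branch.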

OpenAI was on the same track. On May 16, 2025, they relaunched Codex, not the old model from 2021, but a completely new cloud-based software engineering agent. It runs inside an isolated container with internet access disabled during task execution. You can inspect citations, terminal logs, and test results after each run. And they introduced AGENTS.md, conceptually the same idea as CLAUDE.md. The entire industry converged on the same pattern: give the agent persistent memory and let it learn from its mistakes.

We were increasingly removing the developer from the loop. Little by little.

Even code reviews were being automated with adversarial agents like CodeRabbit, which run security audits and vibe checks on everything the execution layer produces.

This is the moment Boris Cherny’s quote from the start of the article comes from. He says that, since November 2025, 100 percent of his own code has been written by Claude Code. And on the Claude Code team, roughly 95 percent of their shipped code is written using Claude Code itself. The tool builds itself. Humans review and steer, but the ratio of typing to supervising has completely flipped.

And then the guy who started it all confirmed the shift himself. In January 2026, Karpathy posted that in just four weeks, between November and December 2025, he went from writing about 80 percent of his code manually with AI handling maybe 20 percent, to the complete opposite. 80 percent agents, 20 percent edits and touchups. By mid-March 2026, he said he hadn’t typed a single line of code since December. He was running up to 20 agents in parallel, directing them like a team lead, managing intent, context, and direction while they did the actual coding. He called it “AI psychosis” and said it was the biggest change in his working practice in 20 years. The man who named vibe coding in February 2025, while using it for fun, was fully living it twelve months later.

At the same time, still in January 2026, the consumer-agent wave hit. An Austrian developer named Peter Steinberger launched an autonomous agent, first called Clawdbot, then Moltbot, then renamed OpenClaw to dodge trademark conflicts with Anthropic’s Claude. 180,000 GitHub stars in days. Unlike passive chatbots, OpenClaw actually does things. Checks your calendar. Books flights. Manages coding workflows without hand-holding. In February 2026, Steinberger joined OpenAI. We were finally going from coding agents and terminal work to having agents do everything from start to finish.

But with OpenClaw and similar agents spreading this fast, security became a real issue. Researchers found hundreds of malicious extensions in agent marketplaces like ClawHub. The security community started insisting that agents be treated as non-human identities with strict access controls.

And this is not just an OpenClaw problem. In late March 2026, Anthropic accidentally exposed a large portion of the Claude Code TypeScript source via an npm source map, which I also covered on the channel.

So where does this leave us in April 2026? GitHub Copilot has 20 million users. 92 percent of US developers use AI coding tools daily. Between 41 and 46 percent of new code is AI-generated. 59 percent of developers use three or more AI tools in parallel. Pull request cycle time is down around 75 percent. This is not an early-adopter story anymore. It is the default. Developers are increasingly moving to Claude Code in the terminal, and even non-coders use it through the Desktop app with Cowork. Code is more accessible than ever, yet less accessible than ever, with only agents knowing how to actually code.

But, at the same time, between 46 and 71 percent of developers distrust AI output accuracy. And remember what I told you to keep in your head at the beginning? The METR study. A randomized controlled trial found that developers using AI were actually about 19 percent slower at completing complex tasks, while believing they were about 20 percent faster. So the perception gap is real. AI tools hijack the reward system. You remember the quick wins, like generating a boilerplate file in seconds, and you feel productive.
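To see how big that gap is, here is the arithmetic on a single task. I am assuming “19 percent slower” means the task takes 19 percent longer, and “20 percent faster” means developers believed it took 20 percent less time. The 60-minute baseline is just an example I picked, not a number from the study.

```python
def perception_gap(
    baseline_minutes: float,
    actual_slowdown: float = 0.19,    # assumed reading: task takes 19% longer with AI
    perceived_speedup: float = 0.20,  # assumed reading: developer believes it took 20% less time
) -> dict[str, float]:
    """Compare how long a task really took with AI vs. how long it felt."""
    actual = baseline_minutes * (1 + actual_slowdown)
    perceived = baseline_minutes * (1 - perceived_speedup)
    return {
        "actual_minutes": round(actual, 1),
        "perceived_minutes": round(perceived, 1),
        "gap_minutes": round(actual - perceived, 1),
    }


# A task that takes 60 minutes without AI (illustrative baseline).
result = perception_gap(60)
# -> actual 71.4 minutes, perceived 48.0 minutes, gap 23.4 minutes
```

So on a one-hour task, you would feel like you saved 12 minutes while actually losing about 11. Under these assumptions, that is a gap of more than 23 minutes between what happened and what you remember.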

Enrico Papalini put a name on this with his “70% Problem.” AI cleanly delivers about 70 percent of an application. Scaffolding, boilerplate, plausible tests. The remaining 30 percent, the production-grade edge cases, the security, the architectural integrity, still needs deep human engineering. That is where vibes stop helping.

Which leads us to another growing issue. Employment among 22-to-25-year-old developers fell by nearly 20 percent between 2022 and 2025. The tasks juniors used to learn on are exactly the tasks AI ate first.

So, five years ago you wrote code. Four years ago you accepted suggestions. Two years ago you prompted. One year ago you got a name for it, and it sometimes, unpredictably, worked. Two months ago, you supervised and reviewed. Today, some people just press go and let the agents be.

Agentic engineering is the discipline that has to come next, and it is still being written, in real time, by the same people using the tools. The goal is to use these agents the right way, not to let them run and blindly accept everything. Developers are now getting paid for their expertise and decision-making, not for how fast they can write code. Claude can do that. Which means you still need to build expertise and use your brain. Don’t fall behind by blindly using agents, or a junior, or even an upcoming agent, will replace you.

I’d love to know: where is your vibe-to-engineering ratio right now? And what tools are you using? Let me know in the comments!

On my end, I’m now fully into Claude Code for coding tasks and use it for almost everything, with improvement loops in each. I’ll soon share a video on exactly how I, and our team at Towards AI, use coding agents when building for our clients. If that sounds interesting to you, consider subscribing to the channel! If you do so, I’ll see you in the next one!

Sources & further reading:

WIRED on Claude Code and Boris Cherny: https://www.wired.com/story/claude-code-success-anthropic-business-model/
GitHub Copilot launch: https://github.blog/news-insights/product-news/introducing-github-copilot-ai-pair-programmer/
GitHub Copilot productivity research: https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/
“Asleep at the Keyboard?” Copilot security study: https://arxiv.org/abs/2108.09293
ChatGPT launch: https://openai.com/index/chatgpt/
Stack Overflow ban on ChatGPT-generated answers: https://meta.stackoverflow.com/questions/421831/policy-generative-ai-e-g-chatgpt-is-banned
Similarweb on Stack Overflow traffic decline: https://www.similarweb.com/blog/insights/ai-news/stack-overflow-chatgpt/
GPT-4 announcement: https://openai.com/index/gpt-4/
Cursor 2.0 announcement: https://cursor.com/blog/2-0
Windsurf Cascade: https://windsurf.com/cascade
Anthropic Model Context Protocol: https://www.anthropic.com/news/model-context-protocol
Claude Code overview: https://code.claude.com/docs/en/overview
Claude Code Skills: https://code.claude.com/docs/en/skills
OpenAI Codex relaunch: https://openai.com/index/introducing-codex/
Karpathy and the origin of “vibe coding”: https://blog.vibecoder.me/history-of-vibe-coding-from-karpathy-tweet-to-industry
Collins Word of the Year 2025: https://blog.collinsdictionary.com/language-lovers/collins-word-of-the-year-2025-ai-meets-authenticity-as-society-shifts/
Claude Code source leak: https://www.infoq.com/news/2026/04/claude-code-source-leak/
METR / AI coding productivity paradox: https://www.cerbos.dev/blog/productivity-paradox-of-ai-coding-assistants
The “70% Problem”: https://medium.com/@enrico.papalini/the-ai-coding-revolution-has-a-dirty-secret-and-the-data-proves-it-167ad93255a1
GitHub Copilot crosses 20M users: https://techcrunch.com/2025/07/30/github-copilot-crosses-20-million-all-time-users/

FAQ

What is vibe coding?

It is using AI agents to turn high-level intent into code quickly, often with less manual typing from the developer.

How do you use coding agents safely?

Give them good context, small tasks, tests, review steps, and project rules, then read the diff before trusting it.

What is the danger of vibe coding?

The danger is moving fast without understanding the system, which can leave hidden bugs, security issues, or messy architecture.

What is the useful lesson from Coding Changed Forever?

AI coding agents changed how software gets built, but vibe coding breaks quickly without reviews, tests, context, and taste.

What makes AI coding agents and vibe coding reliable beyond a demo?

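Persistent context and feedback loops. Keep your standards in a CLAUDE.md or AGENTS.md file, and every time the agent makes a mistake, tell it to update that file so it never makes that mistake again. Then back it up with tests and human review.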

What is the easy mistake with AI coding agents and vibe coding?

Hitting “Accept All” without reading the diffs, then shipping whatever the agent produced because it looked like it worked.

How should builders use AI coding agents and vibe coding?

Review and steer what the agents produce, and keep building your own expertise. Don’t fall behind by blindly using agents, or a junior, or even an upcoming agent, will replace you.

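When do AI coding agents and vibe coding become useful in practice?

When the agent has real context and can act on it: a CLAUDE.md file at the root of your project that it reads at session start, skills for recurring tasks, and MCP connections to the external systems it needs.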

What should beginners understand about AI coding agents and vibe coding?

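That feeling faster is not the same as being faster. AI cleanly delivers about 70 percent of an application, but the remaining 30 percent (edge cases, security, architecture) still needs real engineering, so you still have to learn how the system works.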