The vibe coder who changed everything
A couple of months before the release of a project I’d spent months polishing, my client brought in an AI consultant. The team was building AI features into parts of the product, and this guy was supposed to help us optimize them.
I was curious. The way he was introduced (on top of the latest releases, building projects, optimizing workflows) made him sound like a heavy hitter. I suggested the team do a prep session with him. Let’s see how we can speed things up.
Then the call started.
What I thought was a seasoned specialist turned out to be a vibe coder who had talked his way into the right rooms. He wasn’t bad at what he did, but his technical depth was… thin. He couldn’t explain why things worked, just that they did.
But here’s the thing that stuck with me: even with zero technical knowledge, the code he was producing with Claude Code was surprisingly good. Not perfect, not production-ready, but far beyond what I expected from someone who couldn’t explain a for loop.
And that’s when the question hit me:
“If he can do that without knowing what he’s doing, what am I going to be able to do with my knowledge?”
That question ruined me. In the best way.
Starting small: the trust curve
I didn’t dive in headfirst. Like most engineers, I started cautious.
At first, I used Claude Code the same way I used Copilot’s autocomplete. Nothing fancy, just saving myself the tedium of writing repetitive boilerplate. The kind of code that’s boring to write but necessary. Everyone has those files.
Then I started pushing a little further. I’d define a function signature (parameters, return type, some pseudocode describing the logic) and let the agent fill it in. Small helper functions at first. Things I could verify in seconds.
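A typical handoff looked something like this. The helper below is a hypothetical stand-in, not code from the actual project: the docstring’s pseudocode is the kind of spec I’d give the agent, and the body is the kind of completion it would produce.

```python
def chunk(items: list, size: int) -> list[list]:
    """Split `items` into consecutive sublists of at most `size` elements.

    Pseudocode given to the agent:
      - validate that size is positive
      - walk the list in steps of `size`
      - collect each slice into the result
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Small, boring, and verifiable in seconds — exactly the kind of function where handing off the implementation costs nothing and saves minutes.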
I tested everything obsessively before integrating anything. That part was non-negotiable. But the functions were correct more often than not, and each time one passed my tests, my trust grew a little.
Within weeks, my workflow shifted without me fully realizing it. I was still writing the important logic myself (the business rules, the core decisions) but everything repetitive or pattern-based? I handed that off. A lot of the codebase followed the same patterns with different names, so once I verified the agent understood the pattern, I let it run.
The moment I noticed I was getting significantly more done in the same hours, something clicked. This went way beyond autocomplete on steroids. A completely different way of working.
Enter Opus 4.5
If you were using Claude Code before and after Opus 4.5, you know. It marked a clear before and after in how software engineers related to coding agents.
I was already comfortable with my workflow, but Opus 4.5 changed what was possible within it. The model could hold more context, reason about architecture, and understand the relationships between different parts of a codebase in ways earlier versions couldn’t.
I started seeing people online using plan mode to make serious, structural changes to professional codebases. Not toy projects. Real production code. So I tried it.
The first time I built an entire feature using only plan mode (specifying what I wanted, reviewing the plan, approving it, and testing the output) without writing a single line of code myself, I sat there for a minute just processing what had happened. I’d gone from writing 100% of my code to writing maybe 10%, and the quality wasn’t worse. In some cases, it was better, because the agent was more consistent with patterns than I was when tired at 11pm.
That realization (that this was the floor, not the ceiling) sent me down the rabbit hole for real.
Building the workflow
I started studying how the best developers were using these tools. Taking notes, trying different approaches, adapting what worked to the project I was actually shipping.
First, I set up the main repository with proper context files: architecture docs, patterns, conventions. Within days, the agent had a working mental model of the entire project across multiple repos. Then I connected two crucial MCPs: our database and AWS, since most of the project’s infrastructure lived there.
Now the agent wasn’t just writing code in a vacuum. It could query the database schema, understand the deployment stack, and plan features with the full picture of how the system actually worked.
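For context, Claude Code lets you declare project-scoped MCP servers in a `.mcp.json` file at the repo root. The sketch below is illustrative, not my actual setup: the Postgres entry uses a reference MCP server package, and the AWS entry is a placeholder name standing in for whichever AWS server you wire up.

```json
{
  "mcpServers": {
    "database": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost:5432/app"]
    },
    "aws": {
      "command": "uvx",
      "args": ["your-aws-mcp-server"]
    }
  }
}
```

Once a file like this is checked in, every session in that repo starts with those connections available, so the whole team’s agent sees the same infrastructure.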
My daily workflow became:
1. Start with a Q&A session. I describe the feature or fix with as much technical detail as possible. Requirements, constraints, edge cases I already know about.
2. Answer every clarification question. This is the part most people skip, and it’s the most important step in the entire process. When people complain about AI hallucinating (making up APIs that don’t exist, using patterns that don’t fit the codebase) it’s almost always because they didn’t give it enough context. If you treat the agent’s questions as annoying interruptions instead of crucial calibration, you’ll get garbage out. Answer them thoroughly.
And this is where MCP integrations become a game changer. Remember those database and AWS connections I set up? Those aren’t just nice-to-haves. They directly feed into this step. Instead of me manually explaining “this table has these columns” or “this Lambda does X,” the agent can query the database schema itself, check the actual AWS resources, and build its understanding from the real source of truth. MCP turns context that you’d normally have to type out by hand into something the agent can just look up. The fewer things you have to explain manually, the fewer gaps there are for hallucinations to fill. It’s the difference between describing your house to someone over the phone and handing them the keys so they can walk through it themselves.
3. Review the plan before any code is written. Sometimes the agent proposes patterns that don’t align with the codebase or makes architectural choices I’d do differently. Catching this at the plan stage costs minutes. Catching it after implementation costs hours.
4. Let it execute, but watch. I keep an eye on what commands it’s running and what files it’s touching. Not micromanaging, just making sure nothing weird is happening. If something weird does start happening (calling the wrong tools, trying an approach that doesn’t follow best practices, skipping steps just to ship code), I stop it right there and give it the right direction.
5. Test everything. Early on, I tested manually. Now I have the agent write end-to-end tests for every feature, covering all the paths I can think of. Then I review the tests themselves to make sure they’re actually testing what matters.
This isn’t a perfect workflow. I know it can be optimized further. But it’s where I am right now, and the output speaks for itself.
Suffering from success
I was running all of this on Anthropic’s $20/month Pro subscription. It worked, but the MCP servers were eating context fast, and I kept hitting usage limits at the worst times. Mid-feature. Mid-debugging session.
The math was simple: if this tool is saving me hours every day, paying more for uninterrupted access is a no-brainer. I upgraded to the $100/month Max subscription.
Plot twist: I haven’t hit a limit since.
In fact, I ran into the opposite problem. Every week when usage reset, I’d see I was at maybe 20% utilization. I was underusing a subscription designed for power users.
That felt wrong. I’m paying for capacity I’m not using.
So I started looking into what other developers were doing with their spare usage, and I found people running Claude Code overnight in headless mode, looping through tasks while they slept.
I decided to try it with something I’d been wanting to do anyway: research for side projects. I had ideas for games, apps, tools. Things I wanted to build eventually but never had time to properly investigate.
I started queuing up research prompts every night before bed. Each one explored a different idea from multiple perspectives: market viability, technical feasibility, existing competition, architecture options. By morning, I’d wake up to a full report with synthesized findings and conclusions.
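The nightly loop itself is simple. Here’s a minimal sketch of the idea: it shells out to the Claude Code CLI in non-interactive print mode (`claude -p "<prompt>"`) once per prompt and saves each reply as a report. The function and file names are my own, and `agent_cmd` is parameterized so you can swap in another command to dry-run the loop without the CLI installed.

```python
import subprocess
from pathlib import Path

def run_overnight(prompts, outdir="reports", agent_cmd=("claude", "-p")):
    """Run each research prompt through the agent headlessly, one report per prompt.

    `agent_cmd` defaults to the Claude Code CLI's print mode; pass a different
    command tuple (e.g. ("echo",)) to test the loop itself.
    """
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, prompt in enumerate(prompts, start=1):
        # One non-interactive agent run per prompt; stdout is the full reply.
        result = subprocess.run([*agent_cmd, prompt], capture_output=True, text=True)
        report = out / f"report-{i}.md"
        report.write_text(result.stdout)
        written.append(report)
    return written
```

Queue it before bed (a cron entry or a plain `python run_overnight.py` works), and the reports are waiting in the morning.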
The amount of useful, structured information I was getting while sleeping was frankly absurd. Ideas I’d been sitting on for months suddenly had feasibility analyses, competitor breakdowns, and technical roadmaps.
I was hooked. But there was still one limitation I couldn’t shake: all of this required me to be at my computer. Every interaction, every prompt, every review: I had to be sitting at my desk.
And then I found something that changed that entirely.
But that’s a story for the next post.
This is part 1 of a series about my journey with AI-assisted development. Next up: how a persistent AI companion broke me free from the desk — and what happened when I gave it access to everything.