
How I Accidentally Became a Team Manager

How planning mode, subagents, Ralph loops, and review changed my day-to-day role from developer to agent team manager.

Gert Jansen van Rensburg


Software Consultant

Illustration of a developer at a desk orchestrating multiple digital agent panels around a workflow loop

Somewhere in the last six months, my job changed.

Not in title. Not in the org chart. Not in the way Teams reports it.

But in the way the work actually happens, the shift has been hard to ignore. I’m still a software developer, but more and more of my day feels like managing a small team. The strange part is that the team is mostly agents and subagents, and my main job is to keep them pointed at the right outcome.

I didn’t set out to become a team manager. It happened slowly, one workflow improvement at a time.

At first, I used agents the obvious way: give them a task, let them write code, review the output, fix the rough edges. That was useful, but it still felt like a faster version of the same developer loop.

The bigger change came when I stopped treating the agent as a code generator and started treating the main chat as the team lead.

That main thread became the place where the story lives. It holds the context, the questions, the tradeoffs, the decisions, the plan, the implementation status, and the final review. The subagents do focused work inside that frame, but the main thread keeps the shape of the work intact.

That’s where the job started to feel different.

Start With the Story, Not the Code

The loop usually starts with a ticket or user story.

I bring the ticket into planning mode and let the agent interrogate it. What’s the actual goal? What are the acceptance criteria? What’s in scope? What’s out of scope? What can be inferred from the codebase, and what still needs a human decision?

This part matters more than it sounds.

In the past, I’d often read a story, build a mental model, and start implementation once I felt I had enough context. Now I make that thinking explicit. The agent asks questions. I answer. It inspects the repo. It updates the plan. We keep iterating until both of us agree on what needs to be implemented.

That sounds slower, but it saves time later.

Most implementation mistakes don’t come from bad syntax. They come from unclear intent. The planning loop forces the ambiguity to the surface before code starts moving.

By the time implementation begins, the work is no longer “go build this vague thing.” It’s a shared plan with decisions already made.

The Main Thread Becomes the Orchestrator

Once the plan is clear, the main chat becomes the orchestrator.

This is the part that has changed my role the most.

I don’t want every agent trying to solve every problem at once. That gets noisy quickly. Instead, the main thread holds the goal and delegates smaller pieces of work: architecture inspection, visual review, code review, tests, or acceptance criteria.

I had already worked out how to run these agents in parallel without them colliding. Orchestration is the layer that sits on top: deciding what each one should work on and keeping the results coherent.

The important part is that the main thread stays in charge of the workflow.

It knows what the original story asked for, what decisions were made in planning, which checks have passed, which risks remain, and what still needs human judgment.

That makes it feel less like “ask AI to code” and more like running a delivery loop.

The agents do the work, but the main thread keeps the work coherent.
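
As a rough illustration of what "holding the goal" might look like, here is a minimal Python sketch. The MainThread class, its fields, and the example story are all hypothetical names invented for this sketch, not part of any agent tool; it only models the bookkeeping described above (story, decisions, checks, risks) and the idea of handing a subagent one focused brief.

```python
from dataclasses import dataclass, field


@dataclass
class MainThread:
    """Hypothetical record of what the orchestrating thread keeps track of."""

    story: str
    decisions: list[str] = field(default_factory=list)
    checks: dict[str, bool] = field(default_factory=dict)  # check name -> passed?
    risks: list[str] = field(default_factory=list)

    def delegate(self, task: str) -> str:
        """Build a focused brief for one subagent, framed by the story and decisions."""
        decided = "; ".join(self.decisions) or "none yet"
        return f"Task: {task}\nStory: {self.story}\nDecisions so far: {decided}"

    def record(self, check: str, passed: bool, risk: str = "") -> None:
        """Fold a subagent's result back into the shared state."""
        self.checks[check] = passed
        if risk:
            self.risks.append(risk)

    def outstanding(self) -> list[str]:
        """Names of recorded checks that have not passed."""
        return [name for name, passed in self.checks.items() if not passed]


# Illustrative usage with a made-up story.
thread = MainThread(story="Add CSV export", decisions=["reuse existing report query"])
brief = thread.delegate("code review")
thread.record("tests", True)
thread.record("visual review", False, risk="button overlaps on mobile")
print(thread.outstanding())
```

The design choice worth noticing is that subagents only ever receive a brief, while the full state stays in one place.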

Ralph Loop in Spirit

The closest name I have found for this pattern is the Ralph loop.

Geoffrey Huntley writes about it in “everything is a ralph loop”, and the idea clicked for me because it described what I had started doing manually: give the system context and a goal, observe the result, feed the findings back into the loop, and keep going.

I’m not running some grand autonomous software factory.

Most of the time, my loop is much more boring:

  • clarify the story
  • make a plan
  • implement the plan
  • review the implementation
  • simplify what can be simplified
  • verify against the original acceptance criteria
  • ask what was missed

Then I throw the findings back into the loop.

If the code review finds a risk, that becomes the next goal. If Playwright catches a visual issue, that becomes the next goal. If the implementation works but feels too complex, simplification becomes the next goal.

The useful part isn’t magic. It’s repetition with context.

Each loop tightens the work.
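
The "findings become the next goal" idea can be sketched as a small work queue. This is an assumption-laden illustration rather than how any particular agent tool works: run_loop, the Step type, the max_rounds cap, and the stubbed fake_step findings are all hypothetical. In practice the step would be an agent run, and a human would filter the findings before they go back in.

```python
from collections import deque
from typing import Callable

# A step takes the current goal and returns findings; an empty list means nothing to follow up.
Step = Callable[[str], list[str]]


def run_loop(initial_goal: str, step: Step, max_rounds: int = 10) -> list[str]:
    """Work through goals in order, feeding each round's findings back in as new goals."""
    goals = deque([initial_goal])
    completed: list[str] = []
    # Stop when there is nothing left to do or the cap on rounds is reached.
    while goals and len(completed) < max_rounds:
        goal = goals.popleft()
        findings = step(goal)   # e.g. implement, review, simplify, verify
        goals.extend(findings)  # every finding becomes a later goal
        completed.append(goal)
    return completed


def fake_step(goal: str) -> list[str]:
    """Stub standing in for an agent run; the findings here are made up."""
    followups = {
        "implement story": ["fix risk flagged in code review", "fix visual issue"],
        "fix risk flagged in code review": ["simplify the retry helper"],
    }
    return followups.get(goal, [])


result = run_loop("implement story", fake_step)
print(result)
```

The cap on rounds is there because an open-ended loop will always find something else to do; deciding when to stop remains a human call.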

The Old Parts Still Work

The more I use this workflow, the more I realise it’s not a brand-new way of building software.

It’s mostly the good parts of software development made more visible.

Clarify the intent before building. Slice the work small enough to reason about. Review the change. Simplify where the first version got heavy. Test the important paths. Check the result against the original acceptance criteria.

None of that is new.

Good teams have been doing those things for years. The difference is that agents make the middle part faster, so the surrounding discipline matters more. If implementation becomes cheap, then clarity, review, taste, and verification become the expensive parts.

That’s why this workflow works for me.

It’s not trying to replace the craft with prompts. It’s taking the practices that already worked and making them harder to skip. Planning is explicit. Review is explicit. QA is explicit. The loop gives the work a shape.

That shape is what keeps the speed useful.

Asking What Can Be Simplified

Another question I have added to the workflow is this:

What can we simplify?

I ask it after the implementation is done, not before.

Before implementation, simplification is often theoretical. After implementation, the code is real. The agent can inspect the diffs, look at the shape of the changes, and point out where we added too much, duplicated logic, overfit the solution, or made the code harder to review than it needed to be.

Sometimes the answer is useful. Sometimes it’s noise.

That’s fine. The point isn’t to accept every suggestion. The point is to create a deliberate pause between “it works” and “it’s ready.”

That pause is where a lot of quality comes from.

I review the simplification suggestions, keep the ones that make sense, reject the ones that do not, and then send the useful ones back through the loop.

Reviewing Against the Original Acceptance Criteria

The final part of the workflow is where the team lead feeling really kicks in.

Once the implementation is complete and simplified, I ask whether we meet the original acceptance criteria.

Not the plan as it evolved.

Not the code as it currently exists.

The original story.

That distinction matters because agents are very good at following the latest context. Sometimes too good. If the conversation drifts, the agent can optimise for the plan we created and forget the business outcome that started the work.

So I bring the acceptance criteria back into focus.

Depending on the work, that review might be static analysis, a code review, a test run, or a visual review with Playwright. For UI work, I want screenshots, console checks, and a design review. For backend work, I want tests, edge cases, and a review of how the changes fit the existing architecture.

Subagents can inspect different angles, but the main thread keeps asking the boring delivery questions:

  • Did we build what the story asked for?
  • Did we verify the important paths?
  • Did we introduce unnecessary complexity?
  • Did we leave any obvious QA gaps?

None of that is glamorous, and none of it makes for a good demo. But it’s exactly where the value lives.
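
One way to keep that final check honest is to track evidence against the wording of the original story rather than the evolved plan. The sketch below is a hypothetical helper, with Criterion, unverified, and the example criteria all invented for illustration; it simply lists which original criteria have no recorded evidence yet.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    text: str           # wording copied from the original story, not the evolved plan
    evidence: str = ""  # e.g. a test name or screenshot reference; empty means unverified


def unverified(criteria: list[Criterion]) -> list[str]:
    """Return the original criteria that have no recorded evidence yet."""
    return [c.text for c in criteria if not c.evidence.strip()]


# Illustrative usage with made-up criteria.
criteria = [
    Criterion("User can export the report as CSV", evidence="test_export_csv"),
    Criterion("Export button is visible on mobile"),
]
gaps = unverified(criteria)
print(gaps)
```

Anything that shows up in the returned list is a candidate for the next pass through the loop.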

Did We Miss Anything?

I keep one more question for the very end, and it’s the riskiest one I ask:

Did we miss anything?

I know that’s dangerous. It’s open-ended, and an agent will almost always return with something. If you let that question run wild, it can generate endless imaginary risks.

But used carefully, it has been useful.

The reason it works is that the main thread has the whole workflow in context: the story, the plan, the implementation, the review findings, the simplification pass, the tests, and the verification steps. It can look across the whole delivery path and point out gaps that are easy to miss when you are deep in the code.

It might notice that we never checked a mobile breakpoint. It might notice that a test only covered the happy path. It might notice that we reviewed the implementation but never compared it back to one acceptance criterion.

I still make the judgment call.

But the question gives me one last chance to catch the kind of thing that would otherwise show up in review, testing, or production.

The AI Sandwich

I recently read a LinkedIn post from Damian Maclennan about the AI Sandwich, and it lined up closely with my current mental model.

AI is getting very good at the middle of software development. Give it a clear enough request and it can produce a lot of working code.

The harder parts are still on either side of that: knowing what you are actually trying to achieve, and knowing how to debug, review, and reason about the output when something goes wrong.

The developers who struggle most with agents aren’t always the ones who write bad prompts. Often, they’re the ones who don’t yet have a clear enough picture of the outcome, or enough system experience to know when the generated answer is wrong.

That’s why this workflow has made me more deliberate, not less technical.

The agent can do more of the implementation, but someone still has to decide what should exist, what tradeoffs are acceptable, and whether the result belongs in the system. Someone still has to review the work when the happy path passes but the design feels off.

That’s the gap this loop is trying to cover.

The Role Shift

This is the part I keep coming back to.

My role has not become less technical. If anything, it has become more technical in different ways.

I still need to understand the code. I still need to spot bad abstractions. I still need to know when a test is meaningful and when it is just there to make the output look complete. I still need to understand the system well enough to know when the agent is confidently wrong.

But the shape of the day has changed.

I give clearer instructions. I wait for work to complete. I review. I ask for changes. I review again. I test. I check the acceptance criteria. I decide whether the work is done.

Then I kick off the next cycle.

That sounds a lot like team leadership.

The difference is that the team is made up of agents, and the feedback loop is much faster.

That doesn’t remove accountability. It sharpens it.

At the end of the day, my name is still on the pull request. The agents can plan, implement, review, and verify, but I’m still responsible for the result.

That’s the biggest lesson from the last six months.

AI didn’t make me stop being a developer. It made me far more deliberate about how I manage the work.
