Many articles have been written about the levels, or ladders, of AI usage. And they’re often framed as if ascending to the peak level is the goal. The metaphor encourages this instinct. I want to be at the top! But it’s important to reframe this so that we see the level as a property of the problem, not the person.
You’re not a Level 2 user, rather the problem you’re facing requires a Level 2 approach.
For example, writing this article required a Level 0 approach for a quick brainstorm and a Level 3 approach for final editing, review, and publishing, both within the same day.
Choosing the simplest tool (or rung) that solves the problem is a critical judgement, given the costs. Those costs are not just dollars, but also the oversight and mental tax that come with added complexity and volume of output. I’ve seen a one-paragraph task get handed to an orchestrated set of sub-agents that took longer to set up, cost more to run, and produced output that needed more review than just writing the thing would have. The work got done. It just got done worse, slower, and at higher cost, while dressed up as sophistication.
There is a progression from chat to agents to managed workflows, and it’s worth understanding. But it’s not a road you’re supposed to travel beginning to end. It’s a ladder, and the goal is not to reach the top. The goal is to stand on the lowest rung that reliably solves the problem in front of you.
01 · The map
The five levels.
I’ve seen these represented in as few as four levels and as many as ten. There is no standard; this is just how I think about them.
| Lvl | Name | The user’s posture | What’s new |
|---|---|---|---|
| 0 | Chat | “Help me think.” | Conversation, reasoning, drafting |
| 1 | Workspace / Copilot | “Help me change this.” | AI acts directly on your files, code, or docs |
| 2 | Connected / Skilled Agents | “Do it the way we do it.” | Reusable skills, connectors, memory, standards |
| 3 | Delegated Work | “Break this apart.” | One agent decomposes work to specialists |
| 4 | Managed Workflow | “Run this process.” | Explicit roles, handoffs, gates, and review |
The shifts, in plain language: help me think → help me change this → do it our way → break this apart → run this process.
Level 0 — Chat.
You bring the context, ask, push back, and decide what to do with the answer. It’s flexible, fast, and requires zero setup. The limit: you are the integration layer. You paste the context in, and you carry the output back out by hand.
Level 1 — Workspace.
This is the “copilot” step. Here, the AI starts making changes directly in your filesystem: editing a file, running a test, showing a diff. This is where “agent” starts to mean something, not because it’s autonomous but because it can act in context. Finally, no more copy/pasting! The limit: it’s generic. It doesn’t know your standards or your systems, and you find yourself repeating the same instructions.
Level 2 — Connected / Skilled Agent.
In this step the agent gets context and skills: connectors to the systems where your work actually lives, skills that encode your playbooks, memory of your conventions. The request shifts from “update this code” to “implement the next backlog item using our plan-first process, follow the repo conventions, and prepare the PR summary in our format.” This is where AI becomes organizationally useful: it doesn’t just edit a file, it does the work the way you want it done. The limit: one agent holding planning, execution, testing, and critique in a single context eventually gets crowded and slow, and can begin to contradict itself.
Level 3 — Delegated Work.
One agent decomposes the job and hands pieces to specialists: a planner, an implementer, a reviewer, a skeptic. The real value here isn’t simply parallelism, it’s specialization: each sub-agent gets its own instructions, context, tools, and standard for “good.” The limit: coordination cost. Sub-agents duplicate work, misread the assignment, and generate more material than you wanted to read.
Level 4 — Managed Workflow.
The work moves through a designed process: defined triggers, required context, roles, artifacts, checks, and the gates where a human approves or intervenes. This is not a more powerful agent, it’s a system of work. It’s the right level when the same sequence runs repeatedly, intermediate artifacts matter, and mistakes have consequences.
02 · The economics
The costs don’t rise, they jump.
The five levels are a menu of capability. They’re also a menu of cost. Every rung up multiplies two things: the complexity of the system and the volume of output it produces.
Complexity is obvious. A chat window has no moving parts. A managed workflow has triggers, roles, connectors, handoffs, and gates. Each of these can break or go out of date, and each is another thing you now have to understand, monitor, and maintain.
Volume is the sneaky one. A single agent gives you one answer. Four sub-agents give you four, plus all the intermediate artifacts they pass between them. Climbing the ladder doesn’t always make the work better, but it does always make more of it, and someone (usually you) has to deal with all of it.
That’s where the bill comes due, in three places:
Oversight.
For most tasks, AI still isn’t good enough to “ship to main.” What it is good at is generating a lot of output that it is confident is ready to ship. Reviewing that output becomes a tax that often shows up unexpectedly, not just in time but in your focus and attention. It is similar to the tax managers face: decisions, context-switching, coordination, alignment. When individual contributors (ICs) run into this, they realize that their job has changed: they’re no longer only doing the work, they’re managing the work the AI produced. That creates many challenges, but the main concern for this article is that it is new work.
Coordination.
Once work is split across agents, something has to route it, reconcile conflicting outputs, and decide what happens next. That orchestration is real. And it’s ongoing engineering, not just a one-time setup.
Dollars.
More context, tools, agents, retries, and review loops all mean more model calls and token usage, plus the tooling and permissions to support them. This can come as a surprise because AI pricing is uniquely opaque compared to other productivity tools we’ve all used in the past. Costs are not always tied to a fixed monthly subscription: they can be pooled, and rates can differ between API access and user subscriptions, and from model to model. It’s a consumption model, and yet the user has no visibility into what a specific task or workflow might cost.
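One way to get some of that visibility back is a rough, back-of-the-envelope estimate before committing to a rung. The sketch below is only an illustration: the `Step` shape, the call counts, the token counts, and the per-million-token prices are all placeholder assumptions, not real rates from any provider.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stage of a run. Every number here is a placeholder assumption."""
    name: str
    calls: int          # model calls for this step, including retries
    input_tokens: int   # tokens sent per call
    output_tokens: int  # tokens generated per call


def estimate_cost(steps, usd_per_m_input, usd_per_m_output):
    """Rough dollar estimate for one run, given per-million-token prices."""
    total = 0.0
    for step in steps:
        per_call = (step.input_tokens * usd_per_m_input
                    + step.output_tokens * usd_per_m_output)
        total += step.calls * per_call / 1_000_000
    return total


# Hypothetical comparison: one chat turn vs. a delegated run with four specialists.
chat = [Step("chat", calls=1, input_tokens=2_000, output_tokens=500)]
delegated = [Step("orchestrator", calls=3, input_tokens=6_000, output_tokens=1_000)] + [
    Step(f"specialist-{i}", calls=2, input_tokens=4_000, output_tokens=1_500)
    for i in range(4)
]

# Placeholder prices; substitute whatever your plan actually charges.
PRICE_IN, PRICE_OUT = 1.0, 2.0
ratio = estimate_cost(delegated, PRICE_IN, PRICE_OUT) / estimate_cost(chat, PRICE_IN, PRICE_OUT)
print(f"Delegated run costs roughly {ratio:.0f}x the chat turn under these assumptions")
```

The absolute numbers matter less than the ratio. Even a crude estimate like this makes the cost of a rung visible before you climb it.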
And these costs don’t climb a smooth slope, they jump. Levels 0 through 2 are tame. The discontinuity is between Level 2 and Levels 3–4, where governance (permissions, logging, audit trails, approvals), human attention, and token usage all take a big leap at once. That’s the tradeoff in a sentence: climb a rung only when the value clearly beats the complexity, volume, and oversight it brings. And that’s much easier said than done.
03 · The discipline
You’re allowed to climb back down.
The ladder runs both directions. One of the most useful moves we make is abandoning a workflow that isn’t earning its complexity and dropping the task back to a skilled agent, or even to plain chat. If a process is generating more review than value, demoting it isn’t failure, it’s the discipline the framework is built on. Standing on a lower rung on purpose is a skill.
04 · The rule
The decision rule.
Move up when several of these are true at once: the task repeats, copy-paste is the bottleneck, it always touches the same sources, the output has to be saved or tested or shipped or audited, the work has independent parts, quality review genuinely matters, or your people are spending more time coordinating than deciding.
Stay put — or climb back down — when the task is one-off, the stakes are low, you can’t yet describe the artifact you want, you don’t know how you’d review the output, or you’re reaching for more agents because it sounds sophisticated. That last one is the tell.
The question is not whether a more advanced setup could solve the problem. It probably could. The question is whether the problem deserves the setup.
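The rule can be written down as a small checklist, which is sometimes enough to keep a team honest. This is a minimal sketch, not a scoring model: the field names are my own labels for the signals above, treating "several" as three or more is an assumption, and so is letting any single stay-put signal veto the climb.

```python
from dataclasses import dataclass

# Signals that argue for moving up a rung.
UP_SIGNALS = (
    "repeats", "copy_paste_is_bottleneck", "same_sources_every_time",
    "output_saved_tested_shipped_or_audited", "has_independent_parts",
    "quality_review_matters", "coordinating_more_than_deciding",
)
# Signals that argue for staying put or climbing back down.
STAY_SIGNALS = (
    "one_off", "low_stakes", "cannot_describe_artifact",
    "cannot_review_output", "wants_sophistication",
)


@dataclass
class Task:
    repeats: bool = False
    copy_paste_is_bottleneck: bool = False
    same_sources_every_time: bool = False
    output_saved_tested_shipped_or_audited: bool = False
    has_independent_parts: bool = False
    quality_review_matters: bool = False
    coordinating_more_than_deciding: bool = False
    one_off: bool = False
    low_stakes: bool = False
    cannot_describe_artifact: bool = False
    cannot_review_output: bool = False
    wants_sophistication: bool = False  # "the tell"


def recommend(task, several=3):
    """Return 'move up' or 'stay or climb down' for a task.

    `several` is an assumed threshold; the article only says "several".
    """
    up = sum(getattr(task, name) for name in UP_SIGNALS)
    stay = sum(getattr(task, name) for name in STAY_SIGNALS)
    # Assumption: any stay-put signal outweighs the reasons to climb.
    if stay == 0 and up >= several:
        return "move up"
    return "stay or climb down"


weekly_report = Task(repeats=True, copy_paste_is_bottleneck=True,
                     same_sources_every_time=True)
print(recommend(weekly_report))
```

A checklist like this will not make the decision for you, but it does force the question of which signals are actually present.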
05 · Closing
The best teams know which rung a problem deserves.
The temptation with AI is to keep reaching for the most advanced version of the idea: more agents, more autonomy, more automation. The best teams are learning to manage the tradeoffs. They start with chat when they need a thought partner, give the AI hands when it needs to touch the work, add skills and connectors when context and standards start repeating, delegate when the work splits into real specialties, and design a managed workflow only when the process is important and recurring enough to deserve the gates.
The best teams won’t be the ones who always sit atop the ladder, they’ll be the ones who know which rung a problem deserves, and who aren’t too proud to stand on a low one.
06 · Up next
Designing managed workflows that work.
A full article on this is in progress and will follow shortly — here are the basics for now.
Designing a Level 4 workflow, and deciding whether an agent team is the right shape for it, is a big topic in its own right. There are at least three ways to organize the work, and every one of them still requires a decision about where the human sits in the loop:
- Single-agent workflow: one agent owns it end to end. Good when the process is simple, narrow, and low-risk.
- Orchestrator-agent workflow: one lead agent owns the plan and delegates to specialists as temporary helpers. Usually the best first workflow because it’s easy to understand, debug, and govern.
- Agent-team workflow: multiple agents hold standing roles with durable ownership, handoffs, and review relationships.
Reach for agent teams only when the outcome recurs, maps naturally to standing roles that need different context, skills, and tools, and is important enough to justify the design, governance, and cost. Short of that, an orchestrator is usually the better call. More on all of this in the follow-up.
Appendix
One example, all the way up the ladder.
To make the rungs concrete, here is a single deliverable — a weekly client status report — at every level.
- Level 0 — Chat. You paste in your notes and ask the AI to shape them into a clean update. You’re still the one gathering the facts.
- Level 1 — Workspace. The AI edits the status document directly instead of handing you text to copy back.
- Level 2 — Connected agent. It pulls the week’s activity from your project tracker, your repo, and last week’s report, then drafts the update in your house format and tone — no manual gathering.
- Level 3 — Delegated work. A progress sub-agent summarizes what shipped, a risk sub-agent surfaces blockers, a scope sub-agent flags changes against the plan, and an editor sub-agent tightens it for an executive reader.
- Level 4 — Managed workflow. Every Friday it runs as a process: collect activity, summarize progress, identify risks, compare to plan, draft the report, flag anything sensitive for a human, and hold at a gate for approval before it goes out.
Most teams don’t need Level 4 here, and almost none need an agent team for it. A skilled Level 2 agent handles a weekly status for most accounts beautifully. You climb to Level 4 only when you’re running this across many clients, every week, and consistency and auditability start to matter more than flexibility.
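For readers who think in code, the Level 4 version of the status report might be sketched roughly like this. It is an illustration under heavy assumptions: the step logic is stubbed with plain string handling instead of model calls, the `shipped:` and `blocked:` prefixes and the `SENSITIVE_WORDS` list are invented for the example, the compare-to-plan step is left out for brevity, and `approve` is a stand-in for the human gate.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical keywords that should send a report to a human before it goes out.
SENSITIVE_WORDS = ("budget", "contract")


@dataclass
class Report:
    client: str
    sections: dict[str, str] = field(default_factory=dict)
    flags: list[str] = field(default_factory=list)
    approved: bool = False
    sent: bool = False


def run_weekly_status(client: str, activity: list[str],
                      approve: Callable[[Report], bool]) -> Report:
    """Run the stubbed pipeline: collect, summarize, flag, then hold at a gate."""
    report = Report(client=client)

    # Summarize progress and identify risks (stubs for what agents would do).
    shipped = [item for item in activity if item.startswith("shipped:")]
    blocked = [item for item in activity if item.startswith("blocked:")]
    report.sections["progress"] = "\n".join(shipped) or "Nothing shipped."
    report.sections["risks"] = "\n".join(blocked) or "No blockers."

    # Flag anything sensitive for a human.
    report.flags = [item for item in activity
                    if any(word in item.lower() for word in SENSITIVE_WORDS)]

    # The gate: nothing goes out unless the approver says so.
    report.approved = approve(report)
    if report.approved:
        report.sent = True  # stand-in for actually delivering the report
    return report


# Example approver that only passes reports with nothing flagged.
result = run_weekly_status(
    "example-client",
    ["shipped: login page", "blocked: budget sign-off"],
    approve=lambda report: not report.flags,
)
print(result.sent, result.flags)
```

The point of the sketch is the shape, not the stubs: the artifacts are explicit, the flags are recorded, and the send step cannot run without passing the gate.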
Founder of r90
Former CTO of Vanco
Writes about the method underneath modern software companies and engineering organizations.