Why are AI companies pushing agentic workflows?

Because agents burn dramatically more tokens per task than simple chat interactions. A single agentic loop session — planning, executing, self-correcting, verifying — can consume 10x to 100x the tokens of a one-turn chat response. Since AI companies charge per token consumed (directly via API pricing or indirectly via subscription usage limits), higher token consumption per task means higher revenue per user. The push toward agents, harnesses, loop engineering, and computer use is genuinely useful for users — and also happens to align perfectly with the AI companies' business model incentives.

How many more tokens do agents use than chat?

It varies enormously by task, but rough figures look like this: a simple one-turn chat question uses ~500-2,000 tokens. A multi-step Claude Code session on a non-trivial task uses ~10,000-100,000 tokens. A full Codex computer-use workflow (write code, push to GitHub, navigate Vercel, configure and deploy) can use hundreds of thousands of tokens as the model observes browser state, generates actions, and self-corrects through each step. Loop engineering patterns that run autonomous coding sessions over hours use millions of tokens. The gap between "I asked the AI a question" and "the AI ran my computer for an hour" is 3 to 4 orders of magnitude in token consumption.

Is this bad for users?

Not inherently. Agents complete tasks that simple chat cannot. The value delivered per token is genuinely higher for agentic workflows — you get a deployed landing page, not just code. The risk is that the incentive alignment makes AI companies less likely to optimize for token efficiency, and more likely to encourage agentic patterns even when a simpler interaction would serve the user just as well. Users should calibrate: use agents for tasks that warrant them, and be aware that "try the agent first" advice from companies whose revenue depends on token consumption is not neutral advice.

What is loop engineering?

Loop engineering is the practice of designing AI agent workflows as autonomous feedback loops, where the agent plans, executes, evaluates its own output, self-corrects, and continues without human intervention at each step. Claude Code's /loop and /goal commands enable this. The pattern can run for hours. It is genuinely powerful for complex long-horizon tasks. It is also the most token-intensive usage pattern available to consumers today.
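To make the shape of such a loop concrete, here is a minimal Python sketch. It is not how Claude Code or any other product implements its loop; `run_loop`, `fake_step`, `fake_evaluate`, the 4,000-token cost per attempt, and the default limits are all hypothetical stand-ins chosen for illustration. The point it shows is that every pass through the loop spends tokens, so a loop with no iteration cap or token budget has no natural ceiling on consumption.

```python
from dataclasses import dataclass


@dataclass
class LoopResult:
    iterations: int
    tokens_used: int
    succeeded: bool


def run_loop(step, evaluate, max_iterations=10, token_budget=50_000):
    """Execute, evaluate, and self-correct until success or a limit is hit.

    `step(feedback)` returns (output, tokens_spent) and `evaluate(output)`
    returns (ok, feedback). Both are hypothetical callables standing in
    for model calls; no real API is used here.
    """
    tokens_used = 0
    feedback = None
    for iteration in range(1, max_iterations + 1):
        output, spent = step(feedback)      # plan + execute
        tokens_used += spent
        ok, feedback = evaluate(output)     # self-evaluation
        if ok:
            return LoopResult(iteration, tokens_used, True)
        if tokens_used >= token_budget:     # stop before costs run away
            return LoopResult(iteration, tokens_used, False)
    return LoopResult(max_iterations, tokens_used, False)


def fake_step(feedback):
    # Assumed cost: 4,000 tokens per attempt. Each retry improves on the last.
    attempt = 0 if feedback is None else feedback + 1
    return attempt, 4_000


def fake_evaluate(output):
    # The stub "passes" on the third attempt (attempt index 2).
    return output >= 2, output


result = run_loop(fake_step, fake_evaluate)
print(result)
```

With these stubs the loop succeeds on the third pass after spending 12,000 tokens. Tightening `token_budget` to 8,000 stops it after two passes without success, which is the trade-off a budget introduces.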

Why AI Companies Push Agents: The Token Economics | explainx.ai Blog

When OpenAI launched Codex, they did not just ship an AI coding tool. They shipped an agent harness that can navigate web browsers, operate desktop apps, connect to GitHub, deploy to Vercel, and schedule its own recurring tasks.

When Anthropic shipped Claude Code, they did not just give you a terminal assistant. They gave you /loop, /goal, multi-day autonomous sessions, and a pattern they call loop engineering — where Claude plans a task, executes it, evaluates the output, self-corrects, and runs again, without you in the loop.

When every AI company builds agent frameworks, harness tooling, and computer-use capabilities, you might ask: why are they all going the same direction so fast?

The answer has two parts. The first part is genuinely true: agents complete tasks that simple chat cannot. Alex Finn launching a live landing page in 5 minutes is real. That is value a chat interface could not deliver.

The second part is also genuinely true, and gets discussed much less: agents burn tokens at a rate that transforms the business economics of AI companies.

Let's look at both parts honestly.

How AI Companies Actually Make Money

At the core, every major AI company has the same revenue model: they charge for compute, measured in tokens.

The delivery mechanism varies:

Direct API pricing: $X per million input tokens, $Y per million output tokens
Subscription tiers: Higher plans = more tokens per month before limits kick in
Usage overages: Pay per token above your plan limit
Enterprise contracts: Bulk token commitments

The abstraction layer (subscriptions, tiers, "unlimited" plans) obscures the underlying unit economics. But the unit is always tokens. More token consumption per user = more revenue per user.
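As a rough illustration of that unit economics, the sketch below prices a single request under direct API pricing. The prices and token counts are placeholders invented for the example, not any vendor's actual rates; only the shape of the formula (tokens times price per million) comes from the pricing model described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-million-token prices in USD (not real vendor rates)."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under per-token API pricing."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Placeholder prices chosen only to keep the arithmetic easy to follow.
EXAMPLE_PRICING = Pricing(input_per_million=3.0, output_per_million=15.0)

# A short chat exchange versus a long agent session (both token counts assumed).
chat_cost = request_cost(EXAMPLE_PRICING, input_tokens=1_000, output_tokens=500)
agent_cost = request_cost(EXAMPLE_PRICING, input_tokens=400_000, output_tokens=50_000)

print(f"chat:  ${chat_cost:.4f}")   # chat:  $0.0105
print(f"agent: ${agent_cost:.4f}")  # agent: $1.9500
```

Swapping in the rates from a real pricing page turns this into a quick estimate of what a session would cost if it were billed through the API rather than absorbed by a subscription.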

This is not a secret. It is in every earnings call, every pricing page, every infrastructure investment announcement. The question is what it means for the products these companies build.

The Token Gap Between Chat and Agents

Here is the gap that makes agent products so attractive to build:

Interaction type	Approximate token consumption
Simple chat question	500 – 2,000 tokens
Complex research question	2,000 – 10,000 tokens
Claude Code: fix a bug in one file	5,000 – 20,000 tokens
Claude Code: multi-file refactor	20,000 – 100,000 tokens
Claude Code: `/loop` session, 30 minutes	100,000 – 500,000 tokens
Codex: full computer-use workflow (code → GitHub → Vercel → live)	200,000 – 1,000,000 tokens
Loop engineering: overnight autonomous coding session	2,000,000 – 10,000,000 tokens

The range is enormous. Going by the table, one overnight loop engineering session consumes about as many tokens as 1,000 to 20,000 simple chat questions; call it roughly 5,000 people each asking one question. From a revenue perspective, that is an extraordinary concentration of value in high-engagement agentic users.
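To make the comparison concrete, here is a small sketch that turns the table's ranges into multipliers relative to a simple chat question. The token ranges are copied from the table; the dictionary keys and the `multiplier_range` helper are names made up for this example.

```python
# Token ranges (low, high) copied from the table above.
TOKEN_RANGES = {
    "simple_chat": (500, 2_000),
    "research_question": (2_000, 10_000),
    "single_file_bug_fix": (5_000, 20_000),
    "multi_file_refactor": (20_000, 100_000),
    "loop_30_minutes": (100_000, 500_000),
    "computer_use_workflow": (200_000, 1_000_000),
    "overnight_loop": (2_000_000, 10_000_000),
}


def multiplier_range(task: str, baseline: str = "simple_chat") -> tuple[float, float]:
    """Return the (smallest, largest) token ratio of `task` to `baseline`."""
    task_low, task_high = TOKEN_RANGES[task]
    base_low, base_high = TOKEN_RANGES[baseline]
    # Smallest ratio: cheapest task run against the most expensive baseline run.
    # Largest ratio: most expensive task run against the cheapest baseline run.
    return task_low / base_high, task_high / base_low


for name in TOKEN_RANGES:
    low, high = multiplier_range(name)
    print(f"{name:>22}: {low:>8,.1f}x to {high:>9,.1f}x")
```

The spread within each row is wide, so these multipliers are best read as orders of magnitude rather than precise figures.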

This is why every company is building toward agents.

The Products Pushing You Toward More Tokens

Claude Code + Loop Engineering

Anthropic built /loop, /goal, and multi-day autonomous sessions specifically for the highest-token patterns possible. Loop engineering — the practice of designing agent workflows as self-correcting feedback loops — is, from a pure token-consumption perspective, the most expensive interaction pattern available.

The ExplainX blog on loop engineering for coding agents describes sessions that run for hours, burning through enormous context windows as Claude plans, executes, evaluates, and retries. It is genuinely useful. It also happens to be the thing that transforms a casual $20/month subscriber into a heavy API user with meaningful overage billing.

Codex + Computer Use

OpenAI's Codex wraps every computer-use interaction in observation loops: the model sees the screen, generates an action, observes the result, generates the next action. Each observation-action pair costs tokens. A workflow that navigates five web apps burns those tokens for every page render, every state observation, every decision.
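A simplified model shows why these loops get expensive. The sketch below assumes the harness resends the full history (every earlier observation and action) on each step, with no caching, truncation, or summarization. That is an assumption made for illustration, not a description of how Codex actually manages context, and the token sizes are invented.

```python
def computer_use_tokens(
    steps: int,
    observation_tokens: int,
    action_tokens: int,
    system_tokens: int = 0,
) -> int:
    """Estimate total tokens for an observe-act loop that resends full history.

    Assumption (hypothetical harness): every step sends the system prompt plus
    all prior observations and actions as input, then generates one action.
    """
    history = system_tokens
    total_input = 0
    total_output = 0
    for _ in range(steps):
        history += observation_tokens  # new screen state joins the history
        total_input += history         # the whole history is sent as input
        total_output += action_tokens  # the model emits one action
        history += action_tokens       # the action joins the history too
    return total_input + total_output


short_run = computer_use_tokens(steps=20, observation_tokens=1_000, action_tokens=100)
long_run = computer_use_tokens(steps=40, observation_tokens=1_000, action_tokens=100)
print(short_run)  # 231000
print(long_run)   # 902000
```

Under this assumption, doubling the number of steps nearly quadruples the total, because input tokens grow with the square of the step count. Real harnesses may trim or cache context, which would flatten that curve.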

Alex Finn's five-minute landing page launch was described as effortless. From a token consumption standpoint, "effortlessly" navigating GitHub, Vercel, and a domain registry through a vision model observing browser screenshots is extremely expensive. The pricing plan absorbed it. If it happened via API, the bill would be substantial.

Open Source Harnesses

The open-source ecosystem (LangChain, AutoGPT, CrewAI, various "open claw" and agent harness frameworks) replicates and amplifies the same patterns. Multi-agent loops where agents spawn sub-agents, evaluate each other's work, and iterate — all against paid API endpoints — can consume tokens at a rate that makes a single Claude Code session look modest.

These frameworks are not built by AI companies (mostly). But they run on AI company APIs. Every multi-agent harness session flows through API billing.

Why This Incentive Alignment Matters

The companies building these products are not being cynical. Agents genuinely deliver more value than chat for complex tasks. The incentive alignment — "more value = more tokens = more revenue" — is cleaner than it looks on the surface.

But it has a few downstream effects worth understanding:

1. Token efficiency is not a priority. When burning fewer tokens would mean worse economics, the pressure to optimize prompt efficiency is weak. You will notice that agent frameworks tend to be verbose — large system prompts, long observation strings, extensive context inclusion. Some of this is necessary. Some of it is the absence of incentive to trim it.

2. "Try the agent first" is not neutral advice. When the company building the agent also earns more when you use the agent, recommendations to "use Codex for every task you do on your computer" or "run this in a /loop" are not disinterested. The advice may be correct. But its source has aligned financial interests.

3. The subscription pricing blunts awareness. A chat message and a 6-hour loop session are both "using Claude Code." The subscription price is the same. The actual compute consumed can differ by 1,000x or more. Flat-rate pricing makes heavy agentic usage feel free until you hit limits, at which point you upgrade to a higher plan or pay overages. This is the intended ratchet.

4. Scheduling and automation are the ceiling. Alex Finn's step 6 — "for repeating tasks, schedule them in an automation" — is where the economics become most interesting. Scheduled recurring agent jobs that run daily or hourly, entirely unattended, are the equivalent of a server process that burns tokens continuously. This is not hypothetical. It is already being built by power users, and it represents the most predictable recurring revenue a token-pricing model can generate.
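The arithmetic behind point 4 is simple enough to sketch. The per-run token count and the schedules below are assumptions picked for illustration; the point is only that an unattended job multiplies its per-run cost by every run in the month.

```python
def monthly_tokens(tokens_per_run: int, runs_per_day: int, days: int = 30) -> int:
    """Total tokens consumed by a scheduled agent job over `days` days."""
    return tokens_per_run * runs_per_day * days


# Hypothetical job that uses 50,000 tokens each time it runs.
daily_job = monthly_tokens(tokens_per_run=50_000, runs_per_day=1)
hourly_job = monthly_tokens(tokens_per_run=50_000, runs_per_day=24)

print(f"daily:  {daily_job:,} tokens/month")   # daily:  1,500,000 tokens/month
print(f"hourly: {hourly_job:,} tokens/month")  # hourly: 36,000,000 tokens/month
```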

Is Agentic Work Actually Worth the Token Cost?

Yes, often. The math on "Alex Finn launched a landing page in 5 minutes" is favorable if his time is worth more than the API cost. The math on "loop engineering ran overnight and completed a migration that would have taken three days" is almost always favorable.
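The "time is worth more than the API cost" test can be written as a one-line comparison. This is a minimal sketch; the function name, hourly rate, minutes saved, and agent cost are all hypothetical inputs, and it ignores anything beyond direct time savings.

```python
def agent_pays_off(minutes_saved: float, hourly_rate_usd: float, agent_cost_usd: float) -> bool:
    """Return True when the value of the time saved exceeds the agent's token cost."""
    value_of_time = (minutes_saved / 60) * hourly_rate_usd
    return value_of_time > agent_cost_usd


# Hypothetical: an agent run costing $2 in tokens, for someone whose hour is worth $60.
print(agent_pays_off(minutes_saved=55, hourly_rate_usd=60.0, agent_cost_usd=2.0))  # True
print(agent_pays_off(minutes_saved=1, hourly_rate_usd=60.0, agent_cost_usd=2.0))   # False
```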

The question is not whether agents are worth using. It is whether every task you do on your computer is worth routing through an agent harness first — which is what the most aggressive advice suggests.

The answer for most tasks: probably not. Agents consume 10x-100x more tokens per task than chat, and cost correspondingly more. For tasks where that cost is justified (complex multi-step workflows, repeated tasks, computer-use automations), the ROI is clear. For tasks where a chat response would do (quick questions, simple lookups, single-file edits), routing through an agent harness is burning money on overhead.

The heuristic: Use agents for tasks with multiple steps across multiple contexts. Use chat for tasks with a single clear output. Do not let "agents are more powerful" become "agents for everything" — that is expensive in both money and latency, and the company whose pricing you're on benefits whether or not the agent was the right tool.

What Loop Engineering Actually Is (And When It Makes Sense)

Loop engineering is the design practice of building agent workflows as self-correcting feedback loops:

Plan: The agent generates a plan for a task
Execute: The agent runs the plan step by step
Evaluate: The agent checks its own output (tests pass? page renders? API returns 200?)
Correct: If evaluation fails, the agent generates a fix and re-executes
Repeat: Until the evaluation passes or a human intervention threshold is reached
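The five steps above can be sketched as a small control loop. This is a minimal illustration, not Claude Code's implementation: `run_loop`, `LoopResult`, and the four callables are hypothetical names, and a real harness would call a model inside `plan`, `execute`, and `correct`. The `max_iterations` cap stands in for the human intervention threshold.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class LoopResult:
    succeeded: bool
    iterations: int
    output: Any


def run_loop(
    plan: Callable[[], Any],
    execute: Callable[[Any], Any],
    evaluate: Callable[[Any], Tuple[bool, str]],
    correct: Callable[[Any, str], Any],
    max_iterations: int = 5,
) -> LoopResult:
    """Plan once, then execute, evaluate, and correct until success or the cap."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    current_plan = plan()
    output = None
    for iteration in range(1, max_iterations + 1):
        output = execute(current_plan)
        passed, feedback = evaluate(output)
        if passed:
            return LoopResult(True, iteration, output)
        if iteration == max_iterations:
            break  # cap reached: hand back to a human instead of correcting again
        current_plan = correct(current_plan, feedback)
    return LoopResult(False, max_iterations, output)


# Toy stand-ins: the "plan" is a number, and evaluation passes once it reaches 3.
result = run_loop(
    plan=lambda: 0,
    execute=lambda p: p,
    evaluate=lambda out: (out >= 3, "too small"),
    correct=lambda p, feedback: p + 1,
    max_iterations=10,
)
print(result)  # LoopResult(succeeded=True, iterations=4, output=3)
```

Every pass through the loop is another round of model calls, so the iteration cap doubles as a spending cap.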

This is genuinely powerful for tasks like:

Automated test-driven development where the loop runs until all tests pass
Migration scripts where the loop runs until the old and new data match
Multi-file refactors where the loop runs until type-checking is clean
Deployment pipelines where the loop runs until the service is healthy

It is overkill for:

Writing a function when you know what the function should do
Answering a question about your codebase
Editing a config file
Single-step tasks with clear, verifiable completion criteria

The AI companies love loop engineering because it is high-value and high-token. Use it when it earns its cost. Measure that cost.

A Practical Guide to Token-Aware AI Use

1. Size the task before picking the tool. Simple question → chat. Multi-step workflow → agent. Recurring automation → scheduled agent with careful prompt optimization to minimize per-run token cost.

2. Watch your usage dashboard. Every major AI platform shows token consumption. Know what your heavy-usage sessions cost, not just whether you're on a subscription.

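3. Optimize prompts for agent loops. In a loop that runs 50 iterations, the prompt is paid for 50 times, so a 10% prompt reduction saves 50 times as many tokens as it would in a single call (it is still 10% of the session's prompt tokens). This matters for overnight runs.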

4. Treat "use the agent for everything" advice with appropriate skepticism. It may be correct for your workflow. It is also what the company giving the advice earns more from.

5. Verify agentic outputs. Agents burn tokens producing work you should check. A loop that self-corrects 20 times on a wrong approach burns roughly 20x the tokens of a loop that gets it right in one pass, and more if each retry carries the growing history along. Providing good initial context is as valuable as any loop design.
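Points 3 and 5 both come down to multiplication, which a short sketch makes explicit. The prompt sizes, iteration counts, and per-pass token figure below are assumptions chosen for illustration, and the model treats every pass as costing the same number of tokens, which understates cases where context grows between retries.

```python
def session_prompt_tokens(prompt_tokens: int, iterations: int) -> int:
    """Prompt tokens paid over a session when the prompt is resent every iteration."""
    return prompt_tokens * iterations


# Point 3: trimming a hypothetical 4,000-token prompt by 10% in a 50-iteration loop.
baseline = session_prompt_tokens(prompt_tokens=4_000, iterations=50)
trimmed = session_prompt_tokens(prompt_tokens=3_600, iterations=50)
saved = baseline - trimmed

print(f"saved {saved:,} tokens ({saved / baseline:.0%} of prompt tokens)")
# saved 20,000 tokens (10% of prompt tokens)

# Point 5: 20 corrective passes versus one, at a fixed (assumed) cost per pass.
TOKENS_PER_PASS = 30_000
one_pass = TOKENS_PER_PASS * 1
twenty_passes = TOKENS_PER_PASS * 20
print(f"{twenty_passes // one_pass}x the tokens")  # 20x the tokens
```

The relative saving from a shorter prompt stays at 10%, but the absolute saving is 50 times what the same trim would save in a single call.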

The Bigger Picture

AI companies are building toward a world where software tasks — all of them — route through AI agents. That world is probably coming. The productivity gains from having an agent that can operate your entire computer on a prompt are real.

But in that world, every task you delegate becomes a billable event. The economics of knowledge work shift from "how long did this take me?" to "how many tokens did this consume?" That is a new cost structure, and it has a new set of beneficiaries.

The companies building the agents are also the companies collecting the token fees. That is not a reason to avoid agents. It is a reason to use them deliberately — knowing what they cost, when they earn their cost, and when a simpler interaction would do the job just as well.
