What Is Loop Engineering? From Writing Prompts to Designing Agent Loops

-- Autonomous agent work is only as good as its checks and stopping conditions

For / Key Points

For: Developers and technical leads using Claude Code or Codex who need to move beyond one-off prompting into autonomous, scheduled, or event-driven agent workflows.

Key Points:

Loop engineering means designing the system that prompts, checks, remembers, and re-runs AI agents instead of typing every next prompt yourself.
In June 2026, Peter Steinberger's X post sparked the discussion, and Addy Osmani framed the discipline as five building blocks plus memory.
The hard part is not autonomy itself. It is verification, stopping conditions, and human-in-the-loop escalation.

On June 8, 2026, a two-sentence X post from Peter Steinberger spread across the AI coding community. The message was simple: developers should stop merely prompting coding agents and start designing the loops that prompt those agents instead¹.

This article answers one question. What does loop engineering actually design, and why do verification and stopping conditions matter more than the loop itself?

The Spark Was a Two-Sentence Post

The trigger was short, but the reaction was large. ExplainX reported that Steinberger's post reached 6.5 million views and dominated discussion around AI coding agents². In the same context, Boris Cherny, who leads Claude Code, said he no longer prompts Claude directly. Instead, he writes loops that prompt Claude and decide what to do next³.

Addy Osmani then organized this discussion under the name "Loop Engineering"⁴. His framing is direct: replace yourself as the person who prompts the agent, and design the system that does it instead. Japanese explainers and field reports appeared quickly after that, which helped the term settle into local AI development discourse⁵⁶.

The point is not that a new buzzword appeared. The real shift is that the human's repeated "what next?" judgment is moving into a designed control system.

Definition: Replace the Human Prompter With a System

Loop engineering is a role shift. In the old pattern, a human gave an instruction, read the agent's answer, then typed the next instruction. The human was inside the loop. Loop engineering places a small system in that seat.

Osmani describes a loop as a recursive goal⁴. Once the purpose is defined, the AI iterates toward completion. It finds work, assigns it, checks the result, records what happened, and decides the next step. The system keeps poking the agent instead of relying on a human to do it every turn.

One of those field reports, from MAKE A CHANGE, breaks the design into six elements⁶.

Element	Design Question
Trigger	What starts the loop: a schedule, an event, or something else?
Context	What information does the agent receive?
Action	What is the agent allowed to do?
Verification	How is success checked?
Memory	Where are results and lessons recorded?
Escalation	When does the loop return control to a human?

"Do something useful" is not a loop. A real loop includes success criteria and stopping conditions, because that is what makes repeated agent work governable.
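One way to make the six questions concrete is to treat them as required fields. The sketch below is illustrative only: the `LoopSpec` class, its field names, and the `is_governable` rule are assumptions made for this example, not part of any product or of the cited framing.

```python
from dataclasses import dataclass


@dataclass
class LoopSpec:
    """One field per design question in the table above (names are hypothetical)."""
    trigger: str                 # e.g. "cron: 0 7 * * *" or an event name
    context: list[str]           # what the agent is shown
    allowed_actions: list[str]   # what the agent may do
    verification: list[str]      # checks that can fail the run
    memory_path: str             # where results are recorded between runs
    escalation: str              # when control returns to a human

    def missing_elements(self) -> list[str]:
        """Return the names of elements that were left empty."""
        return [name for name, value in vars(self).items() if not value]

    def is_governable(self) -> bool:
        """Treat a spec as a real loop only when every element is filled in."""
        return not self.missing_elements()


vague = LoopSpec(
    trigger="cron: 0 7 * * *",
    context=[],
    allowed_actions=["do something useful"],
    verification=[],
    memory_path="",
    escalation="",
)
print(vague.missing_elements())
```

A spec like `vague` has a trigger and an action but nothing that can reject a result or hand control back, which is the gap the paragraph above describes.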

The Lineage: Prompt, Context, Harness, Loop

Loop engineering did not appear from nowhere. Over the last few years, the design target has moved from a single prompt to the execution environment around the model.

Japanese coverage summarizes the progression this way⁵.

Layer	Period	Design Target
Prompt engineering	Until around 2024	The quality of one exchange
Context engineering	2025	The full token environment the model sees
Harness engineering	Early 2026	The execution environment around one agent
Loop engineering	June 2026 onward	The system that repeatedly drives the harness

Osmani also places loop engineering one floor above harness engineering⁴. If a harness is the control structure around one acting agent, a loop adds timing, helper agents, memory, and repeated self-feeding work. The layers do not replace each other. They stack.

The question therefore moves away from better wording. It becomes: when does the system start, what can it see, what can it change, and what proves it should stop?

It Is Not Just Scheduled Execution

Loop engineering includes scheduled execution. Running Claude Code or Codex every five minutes, every hour, or every morning is part of the pattern⁶. But a schedule alone is not enough.

Triggers can be event-driven. A Slack message, Gmail arrival, GitHub pull request update, or Stripe payment can all start the loop⁶. Claude Code documents /loop, cron-style scheduled tasks, cloud routines, and desktop scheduled tasks as scheduling options⁷. Codex provides Automations that let users choose the project, prompt, cadence, and execution environment for a recurring task⁸.
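For hand-rolled loops, event-driven triggering can be as small as a lookup from event type to the loop it should start. This is a minimal sketch: `TriggerRouter` and the event names are hypothetical, and real webhooks from Slack, GitHub, or Stripe carry their own payload formats that are not modeled here.

```python
from typing import Callable, Optional

# A handler receives the event payload and returns the name of the loop run it started.
Handler = Callable[[dict], str]


class TriggerRouter:
    """Maps event types to the loop each one should start (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        """Register the loop entry point for one event type."""
        self._handlers[event_type] = handler

    def dispatch(self, event_type: str, payload: dict) -> Optional[str]:
        """Start the matching loop, or do nothing for unregistered events."""
        handler = self._handlers.get(event_type)
        if handler is None:
            return None
        return handler(payload)


router = TriggerRouter()
router.on("pull_request.updated", lambda p: f"review-pr:{p['number']}")
print(router.dispatch("pull_request.updated", {"number": 7}))
```

Returning `None` for unknown events is a deliberate default here: an event nobody planned for should start nothing.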

The decisive difference is the stopping condition. Claude Code's /goal keeps a session working until a completion condition is met, with a small fast model evaluating that condition after each turn⁹. Codex offers a /goal primitive of the same name for long-running work toward a verifiable stopping condition¹⁰.

This is where trust is won or lost. A loop needs something that can say no: a test, type check, real error, review queue, or budget limit². Without that pressure, the loop can become an agent agreeing with itself at high speed.
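The idea of "something that can say no" can be sketched as a driver with three exits: the check passes, the budget runs out, or the turn limit is reached. This is not how /goal is implemented in either product; `run_until`, `LoopResult`, and the cost accounting are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    status: str   # "done", "budget_exhausted", or "max_turns"
    turns: int
    spent: float


def run_until(
    step: Callable[[int], float],   # runs one agent turn and returns its cost
    check: Callable[[], bool],      # external check, e.g. tests or a type check
    max_turns: int,
    budget: float,
) -> LoopResult:
    """Repeat agent turns until the check passes or a hard limit is hit."""
    spent = 0.0
    for turn in range(1, max_turns + 1):
        spent += step(turn)
        if check():
            return LoopResult("done", turn, spent)
        if spent >= budget:
            return LoopResult("budget_exhausted", turn, spent)
    return LoopResult("max_turns", max_turns, spent)
```

The important design choice is that `check` lives outside the agent. If the agent both does the work and decides whether it is finished, nothing in the loop can disagree with it.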

Five Building Blocks, Plus Memory

Osmani organizes the loop into five building blocks plus memory⁴. The notable change is that these pieces are moving from hand-rolled scripts into product features across Codex and Claude Code.

Building Block	Role in the Loop	Codex	Claude Code
Automations	Scheduled discovery and triage	Automations, `/goal`	Scheduled tasks, `/loop`, `/goal`, hooks
Worktrees	Isolate parallel agent work	App worktrees	`git worktree`, `--worktree`, subagent `isolation: worktree`
Skills	Write down project knowledge	Agent Skills (`SKILL.md`)	Skills (`SKILL.md`)
Plugins / Connectors	Connect to external tools	MCP-based Connectors and Plugins	MCP servers and Plugins
Sub-agents	Separate makers from checkers	TOML definitions in `.codex/agents/`	`.claude/agents/`, agent teams

The sixth component is memory. It can be a Markdown file, a Linear board, or another external state store. What matters is that it survives outside one conversation and tracks what is done and what comes next. Models forget between runs, so durable loop state belongs on disk or in an external system⁴.
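As a minimal sketch of memory that survives between runs, the functions below keep a "done" list and a "next" queue in a JSON file. The file layout and the function names are assumptions for this example. A Markdown file or a ticket board would serve the same purpose.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read loop state from disk, or start empty on the first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"done": [], "next": []}


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file, then rename it over the real one.

    This reduces the chance that an interrupted run leaves a half-written file.
    """
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def mark_done(state: dict, item: str) -> dict:
    """Return a new state with the item moved from the queue to the done list."""
    return {
        "done": state["done"] + [item],
        "next": [i for i in state["next"] if i != item],
    }
```

Each run would call `load_state` first and `save_state` last, so the next run starts from what actually happened rather than from what the model remembers.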

Sub-agents are especially important. Codex supports custom agents defined as TOML files under .codex/agents/¹¹. Claude Code supports Markdown subagent definitions under .claude/agents/ and can isolate them in worktrees¹². Both provide a way to split the maker from the checker.

The model that wrote the code is often too generous when grading it. A second agent, with different instructions, permissions, or even a different model, can catch failures the first one rationalized away. That separation matters most when the loop runs unattended.
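The maker and checker split can be sketched as two callables with different jobs, where only the checker's verdict ends the round. The names `make_then_check` and `Review` are hypothetical, and the callables stand in for whatever agents, models, or test runners a real loop would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool
    reason: str


def make_then_check(
    task: str,
    maker: Callable[[str, Optional[str]], str],   # (task, feedback) -> draft
    checker: Callable[[str, str], Review],        # (task, draft) -> verdict
    max_rounds: int = 3,
) -> tuple[Optional[str], list[Review]]:
    """Let the maker revise until the checker approves or the rounds run out."""
    feedback: Optional[str] = None
    reviews: list[Review] = []
    for _ in range(max_rounds):
        draft = maker(task, feedback)
        review = checker(task, draft)
        reviews.append(review)
        if review.approved:
            return draft, reviews
        feedback = review.reason
    # Never approved: return no draft so the caller can escalate to a human.
    return None, reviews
```

Returning `None` instead of the last draft is intentional in this sketch. An unapproved draft should not flow downstream just because the loop ran out of rounds.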

What One Morning Loop Looks Like

When assembled, the pieces become a small work system. In Osmani's example, a morning automation reads repository state, CI failures, open issues, and recent commits⁴. It writes worthwhile findings to a state file or ticketing system, then creates a worktree for each item.

One sub-agent drafts a fix. Another sub-agent checks the draft against project skills and existing tests. Connectors open pull requests and update tickets. Anything uncertain stays in a triage inbox for a human.

The human did not type each next prompt. The human designed the loop once. At the heartbeat level, a morning triage trigger can look as simple as this:

0 7 * * * claude -p "$(cat triage-loop.md)" >> loop-state.log

That line is not the real system. The real system is the permission model, logs, checks, stopping conditions, and guardrails around it.
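A small part of that surrounding system can be sketched as a wrapper that gives each heartbeat a hard time limit and a log entry. The `run_once` helper, its status strings, and the log format are assumptions for this example. The command itself is passed in, so nothing here depends on a particular CLI.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def run_once(cmd: list[str], log_path: Path, timeout_s: float) -> str:
    """Run one heartbeat with a time limit and append the outcome to a log."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        status = "ok" if proc.returncode == 0 else f"exit {proc.returncode}"
    except subprocess.TimeoutExpired:
        status = "timeout"
    except FileNotFoundError:
        status = "not found"
    stamp = datetime.now(timezone.utc).isoformat()
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"{stamp} {status} {cmd[0]}\n")
    return status
```

Under these assumptions, the cron entry would call a script that passes the same command as the line above, for example `["claude", "-p", prompt_text]`, and treats any status other than `ok` as a reason to notify a human.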

129 Successful Deletions and 43 Runaway Commits

Japanese field reports already show both sides of the pattern. MAKE A CHANGE runs several loops in production-like internal workflows, including Slack-to-Notion task capture, Notion task execution, and AI information gathering⁶.

One success case was clear. A loop that watched remote repositories and deleted unnecessary branches removed 129 stale branches automatically⁶. The scope was narrow and the deletion criteria were relatively easy to verify.

The failure case was equally useful. A pull request babysitter loop created 43 commits in one day, but its scope expanded until it changed unrelated areas and drifted away from the original PR purpose. Nearly all of the output was rejected⁶.

The lesson is compact.

A poorly designed loop can mass-produce waste at high speed.
Token use and cost can grow quickly.
A loop without completion criteria, verification, and human-in-the-loop escalation is not production-ready.

The more capable the loop, the larger the blast radius. Before asking whether it can run automatically, decide where it must stop.
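A guard against the kind of drift in the babysitter case could look like the sketch below: stop when a change touches files outside the agreed directories, or when a daily commit cap is reached. The function names, the directory allowlist, and the cap are assumptions for illustration, not a description of how the reported loops were built.

```python
from pathlib import PurePosixPath


def out_of_scope(changed_files: list[str], allowed_dirs: list[str]) -> list[str]:
    """Return changed files that fall outside every allowed directory."""
    allowed = [PurePosixPath(d) for d in allowed_dirs]
    return [
        f for f in changed_files
        if not any(PurePosixPath(f).is_relative_to(d) for d in allowed)
    ]


def should_stop(
    changed_files: list[str],
    allowed_dirs: list[str],
    commits_today: int,
    max_commits: int,
) -> tuple[bool, str]:
    """Decide whether the loop must halt and hand control back to a human."""
    stray = out_of_scope(changed_files, allowed_dirs)
    if stray:
        return True, f"out-of-scope changes: {', '.join(stray)}"
    if commits_today >= max_commits:
        return True, f"commit cap reached ({commits_today}/{max_commits})"
    return False, "within scope"


print(should_stop(["src/billing/api.py", "docs/README.md"], ["src/billing"], 2, 10))
```

Both limits are crude, but they are cheap to evaluate before every commit and they fail closed.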

What Loops Do Not Replace

Osmani names three problems that become sharper as loops improve⁴.

First, verification still belongs to humans. An unattended loop can also make unattended mistakes. Even with a separate verifier agent, "done" is a claim, not a proof.

Second, comprehension can decay. The faster code you did not write ships, the wider the gap becomes between what exists and what you understand. A smooth loop accelerates this debt unless someone reads and owns the output.

Third, there is cognitive surrender. Once loops run, it becomes tempting to stop having an opinion and accept whatever they return. The same loop can accelerate someone who understands the work deeply, or help someone avoid understanding it at all.

Loop design is therefore not easier than prompt engineering. The leverage point has moved from prompt wording to the design of verifiable work.

Enterprise Adoption Questions

In personal projects, starting with a small loop is often enough. In enterprise environments, unattended actors that commit code, open pull requests, and update tickets become part of change management.

Four questions matter early.

Change management: Loop-generated changes need audit trails and approval flows, just like CI/CD pipeline actions.
Permission separation: Loops should not reuse human credentials. Use service accounts, least privilege, and centralized logs.
Documented agreement: Files such as SKILL.md and AGENTS.md become team operating agreements because the loop reads them every run.
Staged rollout: Start with read-only loops, then grant write access only after checks can say no.

A loop without verification is not just inefficient in an enterprise setting. It is a source of unreviewed changes, budget surprises, and audit risk.
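The staged-rollout question can be expressed as a small authorization rule: every action has a minimum stage, unknown actions are denied, and nothing beyond read-only is allowed until a check exists that can fail. The stage names, action names, and `authorize` function are hypothetical and would map onto service-account permissions in a real deployment.

```python
from enum import IntEnum


class Stage(IntEnum):
    READ_ONLY = 0   # observe and report only
    PROPOSE = 1     # may open pull requests, never merge
    WRITE = 2       # may commit within its scope


# Hypothetical action names and the minimum stage each one requires.
REQUIRED_STAGE = {
    "read_repo": Stage.READ_ONLY,
    "open_pull_request": Stage.PROPOSE,
    "push_commit": Stage.WRITE,
}


def authorize(action: str, stage: Stage, checks_can_fail: bool) -> bool:
    """Allow an action only if the loop's stage and its checks both permit it."""
    required = REQUIRED_STAGE.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    if required > Stage.READ_ONLY and not checks_can_fail:
        return False  # no write access until a check can say no
    return stage >= required
```

Keeping the table of required stages in version control gives reviewers one place to see what a loop is permitted to do.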

Summary: Place Responsibility Before the Loop

Loop engineering is a new design layer, as of June 2026, stacked on top of prompt, context, and harness engineering. The human role moves from typing instructions to designing the system that drives agents.

This is not only a productivity story. The responsibility to define verification, stopping conditions, and ownership gets heavier, even as Automations, Worktrees, Skills, Connectors, and Sub-agents become standard product features.

The remaining constraint is operational discipline. Building a loop is becoming easy. The difficult part is deciding, before it runs, what responsibility the human will continue to hold.

Claude Methodology Guide