
Codex Orchestration: How to Make the Parent Thread Decompose, Delegate, and Consolidate


Audience and Key Points

For: DevOps and AI engineers who want to run multiple streams of work in Codex App or CLI. This assumes basic experience using Codex in a single session.

Key Points:

  • Codex does not autonomously create worktrees from configuration alone; it decomposes, delegates, and consolidates only when the parent thread is explicitly instructed to do so.
  • Parallelization helps when exploration noise, large changes, or independent workstreams should be kept out of the main context.
  • [agents] settings and xhigh are not orchestration switches. They control limits, permissions, and reasoning intensity.

When you see Codex running multiple tasks and creating worktrees, it is tempting to assume there is an automatic orchestration setting somewhere. That assumption sends you in the wrong direction. The actual trigger is not a config flag, but the instruction you give to the parent thread at the start.

This article answers one question. What must you explicitly tell Codex so it acts as a parent orchestrator that decomposes the work, delegates independent parts to child threads or worktrees, and consolidates the results at the end?

The Core Rule

Orchestration starts from instruction, not from configuration.

Multiple threads and worktrees do not appear because a single config option was enabled. You need to tell the parent thread to decompose the task, move independent work into separate threads or worktrees, and consolidate the final result. OpenAI's Codex docs state that Codex only spawns subagents when the user explicitly asks it to do so [1].

Current Codex releases enable subagent workflows by default. Still, the actual trigger is explicit instruction [1]. Looking for a setting that turns on autonomous worktree creation is therefore the wrong abstraction. The practical move is to give the parent thread a delegation policy.

The causal flow is simple. A human gives the parent thread the goal and delegation policy. Codex decomposes the task. Independent work moves to separate background threads or worktrees. Child threads execute their scoped work. The parent consolidates the results and reports back.

The human does not need to manually assign "this goes to Worktree A" and "that goes to Worktree B." Let the parent Codex session handle decomposition and routing. The human should provide the objective, constraints, and prohibited actions.
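The flow above is a fan-out and fan-in pattern. The following is a minimal Python sketch of that shape only, not of Codex internals: `decompose`, `run_child`, and `orchestrate` are hypothetical names, and the child work is simulated with plain functions rather than real Codex threads or worktrees.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ChildResult:
    task: str
    summary: str


def decompose(goal: str) -> list[str]:
    # Hypothetical split; in practice the parent thread decides the breakdown.
    return [f"{goal}: {part}" for part in ("explore", "implement", "review")]


def run_child(task: str) -> ChildResult:
    # Stand-in for a child doing scoped work and returning only a summary.
    return ChildResult(task=task, summary=f"done: {task}")


def orchestrate(goal: str, max_children: int = 3) -> list[ChildResult]:
    tasks = decompose(goal)
    # Fan out the independent tasks, then wait for all of them (fan in).
    with ThreadPoolExecutor(max_workers=max_children) as pool:
        return list(pool.map(run_child, tasks))


results = orchestrate("add audit logging")
for result in results:
    print(result.summary)
```

The point of the sketch is that the caller supplies only the goal and a limit. Decomposition, routing, and collection of summaries all sit with the parent.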

Two Different Behaviors

There are two related but distinct behaviors: subagent workflows and Codex App background threads with worktrees.

They look similar on screen. If you mix them up, you may expect App-only worktree behavior from the CLI, or you may get a CLI subagent workflow when you expected isolated worktrees.

The first behavior is the subagent workflow. The parent session spawns child agents to research, implement, or review in parallel, then combines their results into one response. Codex ships with built-in default, worker, and explorer agents [1].

This works across the CLI and App surfaces. The triggering language is direct: "spawn agents," "delegate in parallel," or "use one agent per point." The purpose is to split viewpoints and keep the parent thread receiving summaries rather than raw intermediate output.

The second behavior is Codex App background threads with Worktree mode. This creates a separate persistent thread backed by a Git worktree. The official Codex App docs show an explicit request pattern: "Create a separate background thread in a worktree for this project..." [2].

The App treats threads as project-scoped units. You can ask Codex to find related threads, continue existing ones, pin them, or archive them [2]. When creating a new thread manually, you can also choose Local or Worktree, so manual selection and parent-thread delegation can coexist.

When Parallel Work Helps

For short, single-domain tasks, a single thread is often cheaper and more accurate.

Parallel agents and worktree delegation help when the task is long, spans multiple domains, and produces a lot of noisy exploration. The intuition that "a single thread preserves accuracy" is correct for small tasks. It starts to break down on long ones.

The reason is context degradation. The official subagent concept page explains that if the main conversation is flooded with exploration notes, test logs, stack traces, and command output, the session can become less reliable over time [3]. The page frames this as context pollution and context rot [3].

The failure mode is straightforward. The main context fills up with output unrelated to the original requirements. The initial constraints lose influence. The implementation drifts away from the original design. This is a common failure pattern in long single-thread sessions.

Parallelization is not only about speed. It is also about moving noisy exploration, testing, and log analysis out of the parent thread so the parent can stay focused on requirements, decisions, and final outputs [3]. In that sense, parallelization is a context-preservation technique.

It is not free. Each subagent performs its own model and tool work, so subagent workflows consume more tokens than comparable single-agent runs [1]. If extra activity becomes the goal, cost rises without improving accuracy.

Work type | Recommended mode | Reason
Small single-domain changes | Single thread | The work fits in context and delegation overhead rarely pays off
Codebase exploration and large reviews | Subagent workflow | Read-heavy noise can be isolated from the parent
Independent UI, API, and DB implementation | Worktree delegation | Separate edit scopes are easier to validate and merge

The practical threshold is one question. If you run this whole task in one thread, will logs and trial-and-error bury the original instructions? If yes, move the exploration or independent implementation elsewhere. If no, stay single-threaded.
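As a rough illustration, the table and the threshold question can be reduced to two yes/no inputs. This is a sketch under that simplifying assumption; `recommend_mode` and its parameter names are hypothetical and not part of Codex.

```python
def recommend_mode(noisy_exploration: bool, independent_edit_scopes: bool) -> str:
    """Map two yes/no answers to one of the modes from the table above."""
    if independent_edit_scopes:
        # Separate edit scopes (for example UI, API, DB) suit worktree delegation.
        return "worktree delegation"
    if noisy_exploration:
        # Read-heavy noise is better isolated in subagents.
        return "subagent workflow"
    # The work fits in context, so delegation overhead rarely pays off.
    return "single thread"


print(recommend_mode(noisy_exploration=False, independent_edit_scopes=False))
print(recommend_mode(noisy_exploration=True, independent_edit_scopes=False))
print(recommend_mode(noisy_exploration=True, independent_edit_scopes=True))
```

Real tasks are rarely this binary, so treat the function as a checklist rather than a rule engine.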

Parent Thread Prompt

The first message should define the parent role, delegation conditions, consolidation requirements, and prohibited actions.

You do not need a long custom prompt every time. The following template is enough to set the expectation that the parent thread owns decomposition and delegation.

Act as the parent orchestrator for this project.

First decompose the task. Where workstreams are independent, create separate
background threads in Codex-managed worktrees for this project. Do not ask me
to manually choose the worktrees.

Keep this parent thread focused on orchestration and final consolidation.
Child threads should handle exploration, implementation, tests, or review as
appropriate.

Limit to 3 child threads. Do not let multiple child threads edit the same files.
Wait for all child threads, then summarize changed files, commands run, test
results, risks, and recommended merge order.

Pin important child threads and archive dead ends. Do not push, open PRs,
install dependencies, access the network, or touch files outside this project
without approval.

The critical part is the explicit delegation sentence. "Where workstreams are independent, create separate background threads in Codex-managed worktrees for this project." follows the same pattern as the official background-thread example [2].

"Do this in a worktree" is weaker. It tends to mean one task in one worktree. The orchestration request is different: act as the parent, decompose the whole task, delegate only independent work, and consolidate the results.
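If a team wants to reuse the template with different limits, one option is to generate it from a small helper. The sketch below is an assumption about how that could look: `build_parent_prompt` and its parameters are hypothetical, and the output is plain text to paste into the parent thread, not a Codex API call.

```python
def build_parent_prompt(max_children: int, approval_required: list[str]) -> str:
    """Assemble a parent-orchestrator prompt from the template's key clauses."""
    if max_children < 1:
        raise ValueError("max_children must be at least 1")
    actions = ", ".join(approval_required)
    paragraphs = [
        "Act as the parent orchestrator for this project.",
        # The explicit delegation sentence stays verbatim in every variant.
        "First decompose the task. Where workstreams are independent, create "
        "separate background threads in Codex-managed worktrees for this project.",
        f"Limit to {max_children} child threads. Do not let multiple child "
        "threads edit the same files.",
        "Wait for all child threads, then summarize changed files, commands run, "
        "test results, risks, and recommended merge order.",
        f"Do not {actions} without approval.",
    ]
    return "\n\n".join(paragraphs)


prompt = build_parent_prompt(2, ["push", "open PRs", "install dependencies"])
print(prompt)
```

Keeping the delegation sentence fixed while parameterizing only the limits and approval list preserves the part of the prompt that triggers orchestration.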

What Configuration Does

Configuration is not a magic switch for autonomous worktree creation. It is a guardrail for limits, permissions, and reasoning effort.

If the parent may spawn child agents, define concurrency, nesting, approvals, and sandboxing up front. For a project-local .codex/config.toml, this is a practical starting point.

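# Model and reasoning effort for the parent session
model = "gpt-5.5"
model_reasoning_effort = "high"

# Keep risky operations behind approval and writes inside the workspace
approval_policy = "on-request"
sandbox_mode = "workspace-write"

[agents]
max_threads = 4  # cap on concurrently open agent threads (default is 6 when unset)
max_depth = 1    # direct children only, no deeper recursive spawning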

agents.max_threads caps the number of concurrently open agent threads, and the default is 6 when unset [1]. The example lowers it to 4 to avoid excessive fan-out. agents.max_depth caps recursive spawning depth; the default of 1 allows direct children while preventing deep recursion [4].

approval_policy = "on-request" keeps risky operations behind approval. sandbox_mode = "workspace-write" keeps work scoped to the workspace. These matter more when multiple child agents are active.

What xhigh Actually Means

Setting xhigh does not create child threads.

xhigh is the high end of model_reasoning_effort. The Codex Configuration Reference lists minimal | low | medium | high | xhigh as possible values for model_reasoning_effort [4]. That setting is separate from agent spawning.

xhigh is also model-dependent. As of May 31, 2026, the official OpenAI model list shows gpt-5.5, gpt-5.4, and gpt-5.4-mini supporting none, low, medium, high, and xhigh [5]. That does not mean every model or older CLI build accepts the same values.

When assigning a lighter model to a custom agent, consider cost and latency, not only support. For example, using medium with gpt-5.4-mini can be a deliberate choice for an exploration agent. The reason is cost and latency, not missing support: the current official model list shows gpt-5.4-mini supporting xhigh.

There is also a configuration caveat. A public GitHub issue in the Codex repository reports that Codex App automations can run at medium even when the global model_reasoning_effort is set to xhigh [6]. Treat this as an open bug report as of May 31, 2026, not as a finalized official specification.

Fixing Child Agent Roles

If you want repeatable behavior, define custom agents under .codex/agents/.

Each TOML file defines one agent. The official docs say name, description, and developer_instructions are required. Optional fields such as model, model_reasoning_effort, and sandbox_mode inherit from the parent session when omitted [1].

name = "code_mapper"
description = "Read-only explorer that maps relevant files and execution paths before implementation."
model = "gpt-5.4-mini"
model_reasoning_effort = "medium"
sandbox_mode = "read-only"
developer_instructions = """
Stay read-only.
Identify relevant files, entry points, execution paths, and risk areas.
Return concise findings with file references. Do not edit files.
"""

After defining it, tell the parent to use a read-only subagent for mapping and final review, and to reserve background worktree threads for independent implementation. In practice, a stable setup often needs only three roles: explorer, worker, and reviewer.

Do not over-split roles. Too many roles increase the parent's consolidation burden and raise the human review cost.

Limits and Safety Conditions

For team use, pin independence, approvals, and merge order in the initial instruction.

Codex does not silently create child sessions without being asked. Spawning requires explicit instruction [1]. At the same time, the App can manage local project and worktree threads, and it supports explicit requests for separate background threads [2].

Parallel execution assumes independent work. Do not let multiple child threads edit the same files. Keep push, PR creation, dependency installation, network access, and out-of-repository access behind approval. Put those boundaries in the first instruction.
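The "no shared files" rule can be checked mechanically once each child thread has a planned edit scope. The sketch below assumes the plan is available as a mapping from a thread label to a set of file paths; that data shape, the labels, and `find_overlaps` are all hypothetical.

```python
from itertools import combinations


def find_overlaps(assignments: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return files claimed by more than one child thread, keyed by thread pair."""
    overlaps = {}
    # Compare every pair of threads once, in a stable alphabetical order.
    for (a, files_a), (b, files_b) in combinations(sorted(assignments.items()), 2):
        shared = files_a & files_b
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


plan = {
    "ui": {"web/app.tsx", "web/form.tsx"},
    "api": {"server/routes.py", "shared/schema.json"},
    "db": {"migrations/001.sql", "shared/schema.json"},
}
conflicts = find_overlaps(plan)
print(conflicts)
```

Any non-empty result means the work is not actually independent, so the overlapping files should move to a single child or stay with the parent.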

Do not treat consolidation as an afterthought. Creating worktrees is the easy part. The outputs still need to return to one coherent diff. Require the parent thread to report changed files, commands run, test results, remaining risks, and recommended merge order.
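One way to make the reporting requirement concrete is to fix the shape of the final report. The sketch below models the five items listed above as a data structure. `ChildReport` and `consolidate` are hypothetical names, and the report format is an assumption rather than anything Codex emits.

```python
from dataclasses import dataclass, field


@dataclass
class ChildReport:
    thread: str
    changed_files: list[str]
    commands_run: list[str]
    tests_passed: bool
    risks: list[str] = field(default_factory=list)


def consolidate(reports: list[ChildReport], merge_order: list[str]) -> dict:
    """Combine child reports into one summary for human review."""
    threads = {report.thread for report in reports}
    if set(merge_order) != threads:
        raise ValueError("merge_order must list every child thread exactly once")
    return {
        "changed_files": sorted({f for r in reports for f in r.changed_files}),
        "commands_run": [c for r in reports for c in r.commands_run],
        "failing_threads": [r.thread for r in reports if not r.tests_passed],
        "risks": [risk for r in reports for risk in r.risks],
        "merge_order": merge_order,
    }


summary = consolidate(
    [
        ChildReport("api", ["server/routes.py"], ["pytest server"], True),
        ChildReport("ui", ["web/app.tsx"], ["npm test"], False, ["flaky snapshot test"]),
    ],
    merge_order=["api", "ui"],
)
print(summary)
```

Rejecting a merge order that omits a thread keeps consolidation from silently dropping a child's work.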

Codex orchestration is not magic that happens in the background. It works when the parent thread is explicitly assigned responsibility for decomposition, delegation, and consolidation. Configuration narrows that behavior into a safer operating envelope.

Summary

The main rule is to define the parent thread's responsibility before tuning configuration. The parent owns decomposition, delegation, waiting, consolidation, and reporting. Child threads should receive bounded tasks such as exploration, implementation, tests, or review.

For team use, treat the startup prompt as a short operating contract. Define when delegation is allowed, which files must not overlap, which actions require approval, and what the parent must report at the end. With those boundaries in place, Codex parallelization protects context as much as it accelerates work.