How to start Codex automation where humans only review PRs

For: Beginners who want Codex to reduce repeated work, but do not want to delegate auto-merge or production release decisions.

Key Points:

  • Start with one boundary: Codex may create a PR, but humans merge it
  • Define permissions, review gates, and stop conditions before scheduling work
  • A PR-first workflow gives you automation without removing human judgment

If a person repeats the same dependency update, article draft, or small cleanup every morning, Codex can probably help. That does not mean Codex should change, test, merge, and publish everything on day one. The question for this guide is simple: how do you delegate the work while keeping the final decision inside pull request review?

OpenAI's Codex Automations documentation says automations can start fresh runs on a schedule and report results in Triage, and that in Git repositories a run can use either the local project or a dedicated worktree [1]. That makes Codex a good fit for bringing a prepared PR candidate back to the team.


Draw the first line at "PR only"

The first line should be explicit: Codex creates the pull request, and a human decides whether to merge it.

For example, imagine a daily broken-link cleanup for documentation. Codex can detect the issue, edit the file, run the allowed checks, and open a PR. The human reviewer checks the diff, CI, PR body, and blast radius. Most of the routine work is automated, but the publishing decision stays in the review screen.

This boundary makes the workflow easier for beginners to trust. If something goes wrong, it lands in a pull request instead of directly on the main branch. You can stop, ask for a fix, or close the PR using normal GitHub habits.

The next step is to write the automation prompt as acceptance criteria, not just as a task request.

Write acceptance criteria into the prompt

Codex works better when the prompt defines what must be true before a PR is acceptable.

"Fix the article" is vague. "Only edit the target files, preserve the JP/EN pair, and skip PR creation if the quality score is below 80" gives Codex a stopping rule. For automation, stopping rules prevent more incidents than success descriptions do.

A beginner prompt should include three things.

  • Scope: the directories and files Codex may touch, and the maximum number of candidates per run
  • Checks: tests, quality gates, and CI signals Codex must run
  • Stop conditions: duplicates, weak evidence, missing credentials, or branch conflicts
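
The three items above can be turned into an explicit check that runs before a PR is opened. The following is a minimal Python sketch, not part of Codex itself: `RunResult`, its field names, and `stop_reasons` are hypothetical, and the threshold of 80 is borrowed from the example rule earlier in this section.

```python
from dataclasses import dataclass, field

QUALITY_THRESHOLD = 80  # taken from the example rule above; adjust per repository


@dataclass
class RunResult:
    """Hypothetical summary of one automation run (all field names are illustrative)."""
    out_of_scope_files: list[str] = field(default_factory=list)
    checks_passed: bool = True
    quality_score: int = 100
    is_duplicate: bool = False
    has_credentials: bool = True
    has_branch_conflict: bool = False


def stop_reasons(run: RunResult) -> list[str]:
    """Return every reason to stop; an empty list means a PR may be opened."""
    reasons = []
    # Scope: any edit outside the agreed files stops the run.
    if run.out_of_scope_files:
        reasons.append("edited files outside scope: " + ", ".join(run.out_of_scope_files))
    # Checks: required tests and gates must have passed.
    if not run.checks_passed:
        reasons.append("required checks failed")
    if run.quality_score < QUALITY_THRESHOLD:
        reasons.append(f"quality score {run.quality_score} is below {QUALITY_THRESHOLD}")
    # Stop conditions: duplicates, missing credentials, branch conflicts.
    if run.is_duplicate:
        reasons.append("duplicate of existing content")
    if not run.has_credentials:
        reasons.append("missing credentials")
    if run.has_branch_conflict:
        reasons.append("branch conflict")
    return reasons
```

Collecting all reasons instead of returning at the first failure gives the reviewer a complete picture of why a run was skipped.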

Codex's GitHub integration supports requesting a review with @codex review, and Codex can follow review guidance stored in the repository's AGENTS.md files [2]. That means the automation prompt is not the only place for rules. Your repository can also define what Codex should treat as important.

At this point, Codex is less like an unattended deployer and more like an operator that prepares reviewable PR candidates.

Narrow permissions before optimizing speed

For PR-first automation, narrower permissions are easier to operate than broad permissions.

The Codex Automations docs note that automations use the default sandbox settings. In read-only mode, file changes, network access, and app interactions that require modification are blocked; full access carries elevated risk [1].

The approvals and security docs show workspace-write with on-request approvals as a common setup: Codex can edit inside the workspace but asks for approval before acting beyond that boundary [3].

Beginners should usually start with a small permission surface.

  • Limit edit paths: begin with places such as docs/blog/, where impact is visible
  • Name external operations: be explicit about GitHub, network access, and secrets
  • Block destructive work: avoid history rewrites, force pushes, and broad deletion tasks
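
The first bullet, limiting edit paths, can also be enforced as a check on the changed files before a PR is opened. This is a small Python sketch under stated assumptions: `ALLOWED_PREFIXES`, `is_allowed`, and `rejected_paths` are hypothetical names, and the paths are assumed to be repo-relative with forward slashes, as Git reports them.

```python
import posixpath

# Assumption: docs/blog/ is the only area the automation may edit at first.
ALLOWED_PREFIXES = ("docs/blog/",)


def is_allowed(path: str) -> bool:
    """True if a repo-relative path stays inside an allowed directory."""
    # Normalizing first collapses "a/../b" segments, so a path cannot
    # escape the allowed directory by climbing back out of it.
    normalized = posixpath.normpath(path)
    return any(normalized.startswith(prefix) for prefix in ALLOWED_PREFIXES)


def rejected_paths(changed_files: list[str]) -> list[str]:
    """Return the changed files that fall outside every allowed directory."""
    return [p for p in changed_files if not is_allowed(p)]
```

For example, `rejected_paths(["docs/blog/post.md", "docs/blog/../../secrets.env"])` returns only the second path, because it normalizes to `secrets.env` at the repository root.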

You can widen permissions after the workflow succeeds several times. The first goal is not maximum speed. It is a workflow that stops cleanly when something is wrong.

Review how the automation stops

When reviewing a Codex-made PR, inspect more than the diff. Check whether the automation stopped in the right places.

Did it open a PR even though the topic duplicated an existing page? Did it mark the run successful even though the quality gate failed? Did it finish without waiting for CI? These questions matter more for recurring automation than for a one-off human PR, because the same mistake can repeat tomorrow.

Keep the review checklist short enough to use every day.

  • Input: today's candidate, target files, and source evidence are appropriate
  • Process: conflict checks, quality gates, and CI checks actually ran
  • Output: PR body, logs, and skip reasons help the next run make a better decision
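
The questions earlier in this section can be written down as a small consistency check over a run summary. This Python sketch is only an illustration: the dictionary keys and the `audit_run` function are hypothetical, and a real setup would read these values from its own logs and CI status.

```python
def audit_run(run: dict) -> list[str]:
    """Flag runs whose reported outcome contradicts their own checks.

    Expected keys (all hypothetical): pr_opened, duplicate_found,
    reported_success, quality_gate_passed, ci_status.
    """
    findings = []
    if run["pr_opened"] and run["duplicate_found"]:
        findings.append("PR opened although the topic duplicates an existing page")
    if run["reported_success"] and not run["quality_gate_passed"]:
        findings.append("run reported success although the quality gate failed")
    if run["reported_success"] and run["ci_status"] == "pending":
        findings.append("run finished without waiting for CI")
    return findings
```

An empty result does not prove the change is good. It only shows that the automation stopped, or continued, in the places the reviewer expected.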

Codex code review for GitHub reads the PR diff, follows repository guidance, and posts a standard GitHub review focused on serious issues [2]. If you combine human review with Codex review, give them different jobs.

Humans decide whether the change should merge. Codex provides another pass for risks that are easy to miss.

Summary: make the PR the control point

The first useful Codex automation is not a fully automatic publishing line. It is a small loop that reads the same input each day, makes a change only when the conditions are met, and opens a PR only when the result is reviewable.

In that loop, the pull request is not just a formality. It is the control point. Weak evidence, duplicate work, low quality, or failing CI should stop there. When those failures stop before merge, Codex becomes a tool for preparing review-ready work instead of a risky auto-merge system.

The next improvement is to log both successful PRs and skipped candidates. Automation quality improves not only from the diffs that merged, but also from the reasons a run correctly stopped.
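
One way to keep such a log is to write one JSON line per run. The sketch below assumes a simple record shape: the `run_record` function, the field names, and the two outcome values are illustrative choices, not a Codex feature.

```python
import json
from datetime import datetime, timezone

VALID_OUTCOMES = {"pr_opened", "skipped"}  # hypothetical outcome labels


def run_record(candidate: str, outcome: str, reason: str = "") -> str:
    """Serialize one automation run as a single JSON line."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome}")
    # A skip without a reason cannot teach the next run anything.
    if outcome == "skipped" and not reason:
        raise ValueError("a skipped run must record its reason")
    return json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "candidate": candidate,
        "outcome": outcome,
        "reason": reason,
    })
```

Appending each returned line to a file gives a history in which skipped candidates sit next to opened PRs, so both can be reviewed together.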