Claude Code Auto Mode Complete Guide — How the Classifier Changes Auto-Approval

For / Key Points

For: Engineers who understand Claude Code's permission model and have felt the limitations of --dangerously-skip-permissions

Key Points:

  • Auto mode uses a dedicated classifier model (Sonnet 4.6) to pre-screen every tool call, auto-executing only safe actions
  • Unlike --dangerously-skip-permissions, prompt injection defenses are structurally built in
  • This is a Research Preview — sandboxed environments are still recommended

March 2026 Status

Component | Status
Auto mode | 2026-03-24 Research Preview (Team plan first)
Supported models | Claude Sonnet 4.6 / Opus 4.6
Classifier model | Claude Sonnet 4.6 (fixed)
Enterprise / API | Rolling out within days

Why Auto Mode Was Needed

Claude Code's default is conservative. It asks for approval on every file write and Bash command, making it impossible to delegate a large task and walk away.

On the other end, --dangerously-skip-permissions skips all checks. Outside sandboxed environments, bulk file deletion and secret exfiltration can execute without any guard.

Auto mode sits between these two extremes. It reduces approval prompts while limiting risk more than a full skip, a third option by design [1].

So how does this middle ground actually work?

How It Works — The Classifier Pre-Screens Actions

When auto mode is enabled, a classifier model intercepts every tool call before execution. Claude Sonnet 4.6 acts as a dedicated screener.

User instruction → Claude generates tool call
  → Classifier (Sonnet 4.6) screens it
    → Safe → Auto-execute
    → Risky → Block → Claude explores alternatives
      → Repeated blocks → User gets approval prompt
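
The flow above can be sketched as a small loop. This is an illustration only, not Claude Code's implementation: `ToolCall`, `run_with_screening`, the `is_safe` and `ask_user` callbacks, and the threshold of three consecutive blocks are all hypothetical. The source says only that "repeated blocks" lead to an approval prompt.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ToolCall:
    tool: str
    argument: str


# Hypothetical threshold: the source says only "repeated blocks".
MAX_BLOCKS_BEFORE_PROMPT = 3


def run_with_screening(
    proposals: Iterable[ToolCall],
    is_safe: Callable[[ToolCall], bool],   # stand-in for the classifier
    ask_user: Callable[[ToolCall], bool],  # stand-in for the approval prompt
) -> list[str]:
    """Return a log of what happened to each proposed tool call."""
    log: list[str] = []
    consecutive_blocks = 0
    for call in proposals:
        if is_safe(call):
            log.append(f"auto-executed {call.tool}")
            consecutive_blocks = 0
            continue
        consecutive_blocks += 1
        if consecutive_blocks >= MAX_BLOCKS_BEFORE_PROMPT:
            # Escalate to the human after repeated blocks.
            verdict = "approved" if ask_user(call) else "denied"
            log.append(f"user {verdict} {call.tool}")
            consecutive_blocks = 0
        else:
            # In the real flow Claude would look for an alternative here.
            log.append(f"blocked {call.tool}")
    return log
```

The point of the sketch is the escalation order: the human is consulted only after the classifier has blocked several attempts in a row, not on every risky proposal.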

The classifier receives the user message and tool call content as input. Critically, Claude's text output and tool execution results are not passed to the classifier [2]. This structurally prevents malicious instructions embedded in files or web pages from manipulating the classifier through tool results.

The classifier also references CLAUDE.md content [2]. Project-specific instructions like "this directory contains production data" are factored into its safety assessment, enabling context-aware evaluation per project.
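
To make the input boundary concrete, here is a minimal sketch of how such a filter might look. The `Event` type, the `kind` labels, and `build_classifier_input` are hypothetical names invented for this example. Only the rule itself comes from the text above: user messages, tool calls, and CLAUDE.md go in, while assistant text and tool results stay out.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str  # "user_message", "assistant_text", "tool_call", or "tool_result"
    text: str


# Per the description above, only these kinds reach the classifier.
VISIBLE_KINDS = {"user_message", "tool_call"}


def build_classifier_input(history: list[Event], claude_md: str) -> dict:
    """Assemble what a classifier would see (hypothetical structure)."""
    visible = [e for e in history if e.kind in VISIBLE_KINDS]
    return {
        "project_instructions": claude_md,
        "events": [(e.kind, e.text) for e in visible],
    }
```

Because tool results are dropped before the classifier input is built, an instruction planted in a fetched web page or a file never appears in what the classifier evaluates.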

Classifier Decision Logic

What Gets Blocked, What Gets Through

The classifier does not use syntactic pattern matching like settings.json allow/deny rules. Instead, it evaluates natural-language-based block/allow rules contextually [2]. It screens actions across three dimensions:

  • Task scope deviation — Is this action beyond what the user actually requested?
  • Target trustworthiness — Is it trying to operate on infrastructure the classifier cannot verify as trusted?
  • Intent shift — Are there signs that Claude was influenced by content it read?

By default, the working directory and (for git repos) configured remotes are treated as trusted [2].
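
The default trust boundary can be pictured with a small sketch. Note the caveat: the classifier evaluates natural-language rules in context rather than running checks like these, so the code below only illustrates what "inside the working directory" and "a configured remote" mean. Both function names are hypothetical.

```python
from pathlib import Path


def is_inside_workdir(target: str, workdir: str) -> bool:
    """True if target resolves to the working directory or something beneath it."""
    root = Path(workdir).resolve()
    # Joining with an absolute target discards root, so "/etc/passwd" stays outside.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents


def is_configured_remote(url: str, remotes: set[str]) -> bool:
    """True if url is one of the repository's configured remotes (exact match)."""
    return url in remotes
```

Resolving the path first matters: a relative target such as `../secrets.env` looks local as a string but lands outside the working directory once resolved.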

Default Block Targets

The official documentation lists the following as default block targets for the classifier [1]:

  • Bulk file deletion
  • Sensitive data exfiltration
  • Malicious code execution

The difference from pattern matching is clear. Rather than catching rm -rf / as a string, it evaluates at the intent level: "bulk deletion the user didn't request."

Subagent Behavior — Three-Stage Screening

Auto mode applies to Subagents as well, with screening at three stages: spawn, execution, and completion [2].

At spawn: The classifier screens the task description delegated to the Subagent. Dangerous tasks like "delete all remote branches" are blocked before the Subagent launches.

During execution: Each tool call within the Subagent is independently screened using the same block/allow rules as the parent session. The permissionMode defined in the Subagent's frontmatter is ignored.

At completion: The Subagent's full action history is reviewed. This detects cases where a Subagent that was safe at spawn time became contaminated by content read during execution. If issues are found, a security warning is attached to the Subagent's return result.

Each stage catches different risks: spawn catches explicitly dangerous tasks, execution catches individual operation risks, and completion catches contamination during execution. This design maintains Subagent autonomy while preserving safety nets.
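
The three stages can be sketched as one function. Everything named here is hypothetical (`run_subagent`, the `screen` and `review_history` callbacks, the shape of the returned dict); the sketch only mirrors the order described above, not the actual implementation.

```python
from typing import Callable


def run_subagent(
    task: str,
    tool_calls: list[str],
    screen: Callable[[str], bool],                # stand-in: True means "looks safe"
    review_history: Callable[[list[str]], bool],  # stand-in: True means "history is clean"
) -> dict:
    # Stage 1 (spawn): screen the delegated task before anything runs.
    if not screen(task):
        return {"status": "blocked_at_spawn", "executed": [], "blocked": [], "warning": None}

    # Stage 2 (execution): screen each tool call independently.
    executed: list[str] = []
    blocked: list[str] = []
    for call in tool_calls:
        (executed if screen(call) else blocked).append(call)

    # Stage 3 (completion): review the full action history and attach
    # a warning to the result instead of discarding it.
    warning = None
    if not review_history(executed + blocked):
        warning = "security warning: review this subagent's actions"
    return {"status": "completed", "executed": executed, "blocked": blocked, "warning": warning}
```

The completion stage returns the result with a warning rather than failing outright, which matches the description that a security warning is attached to the Subagent's return value.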

Setup Instructions

CLI

# Enable for the first time (one-time)
claude --enable-auto-mode

# Toggle modes in-session with Shift+Tab
# normal → accept edits → plan → auto

To specify directly at startup:

claude --permission-mode auto

VS Code / Desktop

Enable auto mode in Settings → Claude Code, then select it from the permission mode dropdown in your session [1].

On the Desktop app, it's disabled by default. Enable it explicitly from Organization Settings → Claude Code.

Organization-Level Control

{
  "disableAutoMode": "disable"
}

Setting this in managed settings disables auto mode across CLI and VS Code extensions [1].

Comparison with Existing Permission Methods

Method | Approval Frequency | Safety | Prompt Injection Defense | Environment
Normal mode | Every time | Highest (human reviews all) | Depends on human judgment | All
Accept edits | Edits auto, Bash needs approval | High | Depends on human judgment | Trusted projects
Auto mode | Classifier decides auto/block | Medium-High | Classifier provides structural defense | Sandboxed recommended
--dangerously-skip-permissions | None | Low | None | Sandboxed required

In one sentence: more autonomous than accept edits, safer than --dangerously-skip-permissions.

Impact on Existing Workflows

With auto mode's arrival, the workflow outlined in the Auto-Permission 3-Step Guide gets an update.

Mode switching: Shift+Tab now cycles through four modes with auto mode added.

--dangerously-skip-permissions: Outside fully automated CI/CD pipelines, auto mode serves as a superior alternative in many cases. However, since auto mode is still a Research Preview, the traditional flag offers more predictability for production CI where stability is paramount.

Allow/deny lists: The classifier and settings.json deny/allow rules work together. Rules are evaluated in addition to classifier decisions, making them effective as defense-in-depth.

Hooks: These operate independently of auto mode. PreToolUse hooks still apply after classifier decisions, enabling more robust guardrails when combined.
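
As one example of a deterministic guardrail layered alongside the classifier, here is a minimal sketch of a PreToolUse hook. It assumes the hook contract as commonly documented: a JSON payload on stdin with `tool_name` and `tool_input`, and exit code 2 to block, with stderr returned to Claude. Verify both against the current hooks reference before relying on this. The regex is an illustrative pattern, not a complete deny list.

```python
#!/usr/bin/env python3
import json
import re
import sys

# Illustrative pattern only: "rm" whose first flag group contains both r and f.
DANGEROUS = re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+")


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message). Exit code 2 is assumed to block the call."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if DANGEROUS.search(command):
        return 2, "Blocked by hook: recursive forced delete"
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read the hook payload, print any block reason, and return the exit code."""
    code, message = decide(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code


# To use this file as a hook script, end it with: raise SystemExit(main())
```

A hook like this is syntactic by nature, so it complements the classifier rather than replacing it: the hook catches a known-bad string every time, while the classifier judges intent.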

Limitations and Caveats

Auto mode mitigates risk but does not eliminate it. Understand these limitations before adopting.

  • Classifier false positives/negatives: When user intent is ambiguous or environment context is insufficient, dangerous actions may pass through, and safe actions may be incorrectly blocked.
  • Sandboxed environment still recommended: Using containers or VMs remains the recommendation [1].
  • Performance impact: The classifier intercepts every tool call, which slightly increases token consumption, cost, and latency.
  • Research Preview status: Auto mode is currently a research preview for Team plans, and improvements are ongoing.

Summary

Auto mode is not a replacement for human approval — it is an aid. The classifier adds a new safety layer that expands your options, but ultimate responsibility remains with the user.

The adoption decision is straightforward: if accept edits gives you too many prompts and --dangerously-skip-permissions carries too much risk, auto mode in a sandboxed environment is worth trying.


  1. Anthropic, "Auto mode for Claude Code", 2026-03-24. https://claude.com/blog/auto-mode 

  2. Anthropic, "Choose a permission mode", Claude Code Docs. https://code.claude.com/docs/en/permission-modes