Claude Code Auto Mode Complete Guide — How the Classifier Changes Auto-Approval¶
For / Key Points
For: Engineers who understand Claude Code's permission model and have felt the limitations of --dangerously-skip-permissions
Key Points:
- Auto mode uses a dedicated classifier model (Sonnet 4.6) to pre-screen every tool call, auto-executing only safe actions
- Unlike --dangerously-skip-permissions, prompt injection defenses are structurally built in
- This is a Research Preview; sandboxed environments are still recommended
March 2026 Status
| Component | Status |
|---|---|
| Auto mode | 2026-03-24 Research Preview (Team plan first) |
| Supported models | Claude Sonnet 4.6 / Opus 4.6 |
| Classifier model | Claude Sonnet 4.6 (fixed) |
| Enterprise / API | Rolling out within days |
Why Auto Mode Was Needed¶
Claude Code's default is conservative. It asks for approval on every file write and Bash command, making it impossible to delegate a large task and walk away.
On the other end, --dangerously-skip-permissions skips all checks. Outside sandboxed environments, bulk file deletion and secret exfiltration can execute without any guard.
Auto mode sits between these two extremes. It reduces approval prompts while limiting risk more than a full skip: a third option by design [1].
So how does this middle ground actually work?
How It Works — The Classifier Pre-Screens Actions¶
When auto mode is enabled, a classifier model intercepts every tool call before execution. Claude Sonnet 4.6 acts as a dedicated screener.
    User instruction → Claude generates tool call
      → Classifier (Sonnet 4.6) screens it
        → Safe → Auto-execute
        → Risky → Block → Claude explores alternatives
          → Repeated blocks → User gets approval prompt
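The flow above can be sketched as a small loop. This is only an illustration of the control flow, not the actual implementation: `ToolCall`, `run_with_screening`, the `is_safe` callback, and the `max_blocks` threshold are all hypothetical names, and the source does not say how many blocks count as "repeated".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ToolCall:
    tool: str     # e.g. "Bash" or "Write"
    command: str  # hypothetical flattened description of the call


def run_with_screening(
    attempts: Iterable[ToolCall],
    is_safe: Callable[[ToolCall], bool],  # stand-in for the classifier model
    max_blocks: int = 3,                  # assumed threshold, not from the source
) -> List[str]:
    """Record what happens to each successive tool-call attempt."""
    log: List[str] = []
    consecutive_blocks = 0
    for call in attempts:
        if is_safe(call):
            log.append(f"auto-execute: {call.command}")
            consecutive_blocks = 0
            continue
        # Blocked: in the real flow Claude would try an alternative next.
        consecutive_blocks += 1
        log.append(f"block: {call.command}")
        if consecutive_blocks >= max_blocks:
            log.append("ask user for approval")
            break
    return log
```

Each entry in `attempts` stands for one try by Claude, so a blocked call followed by another entry models "Claude explores alternatives".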
The classifier receives the user message and tool call content as input. Critically, Claude's text output and tool execution results are not passed to the classifier [2]. This structurally prevents malicious instructions embedded in files or web pages from manipulating the classifier through tool results.
The classifier also references CLAUDE.md content [2]. Project-specific instructions like "this directory contains production data" are factored into its safety assessment, enabling context-aware evaluation per project.
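To make the input boundary concrete, here is a minimal sketch of how a transcript might be filtered before it reaches the classifier. The `Event` type, the event kind names, and `build_classifier_input` are assumptions made for illustration; the real input format is not described in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    kind: str  # "user_message", "assistant_text", "tool_call", or "tool_result"
    text: str


# Only these kinds are forwarded; assistant text and tool results are dropped.
VISIBLE_KINDS = {"user_message", "tool_call"}


def build_classifier_input(events: List[Event], claude_md: str = "") -> str:
    """Assemble classifier input from CLAUDE.md, user messages, and tool calls."""
    parts: List[str] = []
    if claude_md:
        parts.append(f"[CLAUDE.md]\n{claude_md}")
    for event in events:
        if event.kind in VISIBLE_KINDS:
            parts.append(f"[{event.kind}]\n{event.text}")
    return "\n\n".join(parts)
```

Because a `tool_result` event never enters the assembled string, text planted in a fetched web page or a file has no path to the screener in this sketch.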
Classifier Decision Logic¶
What Gets Blocked, What Gets Through¶
The classifier does not use syntactic pattern matching like settings.json allow/deny rules. Instead, it evaluates natural-language-based block/allow rules contextually [2]. It screens actions across three dimensions:
- Task scope deviation — Is this action beyond what the user actually requested?
- Target trustworthiness — Is it trying to operate on infrastructure the classifier cannot verify as trusted?
- Intent shift — Are there signs that Claude was influenced by content it read?
By default, the working directory and (for git repos) configured remotes are treated as trusted [2].
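The default trust boundary can be pictured with a small sketch. The real classifier is a model that weighs context, so the deterministic helpers below (`is_trusted_path` and `is_trusted_remote`, both hypothetical) only illustrate what "inside the working directory" and "a configured remote" mean.

```python
from pathlib import Path
from typing import Set


def is_trusted_path(target: str, working_dir: str) -> bool:
    """True when target resolves to a location inside working_dir."""
    root = Path(working_dir).resolve()
    # Joining with an absolute target replaces the root, so "/etc/passwd"
    # is correctly treated as outside the working directory.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents


def is_trusted_remote(url: str, configured_remotes: Set[str]) -> bool:
    """True when url exactly matches a configured remote (ignoring a trailing slash)."""
    normalized = {remote.rstrip("/") for remote in configured_remotes}
    return url.rstrip("/") in normalized
```

Resolving the path before comparing means `../` segments cannot be used to step outside the working directory while appearing to stay inside it.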
Default Block Targets¶
The official documentation lists the following as default block targets for the classifier [1]:
- Bulk file deletion
- Sensitive data exfiltration
- Malicious code execution
The difference from pattern matching is clear. Rather than catching rm -rf / as a string, it evaluates at the intent level: "bulk deletion the user didn't request."
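A short sketch shows why string matching alone falls short. The deny pattern below is a hypothetical example rather than a rule taken from any real configuration; it catches the literal `rm -rf` form but misses other commands that also delete files.

```python
import re

# Hypothetical syntactic deny rule: match the literal "rm -rf" form only.
DENY_PATTERN = re.compile(r"\brm\s+-rf\b")


def pattern_blocks(command: str) -> bool:
    """True when the command matches the syntactic deny pattern."""
    return bool(DENY_PATTERN.search(command))


commands = [
    "rm -rf ./build",          # caught by the pattern
    "rm -r -f ./build",        # same effect, different spelling
    "find . -type f -delete",  # deletes files without using rm
    "git clean -fdx",          # removes untracked and ignored files
]

for command in commands:
    print(f"{command!r}: blocked={pattern_blocks(command)}")
```

Only the first command is flagged. An intent-level check asks whether the action amounts to deletion the user did not request, regardless of how the command is spelled.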
Subagent Behavior — Three-Stage Screening¶
Auto mode applies to Subagents as well, with screening at three stages: spawn, execution, and completion [2].
At spawn: The classifier screens the task description delegated to the Subagent. Dangerous tasks like "delete all remote branches" are blocked before the Subagent launches.
During execution: Each tool call within the Subagent is independently screened using the same block/allow rules as the parent session. The permissionMode defined in the Subagent's frontmatter is ignored.
At completion: The Subagent's full action history is reviewed. This detects cases where a Subagent that was safe at spawn time became contaminated by content read during execution. If issues are found, a security warning is attached to the Subagent's return result.
Each stage catches different risks: spawn catches explicitly dangerous tasks, execution catches individual operation risks, and completion catches contamination during execution. This design maintains Subagent autonomy while preserving safety nets.
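The three stages can be sketched as one function with three checkpoints. The `screen_task`, `screen_call`, and `review_history` callbacks are hypothetical stand-ins for classifier calls, and the warning text is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class SubagentResult:
    spawned: bool = True
    executed: List[str] = field(default_factory=list)
    blocked: List[str] = field(default_factory=list)
    warning: Optional[str] = None


def run_subagent(
    task: str,
    tool_calls: List[str],
    screen_task: Callable[[str], bool],           # True when the task looks safe
    screen_call: Callable[[str], bool],           # True when a tool call looks safe
    review_history: Callable[[List[str]], bool],  # True when the history looks clean
) -> SubagentResult:
    # Stage 1 (spawn): a dangerous task description never launches.
    if not screen_task(task):
        return SubagentResult(spawned=False)

    result = SubagentResult()
    # Stage 2 (execution): every tool call is screened independently.
    for call in tool_calls:
        if screen_call(call):
            result.executed.append(call)
        else:
            result.blocked.append(call)

    # Stage 3 (completion): review everything the subagent attempted.
    if not review_history(list(tool_calls)):
        result.warning = "security warning: subagent history needs review"
    return result
```

Note that a call can pass stage 2 on its own and still contribute to a warning at stage 3, which is the contamination case described above.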
Setup Instructions¶
CLI¶
    # Enable for the first time (one-time)
    claude --enable-auto-mode
    # Toggle modes in-session with Shift+Tab
    # normal → accept edits → plan → auto
To specify directly at startup:
    claude --permission-mode auto
VS Code / Desktop¶
Enable auto mode in Settings → Claude Code, then select it from the permission mode dropdown in your session [1].
On the Desktop app, it's disabled by default. Enable it explicitly from Organization Settings → Claude Code.
Organization-Level Control¶
    {
      "disableAutoMode": "disable"
    }
Setting this in managed settings disables auto mode across CLI and VS Code extensions [1].
Comparison with Existing Permission Methods¶
| Method | Approval Frequency | Safety | Prompt Injection Defense | Environment |
|---|---|---|---|---|
| Normal mode | Every time | Highest (human reviews all) | Depends on human judgment | All |
| Accept edits | Edits auto, Bash needs approval | High | Depends on human judgment | Trusted projects |
| Auto mode | Classifier decides auto/block | Medium-High | Classifier provides structural defense | Sandboxed recommended |
| --dangerously-skip-permissions | None | Low | None | Sandboxed required |
In one sentence: more autonomous than accept edits, safer than --dangerously-skip-permissions.
Impact on Existing Workflows¶
With auto mode's arrival, the workflow outlined in the Auto-Permission 3-Step Guide gets an update.
Mode switching: Shift+Tab now cycles through four modes with auto mode added.
--dangerously-skip-permissions: Outside fully automated CI/CD pipelines, auto mode is the better choice in many cases. Because auto mode is still a Research Preview, however, the traditional flag remains more predictable for production CI where stability is paramount.
Allow/deny lists: The classifier and settings.json deny/allow rules work together. Rules are evaluated in addition to classifier decisions, making them effective as defense-in-depth.
Hooks: These operate independently of auto mode. PreToolUse hooks still apply after classifier decisions, enabling more robust guardrails when combined.
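As one way to layer a hook on top of the classifier, here is a minimal sketch of PreToolUse decision logic. It assumes the hook receives a JSON payload with `tool_name` and `tool_input` fields and that exit code 2 blocks the call; check both against the hooks documentation before relying on them. The protected directory names and the `decide` and `handle` helpers are hypothetical.

```python
import json
from pathlib import PurePosixPath
from typing import Tuple

# Hypothetical directories that should never be written to automatically.
PROTECTED_DIRS = {"prod", "secrets"}


def decide(payload: dict) -> Tuple[int, str]:
    """Return (exit_code, message); exit code 2 is assumed to block the call."""
    tool = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})
    if tool in ("Write", "Edit"):
        # "file_path" is the assumed name of the path field for these tools.
        path = tool_input.get("file_path", "")
        if set(PurePosixPath(path).parts) & PROTECTED_DIRS:
            return 2, f"Blocked by hook: {path} touches a protected directory"
    return 0, ""


def handle(raw: str) -> Tuple[int, str]:
    """Parse the raw JSON payload and apply the decision logic."""
    return decide(json.loads(raw))
```

A wrapper script would pass the text read from standard input to `handle`, write the message to standard error, and exit with the returned code. Because this check is deterministic, it behaves the same whether or not the classifier allowed the call.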
Limitations and Caveats¶
Auto mode mitigates risk but does not eliminate it. Understand these limitations before adopting.
- Classifier false positives/negatives: When user intent is ambiguous or environment context is insufficient, dangerous actions may pass through (false negatives), and safe actions may be incorrectly blocked (false positives)
- Sandboxed environment still recommended: Running in containers or VMs remains the recommendation [1]
- Performance impact: Because the classifier intercepts every tool call, token consumption, cost, and latency all increase slightly
- Research Preview status: Auto mode is currently a research preview for Team plans, and improvements are ongoing
Summary¶
Auto mode is not a replacement for human approval — it is an aid. The classifier adds a new safety layer that expands your options, but ultimate responsibility remains with the user.
The adoption decision is straightforward: if accept edits gives you too many prompts and --dangerously-skip-permissions carries too much risk, auto mode in a sandboxed environment is worth trying.
Related Articles¶
- Auto-Permission 3-Step Guide — Foundational workflow before auto mode. Allow/deny lists and Hooks remain effective
- Claude Code Hooks Complete Guide — Reinforce auto mode's safety net with PreToolUse hooks
- Advanced Best Practices 2026 Edition — Combining with Subagents and context management