AI Agent Production Deployment Implementation Guide August 2025 - Enterprise-Ready Practical Architecture
Table of Contents
- Production Architecture Design
- Claude Code Auto-Approval Setup
- GitHub Actions CI/CD Integration
- cron Automation System
- Error Handling & Monitoring
- Security & Permission Management
- Scaling & Performance Optimization
- Troubleshooting
Production Architecture Design
Overall System Architecture
graph TD
A[GitHub Repository] --> B[GitHub Actions]
B --> C[Claude Code Agent]
C --> D[Auto-Approval System]
D --> E[Execution Monitoring]
E --> F[Slack/Teams Notifications]
G[cron] --> H[Scheduled Execution Trigger]
H --> C
I[Error Logs] --> J[Monitoring Dashboard]
J --> K[Alert Notifications]
Design Principles
Enterprise Compliance Principles
- Fail-safe: Implement automatic recovery features for all processes
- Audit Trail: Persistent logging and searchable audit trails for all executions
- Permission Separation: Secure by design, following the principle of least privilege
- Scalability: Horizontal scaling-ready architecture
Claude Code Auto-Approval Setup
1. Basic Auto-Approval Configuration
# Create Claude Code configuration directory
mkdir -p ~/.claude
cd ~/.claude
# Create auto-approval configuration file
cat > auto_approval_config.json << 'EOF'
{
"auto_approval": {
"enabled": true,
"rules": {
"file_operations": {
"read": true,
"write": true,
"create": true
},
"bash_commands": {
"safe_commands": true,
"package_management": true,
"git_operations": true
},
"web_requests": {
"api_calls": true,
"documentation_fetch": true
}
},
"restrictions": {
"system_files": false,
"sensitive_directories": ["/etc", "/root", "/home/*/.ssh"],
"dangerous_commands": ["rm -rf", "format", "dd"]
}
}
}
EOF
# Set permissions
chmod 600 auto_approval_config.json
2. Environment-Specific Configuration Management
# Production environment configuration
cat > ~/.claude/production_config.json << 'EOF'
{
"environment": "production",
"auto_approval": {
"enabled": true,
"audit_mode": true,
"restrictions": {
"require_confirmation": [
"database_operations",
"external_api_calls",
"file_deletions"
]
}
},
"logging": {
"level": "INFO",
"audit_trail": true,
"retention_days": 90
}
}
EOF
# Development environment configuration
cat > ~/.claude/development_config.json << 'EOF'
{
"environment": "development",
"auto_approval": {
"enabled": true,
"audit_mode": false
},
"logging": {
"level": "DEBUG",
"retention_days": 30
}
}
EOF
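One way a wrapper script could consume these per-environment files is sketched below. It is an illustration only: `load_environment_config` and `requires_confirmation` are hypothetical helper names, and the sketch assumes the `<environment>_config.json` naming and the keys shown above. It does not describe how Claude Code itself reads configuration.

```python
import json
from pathlib import Path


def load_environment_config(environment: str, config_dir: Path) -> dict:
    """Read <environment>_config.json from config_dir and sanity-check it."""
    path = config_dir / f"{environment}_config.json"
    with path.open(encoding="utf-8") as handle:
        config = json.load(handle)
    # Refuse a file whose declared environment disagrees with its file name.
    declared = config.get("environment")
    if declared != environment:
        raise ValueError(f"{path} declares environment {declared!r}")
    return config


def requires_confirmation(config: dict, operation: str) -> bool:
    """True if the operation is listed under restrictions.require_confirmation."""
    restrictions = config.get("auto_approval", {}).get("restrictions", {})
    return operation in restrictions.get("require_confirmation", [])
```

With the production file above, `requires_confirmation(config, "file_deletions")` would be true, while the development file lists no such operations and would always return false.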
3. Security Whitelist
# Allowed commands list
cat > ~/.claude/whitelist_commands.txt << 'EOF'
# Git operations
git status
git add
git commit
git push
git pull
# Package management
npm install
npm run
pip install
yarn install
# File operations
ls
cat
grep
find
mkdir
cp
mv
# System monitoring
ps
top
df
du
netstat
EOF
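The whitelist file is only useful if something checks commands against it. The following is a minimal sketch of such a check, assuming each non-comment line is a command prefix; `parse_whitelist` and `is_whitelisted` are hypothetical names, not part of Claude Code.

```python
import shlex

# Characters that let one shell line run more than one command.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def parse_whitelist(text: str) -> list[list[str]]:
    """Turn the whitelist file into token lists, skipping comments and blanks."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line and not line.startswith("#"):
            entries.append(line.split())
    return entries


def is_whitelisted(command: str, whitelist: list[list[str]]) -> bool:
    """Accept a command only if it starts with a whitelisted token sequence."""
    # Reject chaining, substitution and redirection outright.
    if SHELL_METACHARACTERS & set(command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes
        return False
    return any(tokens[: len(entry)] == entry for entry in whitelist)
```

A prefix match cannot vet arguments: `find . -delete` still passes because `find` is listed, so entries such as `find`, `cp` and `mv` deserve a second look before being whitelisted in production.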
GitHub Actions CI/CD Integration
1. Claude Code Execution Workflow
# .github/workflows/claude-agent-automation.yml
name: Claude Code Agent Automation
on:
  schedule:
    - cron: '0 6 * * *' # Execute daily at 06:00 UTC
  workflow_dispatch:
    inputs:
      task_description:
        description: 'Agent task description'
        required: true
        default: 'Daily maintenance and optimization'
      environment:
        description: 'Target environment'
        required: true
        default: 'production'
        type: choice
        options:
          - production
          - staging
          - development
jobs:
  claude-agent-execution:
    runs-on: ubuntu-latest
    environment: ${{ github.event.inputs.environment || 'production' }}
    permissions:
      contents: write # required by the "Commit Changes" step to push
    env:
      # Inputs are passed as environment variables so they never become shell source text
      TASK_DESCRIPTION: ${{ github.event.inputs.task_description || 'Daily maintenance and optimization' }}
      TARGET_ENVIRONMENT: ${{ github.event.inputs.environment || 'production' }}
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          fetch-depth: 0
      - name: Setup Claude Code
        env:
          CLAUDE_CONFIG: ${{ secrets.CLAUDE_CONFIG }}
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          # Install Claude Code
          curl -fsSL https://claude.ai/install.sh | bash
          echo "Claude Code installed successfully"
          # Deploy configuration files (printf keeps quotes in the JSON intact)
          mkdir -p ~/.claude
          printf '%s\n' "$CLAUDE_CONFIG" > ~/.claude/config.json
          # Authentication setup
          printf '%s\n' "$ANTHROPIC_API_KEY" > ~/.claude/api_key
          chmod 600 ~/.claude/api_key
      - name: Execute Claude Agent Task
        id: claude_execution
        run: |
          # Initialize task execution log
          echo "Starting Claude Agent execution at $(date)" > execution.log
          # Execute Claude Code. pipefail makes the pipeline report claude's
          # exit status instead of tee's.
          set -o pipefail
          if claude --auto-approve \
              --task "$TASK_DESCRIPTION" \
              --environment "$TARGET_ENVIRONMENT" \
              --output-format json \
              2>&1 | tee -a execution.log; then
            echo "execution_status=success" >> "$GITHUB_OUTPUT"
            echo "✅ Claude Agent execution completed successfully" >> execution.log
          else
            echo "execution_status=failed" >> "$GITHUB_OUTPUT"
            echo "❌ Claude Agent execution failed" >> execution.log
            exit 1
          fi
      - name: Process Execution Results
        if: always()
        run: |
          # Structure execution logs
          echo "## Execution Summary" > summary.md
          echo "- **Status**: ${{ steps.claude_execution.outputs.execution_status }}" >> summary.md
          echo "- **Environment**: $TARGET_ENVIRONMENT" >> summary.md
          echo "- **Timestamp**: $(date -u)" >> summary.md
          echo "- **Commit**: ${{ github.sha }}" >> summary.md
          # Save log files as artifacts
          mkdir -p artifacts
          cp execution.log artifacts/
          cp summary.md artifacts/
      - name: Upload Execution Artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: claude-execution-logs-${{ github.run_number }}
          path: artifacts/
          retention-days: 30
      - name: Commit Changes
        if: steps.claude_execution.outputs.execution_status == 'success'
        run: |
          git config --local user.email "claude-agent@example.com"
          git config --local user.name "Claude Agent"
          # Check if there are changes
          if [[ -n "$(git status --porcelain)" ]]; then
            git add .
            message=$(printf '%s\n' \
              "🤖 Claude Agent automation - $TASK_DESCRIPTION" \
              "" \
              "Environment: $TARGET_ENVIRONMENT" \
              "Execution ID: ${{ github.run_number }}" \
              "Timestamp: $(date -u)" \
              "" \
              "🛠️ Generated with [Claude Code](https://claude.ai/code)" \
              "" \
              "Co-Authored-By: Claude <noreply@anthropic.com>")
            git commit -m "$message"
            git push
            echo "✅ Changes committed and pushed successfully"
          else
            echo "ℹ️ No changes to commit"
          fi
      - name: Notify Results
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          channel: '#claude-automation'
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}
          fields: repo,message,commit,author,action,eventName,ref,workflow
          custom_payload: |
            {
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": "🤖 Claude Agent Execution ${{ steps.claude_execution.outputs.execution_status == 'success' && '✅ Completed' || '❌ Failed' }}\n*Environment:* ${{ github.event.inputs.environment || 'production' }}\n*Task:* ${{ github.event.inputs.task_description || 'Daily maintenance' }}"
                  }
                }
              ]
            }
2. Environment-Specific Deployment
# .github/workflows/environment-specific-deployment.yml
name: Environment-Specific Claude Deployment
on:
  push:
    branches: [main, develop, staging]
jobs:
  determine-environment:
    runs-on: ubuntu-latest
    outputs:
      environment: ${{ steps.env.outputs.environment }}
    steps:
      - id: env
        run: |
          if [[ "${{ github.ref }}" == "refs/heads/main" ]]; then
            echo "environment=production" >> "$GITHUB_OUTPUT"
          elif [[ "${{ github.ref }}" == "refs/heads/staging" ]]; then
            echo "environment=staging" >> "$GITHUB_OUTPUT"
          else
            echo "environment=development" >> "$GITHUB_OUTPUT"
          fi
  deploy-claude-agent:
    needs: determine-environment
    runs-on: ubuntu-latest
    environment: ${{ needs.determine-environment.outputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to ${{ needs.determine-environment.outputs.environment }}
        run: |
          echo "Deploying Claude Agent to ${{ needs.determine-environment.outputs.environment }}"
          # Apply environment-specific configuration
          case "${{ needs.determine-environment.outputs.environment }}" in
            production)
              echo "Applying production configuration"
              ;;
            staging)
              echo "Applying staging configuration"
              ;;
            development)
              echo "Applying development configuration"
              ;;
          esac
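The branch-to-environment rule in `determine-environment` is small enough to express as a lookup table, which makes it easy to unit-test outside the workflow. This is a sketch of the same mapping; `environment_for_ref` is a hypothetical helper name.

```python
# Branch refs with a dedicated environment; every other ref is development.
BRANCH_ENVIRONMENTS = {
    "refs/heads/main": "production",
    "refs/heads/staging": "staging",
}


def environment_for_ref(ref: str) -> str:
    """Mirror the workflow: main -> production, staging -> staging, else development."""
    return BRANCH_ENVIRONMENTS.get(ref, "development")
```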
cron Automation System
1. System cron Configuration
# cron configuration script
cat > /etc/cron.d/claude-agent << 'EOF'
# Claude Agent automatic execution schedule
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
# Daily at 6 AM: Daily maintenance
0 6 * * * claude /opt/claude-agent/scripts/daily-maintenance.sh >> /var/log/claude-agent/daily.log 2>&1
# Hourly: Health check
0 * * * * claude /opt/claude-agent/scripts/health-check.sh >> /var/log/claude-agent/health.log 2>&1
# Every Monday at 1 AM: Weekly report generation
0 1 * * 1 claude /opt/claude-agent/scripts/weekly-report.sh >> /var/log/claude-agent/weekly.log 2>&1
# 1st of every month at 2 AM: Monthly optimization
0 2 1 * * claude /opt/claude-agent/scripts/monthly-optimization.sh >> /var/log/claude-agent/monthly.log 2>&1
EOF
# Set permissions
chmod 644 /etc/cron.d/claude-agent
2. Execution Script Collection
# Daily maintenance script
cat > /opt/claude-agent/scripts/daily-maintenance.sh << 'EOF'
#!/bin/bash
set -euo pipefail
# Log configuration
LOG_FILE="/var/log/claude-agent/daily-$(date +%Y%m%d).log"
exec 1> >(tee -a "$LOG_FILE")
exec 2>&1
echo "Starting daily maintenance at $(date)"
# Pre-execution checks for Claude Code
if ! command -v claude &> /dev/null; then
  echo "❌ Claude Code not found"
  exit 1
fi
# Authentication verification
if ! claude auth status &> /dev/null; then
  echo "❌ Claude authentication failed"
  exit 1
fi
# Execute maintenance tasks
claude --auto-approve \
--task "Daily system maintenance:
1. Check system health and performance
2. Update dependencies and packages
3. Clean temporary files and logs
4. Generate system status report
5. Optimize database and cache" \
--timeout 3600 \
--retry 3
echo "✅ Daily maintenance completed at $(date)"
EOF
# Health check script
cat > /opt/claude-agent/scripts/health-check.sh << 'EOF'
#!/bin/bash
set -euo pipefail
LOG_FILE="/var/log/claude-agent/health-$(date +%Y%m%d).log"
exec 1> >(tee -a "$LOG_FILE")
exec 2>&1
echo "Starting health check at $(date)"
# System resource checks
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2+$4}' | cut -d'%' -f1)
MEMORY_USAGE=$(free | grep Mem | awk '{printf("%.1f", $3/$2 * 100.0)}')
DISK_USAGE=$(df -h / | awk 'NR==2{print $5}' | sed 's/%//')
echo "System Resources:"
echo " CPU Usage: ${CPU_USAGE}%"
echo " Memory Usage: ${MEMORY_USAGE}%"
echo " Disk Usage: ${DISK_USAGE}%"
# Alert threshold checks
if (( $(echo "$CPU_USAGE > 80" | bc -l) )); then
  echo "⚠️ High CPU usage detected: ${CPU_USAGE}%"
fi
if (( $(echo "$MEMORY_USAGE > 85" | bc -l) )); then
  echo "⚠️ High memory usage detected: ${MEMORY_USAGE}%"
fi
if [ "$DISK_USAGE" -gt 90 ]; then
  echo "⚠️ High disk usage detected: ${DISK_USAGE}%"
fi
echo "✅ Health check completed at $(date)"
EOF
# Grant execution permissions
chmod +x /opt/claude-agent/scripts/*.sh
# Create log directory
mkdir -p /var/log/claude-agent
chown claude:claude /var/log/claude-agent
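The threshold logic in the health check (CPU above 80%, memory above 85%, disk above 90%) can be kept separate from metric collection so it is testable on its own. The sketch below assumes usage figures are already available as percentages; `threshold_warnings` is a hypothetical name.

```python
# Limits taken from the health-check script: warn when usage exceeds them.
THRESHOLDS = {"cpu": 80.0, "memory": 85.0, "disk": 90.0}


def threshold_warnings(usage: dict[str, float],
                       thresholds: dict[str, float] = THRESHOLDS) -> list[str]:
    """Return one warning per metric whose usage is strictly above its limit."""
    warnings = []
    for name, limit in thresholds.items():
        value = usage.get(name)
        if value is not None and value > limit:
            warnings.append(f"High {name} usage detected: {value:.1f}%")
    return warnings
```

As in the shell script, a value exactly at the limit (for example 90% disk) does not trigger a warning.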
3. Advanced Scheduling
# systemd timer configuration (more advanced scheduling)
cat > /etc/systemd/system/claude-agent-daily.service << 'EOF'
[Unit]
Description=Claude Agent Daily Maintenance
After=network.target
[Service]
Type=oneshot
User=claude
ExecStart=/opt/claude-agent/scripts/daily-maintenance.sh
StandardOutput=journal
StandardError=journal
EOF
cat > /etc/systemd/system/claude-agent-daily.timer << 'EOF'
[Unit]
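Description=Run Claude Agent Daily Maintenance
# No Requires= on the service: the timer already activates the service of the
# same name, and Requires= would also start it whenever the timer itself starts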
[Timer]
OnCalendar=daily
Persistent=true
RandomizedDelaySec=300
[Install]
WantedBy=timers.target
EOF
# Enable timer
systemctl enable claude-agent-daily.timer
systemctl start claude-agent-daily.timer
Error Handling & Monitoring
1. Error Handling Mechanisms
# Common error handling functions
cat > /opt/claude-agent/lib/error-handling.sh << 'EOF'
#!/bin/bash
# Error handling configuration
set -euo pipefail
IFS=$'\n\t'
# Logging functions
log_error() {
  echo "❌ [ERROR] $(date): $1" >&2
  logger -t claude-agent "ERROR: $1"
}
log_warn() {
  echo "⚠️ [WARN] $(date): $1" >&2
  logger -t claude-agent "WARN: $1"
}
log_info() {
  echo "ℹ️ [INFO] $(date): $1"
  logger -t claude-agent "INFO: $1"
}
# Retry functionality
retry_command() {
local max_attempts=$1
local delay=$2
local command="${*:3}"
local attempt=1
while [ $attempt -le $max_attempts ]; do
log_info "Attempt $attempt/$max_attempts: $command"
if eval "$command"; then
log_info "Command succeeded on attempt $attempt"
return 0
else
log_warn "Command failed on attempt $attempt"
if [ $attempt -lt $max_attempts ]; then
log_info "Retrying in $delay seconds..."
sleep $delay
fi
fi
((attempt++))
done
log_error "Command failed after $max_attempts attempts: $command"
return 1
}
# Emergency stop functionality
emergency_stop() {
log_error "Emergency stop triggered: $1"
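# Stop running Claude Code processes. Match the exact process name (assumed to
# be "claude"): -f would also match this script's own path under /opt/claude-agent
pkill -x claude || true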
# Send alert
send_alert "EMERGENCY" "Claude Agent emergency stop: $1"
exit 1
}
# Alert sending
send_alert() {
local severity=$1
local message=$2
# Slack notification
if [ -n "${SLACK_WEBHOOK:-}" ]; then
curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"[$severity] Claude Agent: $message\"}" \
"$SLACK_WEBHOOK" || true
fi
# Email notification
if [ -n "${ALERT_EMAIL:-}" ]; then
echo "$message" | mail -s "[$severity] Claude Agent Alert" "$ALERT_EMAIL" || true
fi
}
# Trap configuration
trap 'log_error "Script interrupted"; exit 130' INT TERM
trap 'log_error "Script failed on line $LINENO"' ERR
EOF
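For Python-based tooling around the agent, the fixed-delay retry in `retry_command` can be mirrored as follows. This is a sketch under the same semantics (at most `max_attempts` tries, a pause between failures, the last error propagated); `retry` is a hypothetical helper, and the injectable `sleep` parameter exists only to make it testable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], max_attempts: int, delay_seconds: float,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Re-raise the final failure; otherwise wait and try again.
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)
    raise AssertionError("unreachable")
```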
2. Monitoring Dashboard
# monitoring/dashboard.py
import json
import datetime
from pathlib import Path
import streamlit as st
import pandas as pd
import plotly.express as px


class ClaudeAgentMonitoring:
    def __init__(self):
        self.log_dir = Path("/var/log/claude-agent")

    def load_execution_logs(self):
        """Load execution logs"""
        logs = []
        for log_file in self.log_dir.glob("*.log"):
            try:
                with open(log_file, 'r') as f:
                    for line in f:
                        if "Starting" in line or "completed" in line:
                            logs.append({
                                'timestamp': self.extract_timestamp(line),
                                'type': self.extract_type(line),
                                'status': self.extract_status(line),
                                'file': log_file.name
                            })
            except Exception as e:
                st.error(f"Log file read error: {e}")
        return pd.DataFrame(logs)

    def extract_timestamp(self, line):
        """Extract the timestamp the scripts print after ' at '"""
        _, found, stamp = line.rpartition(" at ")
        if not found:
            return pd.NaT
        # Unparseable timestamps become NaT instead of raising
        return pd.to_datetime(stamp.strip(), errors="coerce")

    def extract_type(self, line):
        """Classify the line as a start or completion marker"""
        return "start" if "Starting" in line else "completion"

    def extract_status(self, line):
        """Report 'completed' for completion lines, 'started' otherwise"""
        return "completed" if "completed" in line else "started"

    def create_dashboard(self):
        """Create Streamlit dashboard"""
        st.title("🤖 Claude Agent Monitoring Dashboard")
        # Display metrics (placeholder values; compute these from the logs in practice)
        col1, col2, col3, col4 = st.columns(4)
        with col1:
            st.metric("Today's Executions", "12", "2")
        with col2:
            st.metric("Success Rate", "95%", "3%")
        with col3:
            st.metric("Avg Execution Time", "2m 34s", "-15s")
        with col4:
            st.metric("Error Rate", "5%", "-2%")
        # Execution history graph
        df = self.load_execution_logs()
        if not df.empty:
            fig = px.line(df, x='timestamp', y='status',
                          title='Claude Agent Execution Status')
            st.plotly_chart(fig, use_container_width=True)
        # Latest log display
        st.subheader("Latest Execution Logs")
        if not df.empty:
            st.dataframe(df.tail(10), use_container_width=True)


if __name__ == "__main__":
    monitor = ClaudeAgentMonitoring()
    monitor.create_dashboard()
3. Alert Configuration
# monitoring/alerting-rules.yml
groups:
  - name: claude-agent-alerts
    rules:
      - alert: ClaudeAgentExecutionFailed
        expr: claude_agent_execution_success_rate < 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Claude Agent execution success rate decreased"
          description: "Claude Agent execution success rate has stayed below 90% for 5 minutes"
      - alert: ClaudeAgentHighMemoryUsage
        expr: claude_agent_memory_usage > 0.85
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Claude Agent high memory usage"
          description: "Claude Agent memory usage exceeds 85%"
      - alert: ClaudeAgentNotResponding
        expr: up{job="claude-agent"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Claude Agent is not responding"
          description: "Claude Agent process has not responded for 1 minute"
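The first rule depends on a `claude_agent_execution_success_rate` metric that something has to compute. As a sketch of the underlying arithmetic only (how the value is exported to the monitoring system is left open), the rate over a window of run outcomes and the below-90% condition look like this; both function names are hypothetical.

```python
def success_rate(outcomes: list[bool]) -> float:
    """Fraction of successful runs in the window; an empty window counts as healthy."""
    if not outcomes:
        return 1.0
    return sum(outcomes) / len(outcomes)


def breaches_threshold(outcomes: list[bool], threshold: float = 0.9) -> bool:
    """True when the success rate is strictly below the alert threshold."""
    return success_rate(outcomes) < threshold
```

Nine successes out of ten gives exactly 0.9, which does not breach the rule because the comparison is strict.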
Security & Permission Management
1. IAM Configuration
{
"Version": "2012-10-17",
"Statement": [
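{
"Sid": "ClaudeAgentObjectAccess",
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": "arn:aws:s3:::claude-agent-bucket/*"
},
{
"Sid": "ClaudeAgentBucketListing",
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::claude-agent-bucket"
},
{
"Sid": "ClaudeAgentLogging",
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "arn:aws:logs:*:ACCOUNT:log-group:claude-agent*"
},
{
"Sid": "ClaudeAgentMetrics",
"Effect": "Allow",
"Action": "cloudwatch:PutMetricData",
"Resource": "*",
"Condition": {
"StringEquals": {
"cloudwatch:namespace": "ClaudeAgent"
}
}
},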
{
"Sid": "DenyDangerousOperations",
"Effect": "Deny",
"Action": [
"iam:*",
"ec2:TerminateInstances",
"rds:DeleteDBInstance",
"s3:DeleteBucket"
],
"Resource": "*"
}
]
}
2. Security Audit
# security/audit-script.sh
#!/bin/bash
echo "Starting Claude Agent security audit"
# API key protection check
echo "API Key Protection Status:"
if [ -f ~/.claude/api_key ]; then
  PERMISSIONS=$(stat -c "%a" ~/.claude/api_key)
  if [ "$PERMISSIONS" = "600" ]; then
    echo "✅ API key permissions are properly set (600)"
  else
    echo "❌ API key permissions are inappropriate ($PERMISSIONS)"
    echo "Fix: chmod 600 ~/.claude/api_key"
  fi
else
  echo "⚠️ API key file not found"
fi
# Configuration file permissions check
echo "Configuration File Permissions:"
find ~/.claude -type f -name "*.json" | while IFS= read -r file; do
  PERMISSIONS=$(stat -c "%a" "$file")
  if [ "$PERMISSIONS" = "600" ] || [ "$PERMISSIONS" = "644" ]; then
    echo "✅ $file ($PERMISSIONS)"
  else
    echo "❌ $file ($PERMISSIONS) - Please fix permissions"
  fi
done
# Log file permissions check
echo "Log File Permissions:"
if [ -d /var/log/claude-agent ]; then
  OWNER=$(stat -c "%U:%G" /var/log/claude-agent)
  echo "Log directory owner: $OWNER"
  find /var/log/claude-agent -type f | head -5 | while IFS= read -r logfile; do
    PERMISSIONS=$(stat -c "%a" "$logfile")
    echo "$(basename "$logfile"): $PERMISSIONS"
  done
fi
echo "✅ Security audit completed"
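The same permission check can be run from Python, which avoids depending on GNU `stat -c`. The sketch below assumes a POSIX filesystem; `audit_permissions` is a hypothetical helper and the default allow-list of `0o600` is an assumption chosen to match the API key rule above.

```python
import os
import stat
from pathlib import Path
from typing import Iterable


def file_mode(path: Path) -> int:
    """Permission bits of a file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)


def audit_permissions(paths: Iterable[Path],
                      allowed_modes: frozenset = frozenset({0o600})) -> dict:
    """Map each offending path to its octal mode; an empty dict means all passed."""
    findings = {}
    for path in paths:
        mode = file_mode(path)
        if mode not in allowed_modes:
            findings[str(path)] = oct(mode)
    return findings
```

Calling `audit_permissions(Path.home().joinpath(".claude").glob("*.json"))` would list every JSON file under `~/.claude` whose mode is not 600.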
Scaling & Performance Optimization
1. Horizontal Scaling Configuration
# k8s/claude-agent-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: claude-agent
  namespace: ai-automation
spec:
  replicas: 3
  selector:
    matchLabels:
      app: claude-agent
  template:
    metadata:
      labels:
        app: claude-agent
    spec:
      containers:
        - name: claude-agent
          image: claude-agent:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: claude-secrets
                  key: api-key
            - name: ENVIRONMENT
              value: "production"
          volumeMounts:
            - name: config
              mountPath: /app/config
            - name: logs
              mountPath: /app/logs
      volumes:
        - name: config
          configMap:
            name: claude-agent-config
        - name: logs
          persistentVolumeClaim:
            claimName: claude-agent-logs
---
apiVersion: v1
kind: Service
metadata:
  name: claude-agent-service
  namespace: ai-automation # same namespace as the Deployment it selects
spec:
  selector:
    app: claude-agent
  ports:
    - port: 8080
      targetPort: 8080
  type: ClusterIP
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: claude-agent-hpa
  namespace: ai-automation # must match the Deployment in scaleTargetRef
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: claude-agent
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
2. Performance Optimization
# optimization/performance-tuner.py
import asyncio
import aiohttp
import time
from concurrent.futures import ThreadPoolExecutor
import logging


class ClaudeAgentOptimizer:
    def __init__(self):
        self.logger = logging.getLogger(__name__)
        self.executor = ThreadPoolExecutor(max_workers=4)
        # Strong references keep background tasks from being garbage-collected
        self.background_tasks = set()

    async def optimize_execution(self, tasks):
        """Task parallel execution optimization"""
        # Priority-based task classification
        high_priority = [t for t in tasks if t.get('priority') == 'high']
        normal_priority = [t for t in tasks if t.get('priority') == 'normal']
        low_priority = [t for t in tasks if t.get('priority') == 'low']
        # Execute high priority tasks first
        results = []
        if high_priority:
            high_results = await self.execute_batch(high_priority, max_concurrent=2)
            results.extend(high_results)
        # Execute normal priority tasks in parallel
        if normal_priority:
            normal_results = await self.execute_batch(normal_priority, max_concurrent=4)
            results.extend(normal_results)
        # Execute low priority tasks in background; their results are not returned
        if low_priority:
            background = asyncio.create_task(
                self.execute_batch(low_priority, max_concurrent=2))
            self.background_tasks.add(background)
            background.add_done_callback(self.background_tasks.discard)
        return results

    async def execute_batch(self, tasks, max_concurrent=3):
        """Batch execution"""
        semaphore = asyncio.Semaphore(max_concurrent)

        async def execute_single(task):
            async with semaphore:
                start_time = time.time()
                try:
                    result = await self.execute_claude_task(task)
                    execution_time = time.time() - start_time
                    self.logger.info(f"Task {task['id']} completed in {execution_time:.2f}s")
                    return {
                        'task_id': task['id'],
                        'result': result,
                        'execution_time': execution_time,
                        'status': 'success'
                    }
                except Exception as e:
                    execution_time = time.time() - start_time
                    self.logger.error(f"Task {task['id']} failed: {e}")
                    return {
                        'task_id': task['id'],
                        'error': str(e),
                        'execution_time': execution_time,
                        'status': 'failed'
                    }

        # Schedule all tasks; the semaphore caps how many run at once
        results = await asyncio.gather(*[execute_single(task) for task in tasks])
        return results

    async def execute_claude_task(self, task):
        """Claude Code task execution"""
        # Implementation: Claude Code API call
        await asyncio.sleep(1)  # Mock execution time
        return f"Task {task['id']} completed"


# Performance monitoring
class PerformanceMonitor:
    def __init__(self):
        self.metrics = {
            'execution_times': [],
            'success_count': 0,
            'success_rate': 0,
            'throughput': 0
        }

    def record_execution(self, execution_time, success):
        """Record execution metrics"""
        self.metrics['execution_times'].append(execution_time)
        # Count this run only if it succeeded, then update the running rate
        if success:
            self.metrics['success_count'] += 1
        total_executions = len(self.metrics['execution_times'])
        self.metrics['success_rate'] = self.metrics['success_count'] / total_executions

    def get_performance_summary(self):
        """Performance summary"""
        if not self.metrics['execution_times']:
            return "No execution data available"
        avg_time = sum(self.metrics['execution_times']) / len(self.metrics['execution_times'])
        return {
            'average_execution_time': avg_time,
            'success_rate': self.metrics['success_rate'],
            'total_executions': len(self.metrics['execution_times']),
            'throughput_per_minute': 60 / avg_time if avg_time > 0 else 0
        }
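The batch executor relies on a semaphore to cap concurrency while `asyncio.gather` schedules every task up front. The standalone sketch below isolates that pattern and measures the peak number of jobs running at once; `run_bounded` and `demo` are hypothetical names, and the 10 ms sleep merely stands in for real work.

```python
import asyncio


async def run_bounded(jobs, max_concurrent):
    """Await zero-argument coroutine functions, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        async with semaphore:
            return await job()

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo():
    running = 0
    peak = 0

    async def job(index):
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)  # stand-in for real work
        running -= 1
        return index

    jobs = [lambda i=i: job(i) for i in range(5)]
    results = await run_bounded(jobs, max_concurrent=2)
    return results, peak
```

Running `asyncio.run(demo())` returns the results in submission order together with the observed peak, which should never exceed the `max_concurrent` limit.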
Troubleshooting
Common Issues and Solutions
Authentication Error
Symptoms: Authentication failed error
Solutions:
# Reset API key
claude auth login
# Check permissions
claude auth status
# Verify configuration file
cat ~/.claude/config.json
Auto-approval Not Working
Symptoms: Manual approval required every time
Solutions:
# Check auto-approval configuration
claude config get auto_approval
# Enable configuration
claude config set auto_approval.enabled true
# Allow file writes
claude config set auto_approval.rules.file_operations.write true
cron Execution Error
Symptoms: Claude Code execution fails via cron
Solutions:
# Environment variable configuration
export PATH="/usr/local/bin:$PATH"
export ANTHROPIC_API_KEY="your-api-key"
# Create cron wrapper script
cat > /opt/claude-agent/cron-wrapper.sh << 'EOF'
#!/bin/bash
source /etc/environment
source ~/.bashrc
exec /usr/local/bin/claude "$@"
EOF
Debug Information Collection
# debug/collect-info.sh
#!/bin/bash
echo "Collecting Claude Agent debug information"
DEBUG_DIR="/tmp/claude-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$DEBUG_DIR"
# System information
echo "Collecting system information..."
{
echo "=== System Information ==="
uname -a
echo ""
echo "=== Claude Code Version ==="
claude --version
echo ""
echo "=== Authentication Status ==="
claude auth status
echo ""
echo "=== Configuration ==="
claude config list
echo ""
echo "=== Environment Variables ==="
env | grep -i claude
echo ""
echo "=== Process Information ==="
ps aux | grep claude
echo ""
echo "=== Recent Logs ==="
tail -100 /var/log/claude-agent/*.log 2>/dev/null || echo "No logs found"
} > "$DEBUG_DIR/system-info.txt"
# Configuration file collection
echo "Collecting configuration files..."
cp -r ~/.claude "$DEBUG_DIR/config" 2>/dev/null || echo "Config not found"
# Never ship the API key in a support bundle
rm -f "$DEBUG_DIR/config/api_key"
# Log file collection
echo "Collecting log files..."
cp -r /var/log/claude-agent "$DEBUG_DIR/logs" 2>/dev/null || echo "Logs not found"
# Create archive
tar -czf "${DEBUG_DIR}.tar.gz" -C "/tmp" "$(basename "$DEBUG_DIR")"
echo "✅ Debug information collected: ${DEBUG_DIR}.tar.gz"
echo "Please send this file to the support team"
Success Metrics and KPIs
Operational Success Metrics
| Metric | Target Value | Measurement Method |
|---|---|---|
| Execution Success Rate | > 95% | GitHub Actions success rate |
| Average Execution Time | < 5 minutes | Log analysis |
| System Uptime | > 99.5% | Monitoring dashboard |
| Error Detection Time | < 1 minute | Alert response time |
| Recovery Time | < 10 minutes | Incident management |
Business Efficiency Improvements
- Before Automation: Manual work 8 hours/day
- After Automation: Monitoring work 30 minutes/day
- Efficiency Rate: 93.75% (7.5 of the original 8 hours saved per day)
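The 93.75% figure is the share of the original daily effort that automation removes: (8 h - 0.5 h) / 8 h. A small helper makes the calculation reusable for other before/after figures; `time_saved_fraction` is a hypothetical name.

```python
def time_saved_fraction(before_hours: float, after_hours: float) -> float:
    """Share of the original effort removed by automation."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return (before_hours - after_hours) / before_hours


# 8 hours of manual work reduced to 30 minutes of monitoring.
print(f"{time_saved_fraction(8, 0.5):.2%}")  # prints 93.75%
```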
Implementation Completion Checklist
- Auto-approval setup completed
- GitHub Actions CI/CD integration
- cron automation configuration
- Error handling implementation
- Monitoring dashboard construction
- Security audit execution
- Performance optimization
- Production operation launch
By fully implementing this guide, you can build an enterprise-level AI agent automation system.