systemd Production Service Patterns: Practical Configuration and Troubleshooting

Target Audience

Intermediate administrators operating systemd services in production environments (with basic systemctl command knowledge)

Key Points

Configure dependencies explicitly to prevent common startup-order errors in production
Keep services stable with memory and CPU limits
Set up automatic recovery and log management for failures

Why This Problem is Critical Now

In production, service outages, memory leaks, and dependency errors directly affect business continuity. Writing a basic unit file is the easy part; many administrators still struggle with operational problems such as startup ordering between services, failures caused by resource exhaustion, and log management.

Solution Steps Overview

Step	Content	Success Criteria
1	Explicit dependency configuration	Order guarantee with other services
2	Resource limitation implementation	Memory/CPU usage control
3	Auto-recovery and log management setup	Automatic recovery behavior during failures

Step 1: Explicit Dependency Configuration

The following unit uses a typical web application + database pattern:

[Unit]
Description=Web Application Service
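# Ordering only: start after the network is up and after PostgreSQL
After=network-online.target postgresql.service
# Soft dependency: pull in network-online.target, but start even if it fails
Wants=network-online.target
# Hard dependency: do not start without PostgreSQL
Requires=postgresql.service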

[Service]
Type=forking
User=webapp
Group=webapp
ExecStart=/opt/webapp/bin/start.sh
ExecStop=/opt/webapp/bin/stop.sh
# /var/run is a legacy symlink to /run; systemd warns about PID files under it
PIDFile=/run/webapp/webapp.pid
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target

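Key Configuration Explanation:

- After=: Ordering only. This unit starts after the listed units when both are being started; it does not start them by itself.
- Wants=: Soft dependency. systemd tries to start the listed units, and this unit starts even if they fail.
- Requires=: Hard dependency. If a listed unit cannot be started (and After= is also set for it), this unit is not started; if a listed unit is explicitly stopped or restarted, this unit is stopped or restarted as well.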
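A common slip with these directives is listing a unit under Wants= or Requires= without also ordering after it, in which case both units start in parallel. The sketch below is a hypothetical helper (the function name `missing_ordering` is not part of systemd) that scans a unit file and prints every unit that is pulled in but has no matching After= entry. It only understands single-line assignments, so treat it as a quick lint rather than a full parser.

```shell
# Hypothetical lint helper, not a systemd tool.
# Prints units named in Wants=/Requires= that are missing from After=.
missing_ordering() {
    # $1 = path to a unit file
    awk -F= '
        $1 == "Wants" || $1 == "Requires" {
            n = split($2, names, " ")
            for (i = 1; i <= n; i++) pulled[names[i]] = 1
        }
        $1 == "After" {
            n = split($2, names, " ")
            for (i = 1; i <= n; i++) ordered[names[i]] = 1
        }
        END {
            for (name in pulled)
                if (!(name in ordered)) print name
        }
    ' "$1" | sort
}
```

For example, `missing_ordering /etc/systemd/system/webapp.service` (the path is assumed) prints nothing when every pulled-in unit is also ordered.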

Step 2: Resource Limitation Implementation

Limits that contain memory leaks and runaway CPU usage:

[Unit]
Description=Resource-Limited Web Service
After=network.target

[Service]
Type=simple
User=webapp
ExecStart=/opt/webapp/app
Restart=always
RestartSec=5

# Resource limits (MemoryMax= replaces the deprecated MemoryLimit= on cgroup v2)
MemoryMax=512M
CPUQuota=50%
TasksMax=100

# Security hardening
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ReadWritePaths=/var/log/webapp /var/lib/webapp

[Install]
WantedBy=multi-user.target

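Effect: When the service's cgroup exceeds 512 MB and memory cannot be reclaimed, the kernel OOM killer terminates processes in that cgroup only, and Restart=always brings the service back after 5 seconds. A leak in this service no longer destabilizes the rest of the system.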
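One way to apply such a limit to an already installed unit is a drop-in file rather than editing the unit itself. The sketch below assumes cgroup v2 and pairs the hard limit MemoryMax= (the successor of MemoryLimit=) with a softer MemoryHigh= throttle at 80% of it. The function name `write_memory_dropin`, the file name `memory.conf`, and the 80% ratio are illustrative choices, not recommendations from systemd.

```shell
# Hypothetical helper: write a drop-in with a soft and a hard memory limit.
write_memory_dropin() {
    # $1 = drop-in directory, e.g. /etc/systemd/system/<unit>.service.d
    # $2 = hard limit in MiB
    dropin_dir=$1
    hard_mib=$2
    soft_mib=$(( hard_mib * 80 / 100 ))   # throttle before the hard limit is hit

    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/memory.conf" <<EOF
[Service]
# Above MemoryHigh= the kernel throttles the service and reclaims memory
MemoryHigh=${soft_mib}M
# Above MemoryMax= the OOM killer is invoked inside the service's cgroup
MemoryMax=${hard_mib}M
EOF
}
```

Run as root, for example `write_memory_dropin /etc/systemd/system/webapp.service.d 512` (the unit name webapp.service is assumed), then run `systemctl daemon-reload` and restart the service. `systemctl show -p MemoryMax webapp.service` should then report the new limit in bytes.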

Step 3: Auto-Recovery and Log Management Setup

Automatic recovery during failures and structured log output:

[Unit]
Description=Self-Healing API Service
After=network.target
# Allow at most 5 starts within 300 seconds, then stop retrying.
# These rate-limit settings belong in [Unit], not [Service].
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Type=simple
User=apiuser
ExecStart=/usr/local/bin/api-server
StandardOutput=journal
StandardError=journal
SyslogIdentifier=api-service

# Auto-recovery configuration
Restart=on-failure
RestartSec=10

# Environment variables for log level control
Environment=LOG_LEVEL=INFO
Environment=LOG_FORMAT=json

[Install]
WantedBy=multi-user.target
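Whether the start limit can ever trigger depends on how RestartSec= relates to the limit window. As a rough model, assuming the process crashes immediately on every start, start attempt number StartLimitBurst + 1 happens about StartLimitBurst × RestartSec seconds after the first one. If that is longer than StartLimitIntervalSec, the limit never trips and systemd keeps restarting the service indefinitely. The helper below (a hypothetical name, and a simplification that ignores how long the process runs before failing) checks this arithmetic.

```shell
# Hypothetical helper: does the start limit trip for an instantly crashing service?
start_limit_trips() {
    # $1 = RestartSec (s), $2 = StartLimitIntervalSec (s), $3 = StartLimitBurst (count)
    elapsed=$(( $1 * $3 ))   # time from first start to attempt number burst+1
    if [ "$elapsed" -lt "$2" ]; then
        echo "trips: attempt $(( $3 + 1 )) falls ${elapsed}s into a ${2}s window"
    else
        echo "never trips: ${3} restarts already take ${elapsed}s"
        return 1
    fi
}

# Values from the unit above
start_limit_trips 10 300 5   # prints "trips: attempt 6 falls 50s into a 300s window"
```

With RestartSec=90 instead, five restarts would take 450 seconds, longer than the 300-second window, so the service would never be put into the failed state by the start limit.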

Log Check Commands:

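# Real-time monitoring (-u takes the unit name; use -t api-service to match SyslogIdentifier= instead)
journalctl -u api-service -f

# Structured log search: full JSON entries whose message matches ERROR
# (--grep requires systemd 237 or later)
journalctl -u api-service --grep=ERROR -o json-pretty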
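When the application itself emits JSON lines (as LOG_FORMAT=json suggests), the severity lives inside the message text rather than in the journal's priority field, so filtering has to look at the payload. The sketch below assumes each log line is a JSON object with a `level` key whose error value is `ERROR`; that key name and the function name `app_errors` are assumptions about the application, not something journald defines.

```shell
# Hypothetical helper: show the application's own ERROR records for a unit.
app_errors() {
    # $1 = unit name, $2 = optional start time understood by --since
    # -o cat prints only the raw message, i.e. the JSON line the app wrote
    journalctl -u "$1" --since "${2:-1 hour ago}" --no-pager -o cat |
        grep -E '"level" *: *"ERROR"'
}
```

A call such as `app_errors api-service today` would list today's error records; the function returns a non-zero status when there are none, which makes it usable in health-check scripts.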

Common Pitfalls and Solutions

Symptom	Cause	Immediate Solution
Service startup failure	Dependency service not started	Add the dependency to `Wants=` or `Requires=` as well as `After=`
Killed due to memory shortage	Resource limits not set	Add a `MemoryMax=` setting (`MemoryLimit=` on cgroup v1)
Logs not found	Output redirected to a file or `null` instead of the journal	Set `StandardOutput=journal` and `StandardError=journal`

Advanced Operational Patterns (For Large-Scale Environments)

### Multi-Instance Management

# /etc/systemd/system/webapp@.service
[Unit]
Description=Web App Instance %i
After=network.target

[Service]
Type=simple
User=webapp
ExecStart=/opt/webapp/start.sh %i
Environment=INSTANCE_ID=%i
PrivateTmp=yes

[Install]
WantedBy=multi-user.target

**Usage**:

# Start individual instances
systemctl start webapp@1.service webapp@2.service
# Enable instances 1 to 3 at boot (brace expansion requires bash)
systemctl enable webapp@{1..3}.service
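Brace expansion such as {1..3} is a bash feature; it does not work in plain sh scripts or with variable bounds. A small helper can build the instance names instead. The function name `instance_units` is hypothetical.

```shell
# Hypothetical helper: print template instance names for a numeric range.
instance_units() {
    # $1 = template name (without "@"), $2 = first instance, $3 = last instance
    i=$2
    names=""
    while [ "$i" -le "$3" ]; do
        names="$names $1@$i.service"
        i=$(( i + 1 ))
    done
    echo $names   # unquoted on purpose: drops the leading space
}

instance_units webapp 1 3   # prints "webapp@1.service webapp@2.service webapp@3.service"
```

The output can be passed straight to systemctl, for example `systemctl enable --now $(instance_units webapp 1 3)`.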

### Conditional Startup (Environment-specific Configuration)

[Unit]
Description=Production Only Service
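# Skipped (not failed) unless the flag file exists
ConditionPathExists=/etc/production.flag
# Skipped when "rescue" appears on the kernel command line
ConditionKernelCommandLine=!rescue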

[Service]
ExecStart=/opt/service/prod-service
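A unit whose condition is not met is skipped quietly rather than marked failed, which can be confusing during deployment. systemd records the outcome in the unit's ConditionResult property after a start attempt; the wrapper below (a hypothetical function name) turns it into an exit status for use in deploy scripts.

```shell
# Hypothetical helper: succeed only if the unit's Condition*= checks passed
# on its most recent start attempt.
conditions_met() {
    # $1 = unit name
    result=$(systemctl show -p ConditionResult --value "$1") || return 2
    [ "$result" = "yes" ]
}
```

For example, after `systemctl start prod.service` (the unit name is assumed), `conditions_met prod.service || echo "skipped: condition not met"` makes the silent skip visible.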