systemd Production Service Patterns: Practical Configuration and Troubleshooting

Target Audience

  • Intermediate administrators operating systemd services in production environments (with basic systemctl command knowledge)

Key Points

  1. Configuration that prevents common dependency errors in production
  2. Stable operation through memory and CPU limits
  3. Automatic recovery and log management patterns for handling failures

Why This Problem is Critical Now

In production environments, service outages, memory leaks, and dependency errors directly impact business continuity. Most administrators know how to create a basic systemd service, but many struggle with real operational challenges such as startup order issues between services, failures caused by resource exhaustion, and complex log management.

Solution Steps Overview

| Step | Content | Success Criteria |
|------|---------|------------------|
| 1 | Explicit dependency configuration | Order guarantee with other services |
| 2 | Resource limitation implementation | Memory/CPU usage control |
| 3 | Auto-recovery and log management setup | Automatic recovery behavior during failures |

Step 1: Explicit Dependency Configuration

An implementation using a typical web application and database pattern:

[Unit]
Description=Web Application Service
After=network.target postgresql.service
Wants=postgresql.service
Requires=network.target

[Service]
Type=forking
User=webapp
Group=webapp
ExecStart=/opt/webapp/bin/start.sh
ExecStop=/opt/webapp/bin/stop.sh
PIDFile=/run/webapp/webapp.pid
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target

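Key Configuration Explanation:

  • After=: Ordering only. This service starts after the listed units when they are started together; After= does not pull those units in by itself.
  • Wants=: Weak dependency. systemd tries to start the listed unit, and this service starts even if that unit fails.
  • Requires=: Hard dependency. If the listed unit fails to start (and After= is also set for it), this service is not started. This service is also stopped when the listed unit is explicitly stopped or restarted.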
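Because ordering (After=) and requirement (Wants=/Requires=) are configured independently, it is easy to declare one without the other. The following is a minimal sketch of a lint-style check, not a systemd tool: the function name missing_ordering is hypothetical, and it only reads plain Key=value lines from a single file (no drop-ins, no empty-assignment resets).

```shell
# Hypothetical helper: print every unit named in Wants=/Requires= that is
# missing from After= in the given unit file, one name per line.
missing_ordering() {
    awk -F= '
        # Collect every unit listed on any After= line
        /^After=/            { n = split($2, a, " "); for (i = 1; i <= n; i++) after[a[i]] = 1 }
        # Collect every unit listed on any Wants= or Requires= line
        /^(Wants|Requires)=/ { n = split($2, d, " "); for (i = 1; i <= n; i++) deps[d[i]] = 1 }
        # Report dependencies that have no ordering entry
        END { for (u in deps) if (!(u in after)) print u }
    ' "$1" | sort
}
```

For example, missing_ordering /etc/systemd/system/webapp.service (an assumed path) prints nothing for the unit above, since both dependencies also appear in After=. On a live system, systemctl list-dependencies and systemd-analyze verify give the authoritative view.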

Step 2: Resource Limitation Implementation

Limits that keep a memory leak or runaway CPU usage in one service from affecting the rest of the system:

[Unit]
Description=Resource-Limited Web Service
After=network.target

[Service]
Type=simple
User=webapp
ExecStart=/opt/webapp/app
Restart=always
RestartSec=5

# Resource limits (MemoryMax= replaces the deprecated MemoryLimit= on cgroup v2)
MemoryMax=512M
CPUQuota=50%
TasksMax=100

# Security hardening
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ReadWritePaths=/var/log/webapp /var/lib/webapp

[Install]
WantedBy=multi-user.target

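Effect: When the service's control group exceeds the 512M memory limit and the kernel cannot reclaim enough memory, the OOM killer terminates processes in the service, and Restart=always brings it back. A leak in this one service therefore cannot exhaust memory for the whole system. CPUQuota=50% caps the service at half of one CPU core, and TasksMax=100 caps the combined number of processes and threads.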
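systemd parses the K, M, G, and T suffixes on memory limits with base 1024, while systemctl show reports the effective limit in bytes. The sketch below converts a configured value to bytes so the two can be compared. The function name to_bytes is hypothetical, and it handles only integer values with an optional single-letter suffix (not "infinity" or percentages).

```shell
# Hypothetical helper: convert a systemd-style size such as 512M to bytes.
# systemd interprets K, M, G and T as powers of 1024.
to_bytes() {
    local value=$1 number suffix
    number=${value%[KMGT]}        # strip one trailing suffix letter, if any
    suffix=${value#"$number"}     # whatever was stripped (may be empty)
    case $suffix in
        K)  echo $((number * 1024)) ;;
        M)  echo $((number * 1024 * 1024)) ;;
        G)  echo $((number * 1024 * 1024 * 1024)) ;;
        T)  echo $((number * 1024 * 1024 * 1024 * 1024)) ;;
        '') echo "$number" ;;
    esac
}

to_bytes 512M   # prints 536870912
```

On a running system, the result can be compared with the output of systemctl show webapp.service -p MemoryMax --value, where webapp.service is an assumed unit name.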

Step 3: Auto-Recovery and Log Management Setup

Automatic recovery during failures and structured log output:

[Unit]
Description=Self-Healing API Service
After=network.target
# Start rate limiting belongs in [Unit]: give up after 5 starts within 300 seconds
StartLimitIntervalSec=300
StartLimitBurst=5

[Service]
Type=simple
User=apiuser
ExecStart=/usr/local/bin/api-server
StandardOutput=journal
StandardError=journal
SyslogIdentifier=api-service

# Auto-recovery configuration
Restart=on-failure
RestartSec=10

# Environment variables for log level control
Environment=LOG_LEVEL=INFO
Environment=LOG_FORMAT=json

[Install]
WantedBy=multi-user.target
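The restart delay and the start limit interact: with RestartSec=10 and StartLimitBurst=5, a service that crashes immediately uses up its five starts in under a minute, well inside the 300-second interval, so systemd stops retrying and leaves the unit in the failed state. If RestartSec is too long relative to the interval, the limit never trips and the crash loop continues indefinitely. The sketch below is a rough approximation of that arithmetic, not systemd's exact rate-limiting logic; the function name start_limit_can_trip is hypothetical.

```shell
# Rough check: can the start limit ever trip for a service that crashes
# right after each start?
# Arguments: RestartSec, StartLimitBurst, StartLimitIntervalSec (integers).
start_limit_can_trip() {
    local restart_sec=$1 burst=$2 interval=$3
    # The start after the last allowed one happens no earlier than
    # burst * restart_sec seconds after the first start; it is refused
    # only if that moment still falls inside the interval.
    [ $((restart_sec * burst)) -lt "$interval" ]
}

if start_limit_can_trip 10 5 300; then
    echo "limit can trip: a crash loop ends in the failed state"
else
    echo "limit never trips: the service keeps restarting"
fi
```

Once the limit has been hit, systemctl reset-failed followed by a manual start clears the state after the underlying problem is fixed.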

Log Check Commands:

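# Real-time monitoring
# (-u takes the unit name, assumed here to be api-service.service;
#  use -t api-service to match the SyslogIdentifier= value instead)
journalctl -u api-service -f

# Structured log search: -o json prints one JSON object per line,
# so grep returns whole entries rather than isolated lines
journalctl -u api-service -o json | grep ERROR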

Common Pitfalls and Solutions

| Symptom | Cause | Immediate Solution |
|---------|-------|--------------------|
| Service startup failure | Dependency service not started | Add the dependency to After= and to Wants= or Requires= |
| Killed due to memory shortage | Resource limits not set | Add a MemoryMax= setting (formerly MemoryLimit=) |
| Logs not found | Output goes to stdout and is not sent to the journal | Add StandardOutput=journal |

Advanced Operational Patterns (For Large-Scale Environments)

Multi-Instance Management

# /etc/systemd/system/webapp@.service
[Unit]
Description=Web App Instance %i
After=network.target

[Service]
Type=simple
User=webapp
ExecStart=/opt/webapp/start.sh %i
Environment=INSTANCE_ID=%i
PrivateTmp=yes

[Install]
WantedBy=multi-user.target

Usage:

# Start individual instances
systemctl start webapp@1.service webapp@2.service
systemctl enable webapp@{1..3}.service
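
Brace expansion such as {1..3} only works with literal numbers in the shell, so it cannot be used when the instance count comes from a variable. A minimal sketch of a helper that builds the instance unit names instead, where the function name instance_units is hypothetical:

```shell
# Hypothetical helper: print "<template>@1.service ... <template>@N.service"
# on a single line, separated by spaces.
instance_units() {
    local template=$1 count=$2 i=1 units=""
    while [ "$i" -le "$count" ]; do
        # Append the next name, adding a space only between entries
        units="${units}${units:+ }${template}@${i}.service"
        i=$((i + 1))
    done
    printf '%s\n' "$units"
}

instance_units webapp 3   # prints webapp@1.service webapp@2.service webapp@3.service
```

The output can be passed to systemctl unquoted so that each name becomes its own argument, for example systemctl enable --now $(instance_units webapp "$count").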

Conditional Startup (Environment-specific Configuration)

[Unit]
Description=Production Only Service
ConditionPathExists=/etc/production.flag
ConditionKernelCommandLine=!rescue

[Service]
ExecStart=/opt/service/prod-service
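
When a condition is not met, systemd skips the unit without marking it failed, which can look like the service silently did nothing. To reason about ConditionKernelCommandLine=!rescue, it helps to see what a single-word match means: the word must appear on the kernel command line either as a standalone option or as the left side of an assignment. The sketch below approximates that rule. The function name cmdline_has_option is hypothetical, and this is an illustration rather than systemd's own parser.

```shell
# Hypothetical helper: succeed if WORD appears in CMDLINE as a standalone
# option ("rescue") or as the key of an assignment ("rescue=...").
cmdline_has_option() {
    local word=$1 cmdline=$2 token
    # Unquoted on purpose: split the command line on whitespace
    for token in $cmdline; do
        case $token in
            "$word"|"$word"=*) return 0 ;;
        esac
    done
    return 1
}

# ConditionKernelCommandLine=!rescue holds when the word is absent
if cmdline_has_option rescue "root=/dev/sda1 quiet"; then
    echo "rescue present: the unit would be skipped"
else
    echo "rescue absent: the condition holds"
fi
```

On a real host the second argument would be "$(cat /proc/cmdline)". Recent systemd versions also provide systemd-analyze condition for evaluating condition expressions directly.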