Fix: Docker Compose Healthcheck Not Working — depends_on Not Waiting or Always Unhealthy
Quick Answer
How to fix Docker Compose healthcheck issues — depends_on condition service_healthy, healthcheck command syntax, start_period, custom health scripts, and debugging unhealthy containers.
The Problem
A service starts before its dependency is ready, despite depends_on being configured:
services:
  app:
    depends_on:
      - db  # App starts before DB is accepting connections
  db:
    image: postgres:16
Or depends_on with condition: service_healthy causes the dependent service to never start:
services:
  app:
    depends_on:
      db:
        condition: service_healthy  # App never starts: db can never report healthy
  db:
    image: postgres:16
    # No healthcheck defined, so the condition can never be satisfied
Or a healthcheck is defined but the container shows (unhealthy) despite the service being fine:
docker ps
# CONTAINER   STATUS
# my-db       Up 2 minutes (unhealthy)
Why This Happens
By default, Docker's depends_on only waits for the container to start, not for the service inside it to be ready. Common failures:
- depends_on without condition: the default condition: service_started means "wait for the container to start," not "wait for the database to accept connections." Your app may start while Postgres is still initializing.
- No healthcheck defined: condition: service_healthy requires a health check on the dependency, either a healthcheck block in the Compose file or a HEALTHCHECK instruction in the image. Without one, the dependency can never report healthy, so the dependent service never starts. Recent Compose versions typically fail with an error saying the container has no healthcheck configured rather than waiting.
- Wrong healthcheck command: if the health command uses a binary that is not available in the container, it fails immediately with exit code 1 or 127.
- start_period too short: Postgres, MySQL, and other databases take several seconds (sometimes 30+) to initialize on first boot. If checks fail after start_period has elapsed but before initialization finishes, those failures count toward retries and the container is marked unhealthy.
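To make the start_period failure above concrete, here is a minimal sketch that models how a health status evolves from a series of probe results. It is a simplified model of the documented rules (failures inside the grace period are ignored until the first success, and a run of `retries` consecutive failures marks the container unhealthy), not Docker's actual implementation. The function name `healthTimeline` and the probe shape `{ at, ok }` are hypothetical, chosen only for illustration.

```javascript
// Simplified model of Docker health status transitions (illustrative only).
// Each probe is { at: secondsSinceContainerStart, ok: boolean }.
function healthTimeline(probes, { retries = 3, startPeriod = 0 } = {}) {
  let status = "starting";
  let failingStreak = 0;
  let started = false; // true after the first successful probe
  const timeline = [];

  for (const probe of probes) {
    if (probe.ok) {
      started = true;
      failingStreak = 0;
      status = "healthy";
    } else if (!started && probe.at < startPeriod) {
      // Failure inside the grace period: ignored, status unchanged.
    } else {
      failingStreak += 1;
      if (failingStreak >= retries) status = "unhealthy";
    }
    timeline.push(status);
  }
  return timeline;
}

// A database that needs about 12 seconds to initialize, probed every 5 seconds.
const probes = [
  { at: 5, ok: false },
  { at: 10, ok: false },
  { at: 15, ok: true },
];

console.log(healthTimeline(probes, { retries: 2, startPeriod: 0 }));
// → [ 'starting', 'unhealthy', 'healthy' ]
console.log(healthTimeline(probes, { retries: 2, startPeriod: 15 }));
// → [ 'starting', 'starting', 'healthy' ]
```

In the first run the container passes through unhealthy before recovering, which is enough to make a dependent service with condition: service_healthy fail to start. With a grace period that covers initialization, the same probes never leave the starting state until the first success.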
Fix 1: Add a Healthcheck to the Dependency
condition: service_healthy only works when the dependency has a healthcheck:
services:
  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy  # Wait until db is healthy
      redis:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://user:pass@db:5432/mydb
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 5s       # Check every 5 seconds
      timeout: 5s        # Fail if no response within 5 seconds
      retries: 5         # Mark unhealthy after 5 consecutive failures
      start_period: 10s  # Don't count failures during first 10s (init time)
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 5s
Fix 2: Healthcheck Commands for Common Services
Correct health commands for popular services:
# PostgreSQL
# $$ defers expansion to the container's shell; a single $ is interpolated by Compose from the host environment or .env file
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-postgres} -d $${POSTGRES_DB:-postgres}"]
  interval: 5s
  timeout: 5s
  retries: 5
  start_period: 15s
# MySQL / MariaDB
# CMD runs without a shell, so ${MYSQL_ROOT_PASSWORD} here is filled in by Compose from the host environment or .env file
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-p${MYSQL_ROOT_PASSWORD}"]
  interval: 5s
  timeout: 5s
  retries: 5
  start_period: 30s  # MySQL takes longer to initialize
# MongoDB
healthcheck:
  test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
  interval: 5s
  timeout: 5s
  retries: 5
  start_period: 20s
# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 3s
  retries: 3
# RabbitMQ
healthcheck:
  test: ["CMD", "rabbitmq-diagnostics", "ping"]
  interval: 10s
  timeout: 10s
  retries: 5
  start_period: 30s
# Elasticsearch
healthcheck:
  test: ["CMD-SHELL", "curl -fs http://localhost:9200/_cluster/health | grep -vq '\"status\":\"red\"'"]
  interval: 10s
  timeout: 10s
  retries: 5
  start_period: 60s
# Custom HTTP service
healthcheck:
  test: ["CMD-SHELL", "curl -fs http://localhost:8080/health || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 3
  start_period: 20s
CMD vs CMD-SHELL syntax:
# CMD — exec form, no shell, each word is a separate array element
test: ["CMD", "pg_isready", "-U", "postgres"]
# CMD-SHELL — runs via /bin/sh -c, supports shell features
test: ["CMD-SHELL", "pg_isready -U postgres && echo healthy"]
# String form — equivalent to CMD-SHELL
test: "pg_isready -U postgres"
Note: Prefer CMD over CMD-SHELL when possible. It avoids shell injection and is more reliable. Use CMD-SHELL only when you need shell features like &&, pipes, or variable expansion.
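The three forms above differ only in what ends up being executed. The following sketch, with a hypothetical helper name `healthcheckArgv`, shows the mapping under the assumption that the container's default shell is /bin/sh. It is an illustration of the rule, not code taken from Docker or Compose, and it deliberately accepts only a single command string after CMD-SHELL.

```javascript
// Illustrative mapping from a healthcheck `test` value to the argv that runs.
// Returns null when the health check is disabled with ["NONE"].
function healthcheckArgv(test) {
  // String form behaves like CMD-SHELL: the whole string goes to the shell.
  if (typeof test === "string") return ["/bin/sh", "-c", test];

  if (!Array.isArray(test) || test.length === 0) {
    throw new TypeError("test must be a string or a non-empty array");
  }

  const [kind, ...rest] = test;
  switch (kind) {
    case "CMD":
      // Exec form: no shell, so &&, pipes, and $VAR are passed literally.
      if (rest.length === 0) throw new TypeError("CMD needs a command");
      return rest;
    case "CMD-SHELL":
      if (rest.length !== 1) throw new TypeError("CMD-SHELL takes one command string");
      return ["/bin/sh", "-c", rest[0]];
    case "NONE":
      return null;
    default:
      throw new TypeError(`unknown healthcheck form: ${kind}`);
  }
}

console.log(healthcheckArgv(["CMD", "pg_isready", "-U", "postgres"]));
// → [ 'pg_isready', '-U', 'postgres' ]
console.log(healthcheckArgv(["CMD-SHELL", "pg_isready -U postgres && echo healthy"]));
// → [ '/bin/sh', '-c', 'pg_isready -U postgres && echo healthy' ]
```

Seeing the argv makes the trade-off visible: with CMD, a string such as `pg_isready -U postgres && echo healthy` would be treated as one binary name, while with CMD-SHELL the image must actually contain /bin/sh.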
Fix 3: Tune start_period for Slow Services
start_period prevents failures during initialization from counting toward retries:
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U postgres"]
  interval: 5s        # How often to run the check
  timeout: 5s         # How long to wait for each check
  retries: 5          # Consecutive failures after start_period before marking unhealthy
  start_period: 30s   # Grace period: failures here don't count
When to increase start_period:
# Postgres with large schemas / initial data — 30s+
# MySQL with InnoDB recovery — 60s+
# Elasticsearch with large indices — 60-120s
# Kafka/Zookeeper cluster — 60s+
# Services with slow JVM startup (Spring Boot) — 30-60s
# For development, longer start_period reduces false unhealthy states
start_period: 60s
# For production CI, shorter start_period catches real problems faster
start_period: 10s
Fix 4: Debug Unhealthy Containers
When a container shows (unhealthy), inspect the health check output:
# See health status and last check output
docker inspect --format='{{json .State.Health}}' my-db | python3 -m json.tool
# Example output:
# {
# "Status": "unhealthy",
# "FailingStreak": 3,
# "Log": [
# {
# "Start": "2026-03-26T10:00:00Z",
# "End": "2026-03-26T10:00:05Z",
# "ExitCode": 1,
# "Output": "pg_isready: error: could not connect to server: FATAL: password authentication failed for user \"postgres\""
# }
# ]
# }
# Or use docker events to watch health transitions
docker events --filter "type=container" --filter "event=health_status"
Run the health check command manually inside the container:
# Connect to the container and run the health check manually
docker exec my-db pg_isready -U postgres -d mydb
# Run the exact CMD from the healthcheck
docker exec my-db sh -c "pg_isready -U postgres -d mydb"
# Check if the command exists in the container
docker exec my-db which pg_isready
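docker exec my-db which redis-cli
Common unhealthy causes by exit code:
Exit 0: healthy
Exit 1: health check failed (service not ready)
Exit 127: command not found (binary doesn't exist in container)
Exit -1: check exceeded its timeout and was stopped by Docker, typically logged with the output "Health check exceeded timeout" (124 appears only if the command itself runs under the timeout utility)
Fix 5: Healthcheck in Custom Application Images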
Add healthcheck to your own Dockerfiles:
# Dockerfile: add HEALTHCHECK instruction
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production flag in current npm
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
# Install curl for the health check (alpine doesn't have it by default)
RUN apk add --no-cache curl
HEALTHCHECK --interval=10s --timeout=5s --start-period=15s --retries=3 \
  CMD curl -fs http://localhost:3000/health || exit 1
CMD ["node", "server.js"]
# Python / FastAPI
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
HEALTHCHECK --interval=10s --timeout=5s --start-period=20s --retries=3 \
  CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
Implement a /health endpoint in your app:
// Express health endpoint
app.get('/health', (req, res) => {
  // Check critical dependencies
  const healthy = db.isConnected() && redis.isReady();
  if (healthy) {
    res.json({ status: 'ok' });
  } else {
    res.status(503).json({ status: 'unhealthy', reason: 'dependency unavailable' });
  }
});
Fix 6: Full Example with All Conditions
A production-ready docker-compose.yml with proper health checks:
# The top-level version key is obsolete in Compose v2 and is omitted here
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://user:pass@db:5432/appdb
      REDIS_URL: redis://redis:6379
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "curl -fs http://localhost:3000/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 20s
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: appdb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d appdb"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 15s
    restart: unless-stopped
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 3
      start_period: 5s
    restart: unless-stopped
volumes:
  postgres_data:
  redis_data:
Still Not Working?
depends_on is ignored by docker compose up --no-deps — the --no-deps flag skips dependency resolution. Remove it if you need health check waiting.
Service marked healthy but app still fails — depends_on: condition: service_healthy only ensures the container’s health check passes. Your app may still need a retry loop for the actual connection, since TCP acceptance and application readiness aren’t always the same. Add retry logic in your app’s startup code.
Healthcheck passes but container restarts — the process exiting (crash) is separate from the health check. A container can be healthy but still restart if the main process crashes. Check docker logs for the crash reason.
Health check works locally but fails in CI — CI environments often have less CPU/memory, causing services to start slower. Increase start_period and retries in CI, or use a separate docker-compose.ci.yml with longer timeouts.
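For the "service marked healthy but app still fails" case above, startup retry logic can be quite small. The following is a minimal sketch of a retry loop with exponential backoff, assuming your database client exposes some async connect call. The helper name `connectWithRetry` and its options are hypothetical and not part of any library.

```javascript
// Retry an async connect function with exponential backoff (illustrative sketch).
// `connect` is any function returning a promise; it is an assumption of this example.
async function connectWithRetry(
  connect,
  { attempts = 5, baseDelayMs = 500, maxDelayMs = 8000 } = {}
) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt === attempts) break; // no point sleeping after the final try
      // Delay doubles each attempt (500ms, 1s, 2s, ...) up to maxDelayMs.
      const delay = Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A call might look like `const client = await connectWithRetry(() => pool.connect())`, where `pool` stands in for whatever client your app uses. Keeping the retry in the app complements the health check rather than replacing it: Compose orders startup, while the app tolerates brief gaps after the dependency is reported healthy.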
For related Docker issues, see Fix: Docker Container Keeps Restarting and Fix: Docker Compose depends_on Not Working.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.