Fix: GitHub Actions Job Timeout — Workflow Cancelled or Stuck After 6 Hours

FixDevs ·

Quick Answer

How to fix GitHub Actions timeout issues — job-level and step-level timeouts, stuck processes, self-hosted runner timeouts, debugging hanging jobs, and timeout best practices.

The Problem

A GitHub Actions workflow is cancelled with a timeout error:

Error: The operation was canceled.
The job running on runner GitHub Actions X has exceeded the maximum execution time of 360 minutes.

Or a specific step hangs indefinitely without producing output:

Run npm test
  npm test
  shell: /usr/bin/bash -e {0}
... (no output for 45 minutes, then cancelled)

Or a self-hosted runner job times out much sooner than expected:

Error: The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.

Or a workflow that previously finished in 5 minutes now takes hours.

Why This Happens

GitHub Actions has hard limits and several common reasons for jobs getting stuck:

  • Default timeout is 360 minutes: GitHub’s default job timeout is 6 hours. Without an explicit timeout-minutes, a hung job keeps its runner busy and keeps consuming runner minutes until that limit is reached.
  • Interactive prompts waiting for input: a CLI tool that asks “Are you sure? (y/n)” can wait forever in CI because nobody is there to answer. Common culprits: npm/pip install prompts, git push with no credentials, database migration confirmations.
  • Test suite with open handles: Jest, Mocha, and similar test runners sometimes don’t exit because open database connections, HTTP servers, or timers keep the process alive.
  • Deadlock in parallel jobs: two jobs waiting on each other, or a job waiting for a resource that was never created.
  • Self-hosted runner crashes or disconnects: if the runner process dies mid-job, the job can stay in an in-progress state until GitHub notices the lost connection, so it looks like a hang rather than a failure.

Fix 1: Set Explicit Timeouts

Set timeout-minutes at the job or step level to fail fast instead of hanging:

jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 15  # Job-level: cancel if not done in 15 minutes

    steps:
      - uses: actions/checkout@v4

      - name: Install dependencies
        timeout-minutes: 5   # Step-level: fail this step after 5 minutes
        run: npm ci

      - name: Run tests
        timeout-minutes: 10
        run: npm test

      - name: Deploy
        timeout-minutes: 5
        run: ./deploy.sh

Recommended timeout strategy:

# Set a job timeout of roughly 1.5-2x the slowest normal run
# Set step timeouts for known slow steps
# Fail fast: even a 10-minute timeout on a 3-minute job beats the 6-hour default

jobs:
  test:
    timeout-minutes: 20  # Normally finishes in 8-12 minutes

  deploy:
    timeout-minutes: 10  # Deployment should take < 5 minutes
    needs: test
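
One way to keep these numbers honest is to derive them from measured run times instead of guessing. The sketch below is a hypothetical helper, not part of GitHub Actions: suggest_timeout takes the slowest normal run in minutes plus a headroom percentage that you choose, and rounds the result up to the next multiple of 5.

```bash
# Hypothetical helper: suggest a timeout-minutes value from measured durations.
# $1 = slowest normal run in minutes, $2 = headroom in percent (an assumption to tune).
suggest_timeout() {
  local slowest=$1 headroom=$2
  # Integer ceiling of slowest * (100 + headroom) / 100
  local raw=$(( (slowest * (100 + headroom) + 99) / 100 ))
  # Round up to the next multiple of 5 so the values stay readable
  echo $(( (raw + 4) / 5 * 5 ))
}

suggest_timeout 12 50   # test job whose slowest normal run is 12 minutes
suggest_timeout 5 100   # deploy job that should finish in under 5 minutes
```

With these inputs the helper prints 20 and 10, the same values used in the example above. The headroom figures are illustrative, so adjust them to how much your run times vary.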

Fix 2: Fix Test Suites That Don’t Exit

A common cause of hanging CI jobs is a test runner that doesn’t exit after the tests complete:

# Jest — use --forceExit as a fallback (but fix the root cause)
- name: Run tests
  run: npx jest --forceExit

# Or set a timeout on the test run itself
- name: Run tests
  run: timeout 300 npm test  # Linux: kill after 300 seconds
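
When a command is wrapped in timeout, its exit status tells you whether the tests failed or simply never finished. The sketch below assumes GNU coreutils timeout, which returns 124 when it had to stop the command. run_with_deadline is a hypothetical wrapper name, not an existing tool.

```bash
# Hypothetical wrapper around GNU coreutils `timeout`.
# Usage: run_with_deadline SECONDS COMMAND [ARGS...]
run_with_deadline() {
  local seconds=$1 status=0
  shift
  timeout "$seconds" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    # 124 is the status `timeout` returns when it had to stop the command
    echo "HANG: '$*' did not exit within ${seconds}s" >&2
  fi
  return "$status"
}

# Example: a command that outlives its 1-second deadline
run_with_deadline 1 sleep 3 || echo "exit status: $?"
```

A command that exits with 124 on its own is indistinguishable from a timeout here, so treat the HANG message as a strong hint rather than proof.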

Root cause fix — close open handles:

// jest.config.js
module.exports = {
  // While debugging: report the handles that keep Jest alive so you can fix them.
  // This slows the run, so turn it off again once the leaks are fixed.
  detectOpenHandles: true,

  // Stopgap: force Jest to exit if you can't fix all handles immediately
  forceExit: true,
};

// Common open handle fixes in test files
afterAll(async () => {
  await db.close();          // Close DB connection after all tests
  // http.Server#close takes a callback rather than returning a promise
  await new Promise((resolve) => server.close(resolve));
  clearTimeout(myTimer);     // Clear pending timers
  subscription.unsubscribe(); // Unsubscribe from event streams
});

Mocha:

# --exit forces Mocha to quit after tests, even with open handles
# (quote the glob so Mocha expands ** itself instead of the shell)
npx mocha --exit "tests/**/*.test.js"

# --timeout sets the per-test timeout in milliseconds
npx mocha --timeout 10000 "tests/**/*.test.js"

Fix 3: Prevent Interactive Prompts in CI

Tools that ask for confirmation will hang forever in CI. Always pass non-interactive flags:

steps:
  - name: Install Python packages
    run: pip install --no-input -r requirements.txt
    # --no-input: never prompt for confirmation

  - name: Install npm packages
    run: npm ci
    # npm ci is already non-interactive; npm install may prompt

  - name: Run database migrations
    run: |
      # Django — no interactive prompts
      python manage.py migrate --no-input

      # Rails
      RAILS_ENV=production bundle exec rails db:migrate

      # Flyway
      flyway migrate -url=$DB_URL -user=$DB_USER -password=$DB_PASS

  - name: Deploy with Terraform
    env:
      TF_INPUT: "false"  # Disable all Terraform interactive prompts
    run: terraform apply -auto-approve

  - name: Docker build
    run: |
      # docker build does not prompt; --no-cache only forces a full rebuild
      # (slower, but rules out a stale layer cache)
      docker build --no-cache -t myapp .
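
Not every tool has a documented non-interactive flag. A generic fallback, assuming the tool reads its answer from standard input, is to redirect stdin from /dev/null so the prompt sees end-of-file and fails immediately, or to pipe the answer in. ask_before_deploy below is a hypothetical stand-in for such a tool.

```bash
# Hypothetical stand-in for a CLI that asks for confirmation on stdin.
ask_before_deploy() {
  local answer
  printf 'Are you sure? (y/n) '
  if ! read -r answer; then
    echo "no input available, aborting"
    return 1
  fi
  if [ "$answer" = "y" ]; then
    echo "deploying"
  else
    echo "cancelled"
    return 1
  fi
}

# stdin closed: read hits end-of-file, so the prompt fails fast instead of waiting
ask_before_deploy </dev/null || echo "exit status: $?"

# Answer supplied up front through a pipe
printf 'y\n' | ask_before_deploy
```

Tools that open the terminal device directly instead of reading stdin are not affected by this redirect, so prefer the tool's own flag when one exists.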

Detect hanging jobs by checking for output:

- name: Run tests with heartbeat
  run: |
    # Run tests in background, print progress every 30s
    npm test &
    TEST_PID=$!
    while kill -0 $TEST_PID 2>/dev/null; do
      echo "Still running at $(date)..."
      sleep 30
    done
    wait $TEST_PID

Fix 4: Debug Hanging Jobs with tmate

Connect to a running GitHub Actions runner to debug interactively:

- name: Setup tmate session for debugging
  uses: mxschmitt/action-tmate@v3
  if: ${{ failure() }}  # Only open session on failure
  timeout-minutes: 15   # Step-level key: close the session after 15 minutes
  with:
    limit-access-to-actor: true  # Only the user who triggered the run can connect

Or add a conditional debug step:

- name: Debug
  uses: mxschmitt/action-tmate@v3
  if: ${{ github.event_name == 'workflow_dispatch' && inputs.debug_enabled }}
  timeout-minutes: 30

Log more context before the hanging step:

- name: Pre-test diagnostics
  run: |
    echo "=== System Info ==="
    free -h
    df -h
    ps aux | head -20

    echo "=== Network ==="
    netstat -tlnp 2>/dev/null || ss -tlnp

    echo "=== Environment ==="
    env | grep -v -i -E "(TOKEN|SECRET|PASSWORD|KEY)"
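
Diagnostics printed before a step only show the starting state. A rough alternative is to print the process list once the command has been running longer than expected. watch_step is a hypothetical helper built on the same kill -0 polling as the heartbeat loop, and the ps options assume a procps-style ps.

```bash
# Hypothetical helper: run a command and, if it is still alive after
# WARN_AFTER seconds, print the process list once so the log shows what is stuck.
# Usage: watch_step WARN_AFTER COMMAND [ARGS...]
watch_step() {
  local warn_after=$1 elapsed=0 status=0 pid
  shift
  "$@" &
  pid=$!
  while kill -0 "$pid" 2>/dev/null; do
    sleep 1
    elapsed=$(( elapsed + 1 ))
    if [ "$elapsed" -eq "$warn_after" ] && kill -0 "$pid" 2>/dev/null; then
      echo "=== still running after ${elapsed}s ==="
      ps -e -o pid,ppid,etime,args || true
    fi
  done
  wait "$pid" || status=$?   # propagate the command's own exit status
  return "$status"
}

# Example: warn after 1 second while a 2-second command runs
watch_step 1 sleep 2
```

In a workflow the threshold would sit a little below the step's timeout-minutes, so the dump lands in the log before the step is cancelled.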

Fix 5: Configure Self-Hosted Runner Timeouts

Self-hosted runners have different timeout behavior and common failure modes:

# Increase timeout for self-hosted runners (they're often slower)
jobs:
  build:
    runs-on: self-hosted
    timeout-minutes: 60  # Self-hosted runners can be slower

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          # Shallow clone for faster checkout on self-hosted
          fetch-depth: 1

Runner configuration for reliability:

# Run the runner as a service so it restarts automatically
# (instead of running it manually)

# On Linux with systemd
cd ~/actions-runner
sudo ./svc.sh install
sudo ./svc.sh start

# Check runner status
sudo ./svc.sh status

# The runner will restart automatically if it crashes

Clean up stale files between runs on self-hosted runners:

jobs:
  build:
    runs-on: self-hosted
    steps:
      - name: Clean workspace
        run: |
          # Remove files from previous runs that may cause issues
          git clean -fdx || true
          docker system prune -f || true

Fix 6: Optimize Slow Workflows

If the workflow finishes but takes too long, optimize before hitting timeouts:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Cache dependencies to avoid re-downloading
      - name: Cache node modules
        uses: actions/cache@v4
        with:
          path: ~/.npm
          key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-

      - run: npm ci

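      # Run tests in parallel (Jest does this by default; --maxWorkers caps the worker count)
      - name: Run tests
        run: npx jest --maxWorkers=4

  # Split long test suites across parallel jobs
  # (each job starts on a fresh runner, so it needs its own checkout and install;
  #  Jest 30 renamed --testPathPattern to --testPathPatterns)
  test-unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --testPathPattern="unit"

  test-integration:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test -- --testPathPattern="integration"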
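
Splitting by path pattern works when the suites have natural names. When they do not, a rough alternative is to deal the test files out round-robin across N jobs. shard_files below is a hypothetical helper, and the file names are made up. Jest 28 and later also ship a built-in --shard option that covers the same need.

```bash
# Hypothetical helper: read file names on stdin and print the ones that
# belong to shard INDEX (0-based) out of TOTAL shards, dealt round-robin.
shard_files() {
  local index=$1 total=$2 n=0 file
  while IFS= read -r file; do
    if [ $(( n % total )) -eq "$index" ]; then
      printf '%s\n' "$file"
    fi
    n=$(( n + 1 ))
  done
}

# Example with made-up file names: shard 0 of 2 gets every other file
printf '%s\n' a.test.js b.test.js c.test.js d.test.js e.test.js | shard_files 0 2
```

The example prints a.test.js, c.test.js, and e.test.js. Each parallel job would pass its own index and hand the resulting list to the test runner. Sort the input first so every job sees the files in the same order.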

Conditional steps — skip expensive work when not needed:

- name: Build Docker image
  # Only build on main branch pushes — skip for PRs
  if: github.ref == 'refs/heads/main'
  run: docker build -t myapp .

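# The push payload exposed to workflows does not include per-commit file lists,
# so github.event.head_commit.modified cannot be used to detect changes.
# This sketch uses the third-party dorny/paths-filter action instead.
- name: Detect source changes
  id: changes
  uses: dorny/paths-filter@v3
  with:
    filters: |
      src:
        - 'src/**'

- name: Run E2E tests
  # Only run if files under src/ changed
  if: steps.changes.outputs.src == 'true'
  run: npx playwright test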

Still Not Working?

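Job is queued but never starts: check whether you’ve hit your concurrent job limit. On the GitHub Free plan, GitHub-hosted runners are limited to 20 concurrent jobs. If all runners are busy, new jobs wait in the queue. For self-hosted runners, also confirm that a runner with matching labels is online. Check the Actions tab for queued jobs.

The operation was canceled immediately: find out what triggered the cancellation. A matrix job is cancelled when a sibling job fails and strategy.fail-fast is left at its default of true, and a concurrency group with cancel-in-progress: true cancels the older run when a newer one starts. A missing secret or environment variable usually produces a failure rather than a cancellation, and a job whose needs: dependency failed is skipped, so also check the job that ran before.

Step timeout doesn’t stop the process cleanly: when a step times out, the runner signals the process to stop and kills it if it is still alive after a short grace period. Child processes that the step spawned are not always cleaned up. To control the sequence yourself, wrap the command in timeout with --kill-after: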

# Send SIGTERM at 300s, SIGKILL at 330s if still running
timeout --kill-after=30s 300s npm test
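
To see why --kill-after matters, the sketch below runs a deliberately stubborn command that ignores SIGTERM. It assumes GNU coreutils timeout and uses short durations so it finishes in a couple of seconds. The stubborn command is a made-up stand-in for a test process with a signal handler that never exits.

```bash
# Hypothetical stubborn command: ignores SIGTERM and keeps sleeping.
stubborn='trap "" TERM; sleep 10'

start=$(date +%s)
status=0
# SIGTERM after 1s is ignored; --kill-after sends SIGKILL 1s later.
timeout --kill-after=1s 1s sh -c "$stubborn" || status=$?
elapsed=$(( $(date +%s) - start ))

echo "exit status ${status} after about ${elapsed}s"
```

Without --kill-after, the same command would run for its full 10 seconds because the only signal sent is the one it ignores. GNU coreutils documents exit status 124 for a plain timeout and 137 when SIGKILL had to be sent. Other implementations may report different values.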

Timeouts in reusable workflows (workflow_call): timeout-minutes is a job-level and step-level key, and there is no workflow-level timeout. A job that calls a reusable workflow with uses: cannot set timeout-minutes itself, so set timeout-minutes on each job inside the called workflow.

For related GitHub Actions issues, see Fix: GitHub Actions Process Completed Exit Code 1 and Fix: GitHub Actions Cache Not Working.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
