Fix: AWS Lambda Cold Start Timeout and Slow First Invocation

Q: How do I fix "AWS Lambda Cold Start Timeout and Slow First Invocation"?

How to fix AWS Lambda cold start timeouts and slow first invocations — provisioned concurrency, reducing package size, connection reuse, and language-specific optimizations.

The Error

Your Lambda function works fine on subsequent calls but the first invocation after a period of inactivity fails with a timeout or is significantly slow:

Task timed out after 3.00 seconds

Or in CloudWatch Logs:

REPORT RequestId: abc-123  Duration: 28432.45 ms  Billed Duration: 28433 ms
Init Duration: 26891.23 ms

The Init Duration shows the cold start time: in this case, initialization took almost 27 seconds before the handler even ran.

Or users report the first API call after quiet periods takes 5–30 seconds while subsequent calls are fast (under 100ms).

Why This Happens

Lambda functions are not always running. When a function has not been invoked recently, AWS deallocates the execution environment. The next invocation triggers a cold start:

1. AWS provisions a new execution environment (VM/container).
2. Downloads and extracts the deployment package.
3. Starts the language runtime (JVM, .NET CLR, Python interpreter, Node.js).
4. Executes any initialization code outside the handler function.
5. Then runs your handler.

Steps 1–4 are the cold start. This can take anywhere from 100ms (small Node.js function) to 30+ seconds (large Java function with Spring Boot).

The CloudWatch REPORT line distinguishes the two phases. Init Duration is everything before your handler runs. Duration is the handler itself. On warm invocations the REPORT line omits Init Duration entirely, which is why filtering on @initDuration in CloudWatch Logs Insights is a reliable way to count cold starts. The function timeout you configure covers the invoke phase. The init phase has its own 10-second limit for on-demand functions; if initialization runs past it, Lambda restarts initialization as part of the first invocation, where it does count against the function timeout. Upstream callers see the sum of both phases, so their own timeouts can expire even when the handler itself finishes in time. This is why a cold start can fail a request that would have succeeded warm: the budget is gone before your code starts.
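As a minimal sketch of reading those fields, the snippet below pulls the millisecond values out of a REPORT line and flags a cold start by the presence of Init Duration. The second request ID and its timings are made up for illustration, and real REPORT lines carry more fields (memory size, max memory used) that this regex ignores.

```python
import re

# Only the duration fields discussed here; other REPORT fields are ignored.
_FIELD = re.compile(r"(Init Duration|Billed Duration|Duration): ([\d.]+) ms")

def parse_report(line: str) -> dict:
    """Return the millisecond fields found in a Lambda REPORT log line."""
    fields = {name: float(value) for name, value in _FIELD.findall(line)}
    # Warm invocations omit Init Duration, so its presence marks a cold start.
    fields["cold_start"] = "Init Duration" in fields
    return fields

cold = parse_report(
    "REPORT RequestId: abc-123  Duration: 28432.45 ms  "
    "Billed Duration: 28433 ms  Init Duration: 26891.23 ms"
)
# Hypothetical warm invocation for comparison.
warm = parse_report("REPORT RequestId: def-456  Duration: 42.10 ms  Billed Duration: 43 ms")

print(cold["cold_start"], cold["Init Duration"])
print(warm["cold_start"], warm["Duration"])
```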

Cold starts are more likely, or hurt more, when:

The function has not been invoked recently (idle timeout varies by load).
Traffic spikes cause Lambda to provision new instances in parallel.
The deployment package is large.
The runtime is slow to initialize (Java and .NET typically start more slowly than Python or Node.js).
The function creates heavy resources (DB connections, SDK clients) inside the handler, so the first request pays for them on top of the init cost.

In Production: Incident Lens

Cold starts in production are usually a traffic-shape incident rather than a code defect. Steady traffic keeps a handful of execution environments warm and almost no one sees a cold start. A sudden burst — a marketing email blast, a cron-triggered fan-out, an autoscaling event in an upstream service — forces Lambda to provision dozens or hundreds of new environments in parallel, and every first request to each new environment pays the init cost. If your p99 latency SLO is tight (sub-second), this single class of event can burn an entire month’s error budget in five minutes.

How it surfaces. The classic signature is a sharp p99 latency spike that does not move p50 much. p50 stays clean because most requests still hit warm environments; p99 explodes because the unlucky few hit fresh ones. The CloudWatch metric Duration looks bimodal. IteratorAge (for stream-triggered Lambdas) or upstream queue depth grows because the first batch on each new environment is slow. API Gateway, ALB, or AppSync responses show 504s or 408s for the cold callers while the warm callers are fine. Customer support tickets describe “the first click after lunch took forever” — that is the autoscale-down then autoscale-up cycle, in human language.
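To see why p50 barely moves while p99 explodes, here is a small sketch using a nearest-rank percentile. The latency numbers are invented for illustration (97 warm requests and 3 cold ones), not measurements from a real function.

```python
def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceil(n * pct / 100)
    return ordered[int(rank) - 1]

# Illustrative only: 97 warm requests at 80 ms, 3 cold requests at 6 s.
durations_ms = [80.0] * 97 + [6000.0] * 3

p50 = percentile(durations_ms, 50)
p99 = percentile(durations_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")
```

With only 3% of requests cold, the median is untouched and the tail is dominated entirely by the cold environments, which matches the bimodal Duration graph described above.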

Monitoring signals to wire up. Lambda does not publish init time as a standard CloudWatch metric, so derive it from the REPORT log lines: alarm on the p99 of Init Duration and on the cold-start count from CloudWatch Logs Insights (the query in Fix 7 returns this). Watch the Throttles and ConcurrentExecutions metrics together, along with TooManyRequestsException errors seen by callers; when ConcurrentExecutions sits close to the function’s reserved concurrency, bursts are throttled and the retries tend to land on fresh environments. For Java functions with SnapStart, watch Restore Duration in the REPORT line as well; a slow restore adds to cold-start latency in the same way a slow init does.

Recovery sequence. During the incident, the only same-shift fix is to add Provisioned Concurrency to the affected alias. Provisioned environments are pre-initialized, so the next wave of traffic hits warm pools immediately. Set the value to roughly your current p95 concurrent executions and increase from there. If you cannot deploy quickly, the next-best lever is to raise memory (which raises CPU proportionally) on the function — this shortens both init and execution time and often costs less per request because billable ms drops faster than memory cost rises. After traffic returns to normal, remove the emergency Provisioned Concurrency or you will pay for idle capacity.

Postmortem preventives. Move all SDK clients, database pools, and large imports out of the handler (Fix 1). Audit deployment package size, remove unused dependencies, and move shared ones into layers (Fix 3). For Java, enable SnapStart (Fix 4) and add restore hooks for any stateful resources. Wire scheduled warm pings only for low-traffic functions where Provisioned Concurrency is too expensive; they are a partial mitigation, not a replacement. Finally, treat cold-start budget as an SLI: track the percentage of invocations that include Init Duration and the p99 of Init Duration, and put both on your service dashboard.
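One way to turn the cold-start SLI into a number is sketched below. It assumes you have already extracted one Init Duration value per invocation (None for warm ones); the 1% target and the sample window are placeholders, not recommendations.

```python
def cold_start_sli(init_durations_ms, max_cold_pct=1.0):
    """Summarize one window of invocations.

    init_durations_ms holds one entry per invocation: the Init Duration in
    milliseconds for a cold start, or None for a warm invocation.
    """
    total = len(init_durations_ms)
    cold = [d for d in init_durations_ms if d is not None]
    cold_pct = 100.0 * len(cold) / total if total else 0.0
    return {
        "cold_start_pct": cold_pct,
        "worst_init_ms": max(cold, default=0.0),
        "within_budget": cold_pct <= max_cold_pct,  # max_cold_pct is an assumed target
    }

# Hypothetical window: 198 warm invocations and 2 cold ones.
window = [None] * 198 + [850.0, 1320.0]
sli = cold_start_sli(window)
print(sli)
```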

Fix 1: Move Initialization Code Outside the Handler

Code outside the handler runs during cold start initialization and is reused across warm invocations. Database connections and SDK clients initialized inside the handler are created on every invocation:

Broken — initializes on every invocation:

// handler.js
exports.handler = async (event) => {
  // This runs on EVERY invocation — creates new connection each time
  const { Pool } = require("pg");
  const pool = new Pool({ connectionString: process.env.DATABASE_URL });

  const result = await pool.query("SELECT * FROM users WHERE id = $1", [event.userId]);
  return result.rows[0];
};

Fixed — initialize once, reuse across invocations:

// handler.js
const { Pool } = require("pg");

// Runs ONCE during cold start — reused by all warm invocations
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 1, // Lambda: keep pool small — each instance handles one request at a time
});

exports.handler = async (event) => {
  // Connection already established — just query
  const result = await pool.query("SELECT * FROM users WHERE id = $1", [event.userId]);
  return result.rows[0];
};

Python example:

import boto3
import os

# Initialize outside handler — runs once per cold start
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["TABLE_NAME"])

def handler(event, context):
    # table is already initialized — no cold start overhead here
    response = table.get_item(Key={"id": event["id"]})
    return response.get("Item")

Pro Tip: Lambda execution environments are reused for subsequent invocations (warm starts). Any state initialized at the module level persists between warm invocations of the same environment. This is why connection reuse works — and also why you must be careful with mutable global state.
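The sketch below illustrates both halves of that tip: module-level state survives between warm invocations, which is useful for counters and clients but risky for per-request data. The event shape (a userId key) follows the earlier examples; the response fields are made up for illustration.

```python
import time

# Module scope: runs once per execution environment, during the cold start.
ENVIRONMENT_STARTED_AT = time.time()
invocation_count = 0  # mutable global: persists across warm invocations

def handler(event, context):
    global invocation_count
    invocation_count += 1
    # Keep per-request data in locals. Stashing it in a module-level variable
    # would leak it into the next caller's invocation.
    user_id = event["userId"]
    return {
        "userId": user_id,
        "coldStart": invocation_count == 1,
        "invocationsInThisEnvironment": invocation_count,
    }
```

Calling handler twice in the same process mimics one cold invocation followed by one warm one: the second call reports coldStart as False and a count of 2.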

Fix 2: Use Provisioned Concurrency for Latency-Critical Functions

Provisioned Concurrency keeps a specified number of execution environments initialized and ready at all times — eliminating cold starts entirely for those instances:

# Set provisioned concurrency for a function
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier production \
  --provisioned-concurrent-executions 10

With a published version (recommended):

# Publish a version
aws lambda publish-version --function-name my-api-handler

# Apply provisioned concurrency to the version
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier 1 \
  --provisioned-concurrent-executions 5

Cost: Provisioned concurrency costs money even when the function is not invoked — you pay for the pre-initialized environments. Calculate the trade-off:

Without provisioned concurrency: cold starts for some requests, lower base cost.
With provisioned concurrency: no cold starts, higher base cost.

For latency-sensitive APIs (user-facing, payment processing), provisioned concurrency is often worth the cost.
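A rough way to put a number on the idle side of that trade-off is sketched here. The price per GB-second is a placeholder, not a quoted AWS rate; look up the current provisioned concurrency price for your region and architecture before relying on the result. Invocation charges come on top and are not modelled.

```python
def provisioned_monthly_cost(instances, memory_gb, price_per_gb_second, hours=730):
    """Approximate monthly charge for keeping `instances` environments provisioned."""
    return instances * memory_gb * price_per_gb_second * hours * 3600

# Placeholder price: replace with the published rate for your region.
PLACEHOLDER_PRICE = 0.000004

cost = provisioned_monthly_cost(instances=10, memory_gb=0.5,
                                price_per_gb_second=PLACEHOLDER_PRICE)
print(f"about ${cost:.2f} per month before invocation charges")
```

Compare that figure with what a failed or slow first request costs you; for an internal batch job the answer is usually different than for a checkout endpoint.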

Schedule provisioned concurrency with Application Auto Scaling:

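# Register the alias as a scalable target (needed once before scheduling)
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-api-handler:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 1 --max-capacity 10
# Scale up at 08:00 UTC every day; add a matching scale-down action for the evening
aws application-autoscaling put-scheduled-action \
  --service-namespace lambda \
  --resource-id function:my-api-handler:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --scheduled-action-name scale-up-morning \
  --schedule "cron(0 8 * * ? *)" \
  --scalable-target-action MinCapacity=10,MaxCapacity=10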

Fix 3: Reduce Package Size

Lambda downloads and extracts the deployment package on every cold start. Smaller packages = faster cold starts:

Check your current package size:

# After building your deployment package
du -sh deployment.zip
ls -lh deployment.zip

For Node.js — remove dev dependencies and unused packages:

# Only include production dependencies
npm ci --omit=dev

# Find the largest installed packages (or run webpack-bundle-analyzer on a bundled build)
du -sh node_modules/* | sort -rh | head -20

Use Lambda Layers for shared dependencies:

# Create a layer with large shared libraries (e.g., AWS SDK, express)
aws lambda publish-layer-version \
  --layer-name node-dependencies \
  --zip-file fileb://layer.zip \
  --compatible-runtimes nodejs20.x

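Layers keep the function’s own package small and let several functions share one copy of a dependency, which speeds up deployments. Their contents are still loaded into the execution environment on a cold start and count toward the 250 MB unzipped size limit, so trim what goes into the layer as well.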

For Python — use slim base images and avoid heavy packages:

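# pandas, numpy, and scipy are large; consider alternatives or layers
# Install Linux wheels that match the Lambda runtime (needed when building on macOS or Windows;
# the Python version here is an example, use the one your function runs on)
pip install numpy --target ./package \
  --platform manylinux2014_x86_64 --only-binary=:all: --python-version 3.12

# Check what the installed packages weigh
du -sh ./package/* | sort -rh | head -20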

For Java — use GraalVM native image or Quarkus:

Spring Boot cold starts can exceed 10 seconds on Lambda. Alternatives:

Quarkus with native compilation: sub-100ms cold starts.
Micronaut or Helidon: faster initialization than Spring.
GraalVM native image: compile to native binary, ~10-50ms cold start.
AWS Lambda SnapStart: snapshots the initialized state and restores it. It launched as a Java-only feature; AWS has since added support for Python and .NET runtimes.

Fix 4: Enable Lambda SnapStart (Java)

AWS Lambda SnapStart (available on the Java 11 and later managed runtimes) takes a snapshot of the initialized execution environment when you publish a version and restores it on subsequent cold starts:

# SAM template
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java21
    SnapStart:
      ApplyOn: PublishedVersions
    AutoPublishAlias: live

SnapStart reduces Java cold starts from seconds to milliseconds. It works by:

Initializing the function once.
Taking a memory snapshot.
Restoring from snapshot on cold start (much faster than re-initializing).

Note: SnapStart requires handling restore hooks for resources that need reconnection after snapshot restore (database connections, random seeds):

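// SnapStart hooks use the CRaC API (dependency: io.github.crac:org-crac)
import org.crac.Context;
import org.crac.Core;
import org.crac.Resource;
import org.springframework.stereotype.Component;
import com.zaxxer.hikari.HikariDataSource;
import com.zaxxer.hikari.HikariPoolMXBean;

@Component
public class SnapStartHook implements Resource {
    // Assumes a HikariCP pool; adapt the hook bodies to your own connection pool
    private final HikariDataSource dataSource;

    public SnapStartHook(HikariDataSource dataSource) {
        this.dataSource = dataSource;
        Core.getGlobalContext().register(this); // without this the hooks never run
    }

    @Override
    public void beforeCheckpoint(Context<? extends Resource> context) {
        evictConnections(); // do not freeze open connections into the snapshot
    }

    @Override
    public void afterRestore(Context<? extends Resource> context) {
        evictConnections(); // drop anything stale; the pool reconnects on demand
    }

    private void evictConnections() {
        HikariPoolMXBean pool = dataSource.getHikariPoolMXBean();
        if (pool != null) pool.softEvictConnections(); // null until the pool has started
    }
}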

Fix 5: Keep Functions Warm with Scheduled Pings

For functions that cannot use provisioned concurrency, a scheduled ping every few minutes keeps at least one instance warm:

CloudWatch Events (EventBridge) scheduled rule:

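# Ping the function every 5 minutes
aws events put-rule \
  --name lambda-warmer \
  --schedule-expression "rate(5 minutes)"

# Allow EventBridge to invoke the function (without this the rule fires but the ping never arrives)
aws lambda add-permission \
  --function-name my-function \
  --statement-id lambda-warmer-invoke \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:us-east-1:123456789012:rule/lambda-warmer

aws events put-targets \
  --rule lambda-warmer \
  --targets "Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:my-function"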

Handle warm pings in the function:

exports.handler = async (event) => {
  // Skip actual work for warm-up pings
  if (event.source === "aws.events" && event["detail-type"] === "Scheduled Event") {
    console.log("Warm ping — skipping");
    return { statusCode: 200, body: "warm" };
  }

  // Normal handler logic
  return await processRequest(event);
};

Limitation: Pings only keep ONE instance warm. If you have concurrent requests, new instances still cold-start. Provisioned concurrency is the correct solution for predictable concurrency requirements.
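For Python functions, the same ping check might look like the sketch below. The process_request function is a hypothetical stand-in for your real handler logic; the two event fields are the ones EventBridge sets on scheduled events.

```python
def is_warm_ping(event) -> bool:
    """True for the scheduled EventBridge event used as a keep-warm ping."""
    return (
        isinstance(event, dict)
        and event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )

def process_request(event):
    # Hypothetical stand-in for the real business logic.
    return {"statusCode": 200, "body": f"hello {event.get('name', 'world')}"}

def handler(event, context):
    # Return early on pings so they stay cheap and never touch downstream systems.
    if is_warm_ping(event):
        return {"statusCode": 200, "body": "warm"}
    return process_request(event)
```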

Fix 6: Optimize Runtime-Specific Cold Start Performance

Node.js:

// Use ES modules carefully — CJS loads faster in some cases
// Avoid dynamic requires inside the handler
// Use esbuild or webpack to bundle and tree-shake

// esbuild example
// esbuild src/handler.ts --bundle --platform=node --target=node20 --outfile=dist/handler.js

Python:

# Lazy-load heavy modules inside the handler for rarely-used code paths
def handler(event, context):
    if event.get("action") == "generate_report":
        import pandas as pd  # Only loaded when needed
        # ...

Go and Rust: These compile to native binaries with minimal cold starts (~10ms). If cold starts are critical and you have flexibility in language choice, Go or Rust Lambda functions have near-zero initialization overhead.

Fix 7: Measure and Monitor Cold Starts

Find cold starts in CloudWatch Logs Insights:

fields @timestamp, @duration, @initDuration, @billedDuration
| filter ispresent(@initDuration)
| sort @timestamp desc
| limit 100

@initDuration is only present in cold start invocations. This query shows all cold starts with their initialization time.
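To run a cold-start query on a schedule rather than by hand, something like the following sketch could work. It takes a CloudWatch Logs client as a parameter (in practice boto3.client("logs")), uses the start_query and get_query_results calls, and aggregates with stats instead of listing rows. The log group name in the usage comment is an assumption.

```python
import time

QUERY = (
    "filter ispresent(@initDuration) "
    "| stats count() as coldStarts, pct(@initDuration, 99) as p99InitMs"
)

def cold_start_stats(logs_client, log_group, lookback_seconds=3600, poll_seconds=1.0):
    """Run the cold-start query and return the single stats row as a dict of strings."""
    now = int(time.time())
    query_id = logs_client.start_query(
        logGroupName=log_group,
        startTime=now - lookback_seconds,
        endTime=now,
        queryString=QUERY,
    )["queryId"]

    # Logs Insights queries are asynchronous: poll until the query leaves the queue.
    while True:
        response = logs_client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    rows = response["results"]
    return {cell["field"]: cell["value"] for cell in rows[0]} if rows else {}

# Usage (needs boto3 and AWS credentials; the log group name is an assumption):
# import boto3
# print(cold_start_stats(boto3.client("logs"), "/aws/lambda/my-function"))
```

Passing the client in keeps the function easy to test with a stub and avoids creating a new client on every call.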

Set up an alarm for high cold start frequency:

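# AWS/Lambda has no built-in InitDuration metric, so count cold starts from the REPORT log lines.
# The metric name and namespace below are examples; pick your own.
aws logs put-metric-filter \
  --log-group-name /aws/lambda/my-function \
  --filter-name cold-starts \
  --filter-pattern '"Init Duration"' \
  --metric-transformations metricName=ColdStarts,metricNamespace=Custom/Lambda,metricValue=1

aws cloudwatch put-metric-alarm \
  --alarm-name lambda-cold-starts \
  --metric-name ColdStarts \
  --namespace Custom/Lambda \
  --statistic Sum \
  --period 300 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123:my-topic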

Still Not Working?

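Check VPC configuration. Lambda functions inside a VPC used to have much longer cold starts because an ENI (Elastic Network Interface) was attached during each cold start. Since AWS reworked VPC networking in 2019–2020, network interfaces are created when the function is created or its VPC settings change, so the remaining cold-start penalty is small. If a VPC function is still slow to start, look at what it does during init: DNS lookups, connections to RDS or ElastiCache, or calls to AWS APIs that hang because the subnet has no NAT gateway or VPC endpoint. Only put Lambda in a VPC if you need access to VPC resources (RDS in private subnet, ElastiCache).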

Check memory allocation. Lambda allocates CPU proportional to memory. A function with 128MB gets minimal CPU — increasing memory to 512MB or 1024MB often reduces both cold start and execution time, sometimes resulting in lower overall cost (less billable duration despite more memory cost per ms).

# Use AWS Lambda Power Tuning to find the optimal memory setting
# https://github.com/alexcasalboni/aws-lambda-power-tuning
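The arithmetic behind "more memory can cost less" is sketched below in GB-seconds, the unit Lambda bills compute in. The two timings are assumed for illustration, not benchmarks; measure your own function (for example with Power Tuning) before changing the setting.

```python
def gb_seconds(memory_mb: int, billed_ms: float) -> float:
    """Billable compute for one invocation: memory in GB times duration in seconds."""
    return (memory_mb / 1024) * (billed_ms / 1000)

# Assumed timings for the same CPU-bound function at two memory settings.
small = gb_seconds(128, 4000)   # 128 MB, 4.0 s
large = gb_seconds(1024, 400)   # 1024 MB, 0.4 s

print(f"128 MB: {small} GB-s, 1024 MB: {large} GB-s")
```

Here eight times the memory finishes ten times faster, so the larger setting bills fewer GB-seconds. When the speedup is smaller than the memory increase, the larger setting costs more and the decision becomes a latency-versus-cost trade-off.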

Check for initialization errors. If your cold start code throws an error, Lambda retries the initialization on every invocation — making every call a cold start. Check CloudWatch Logs for initialization errors.

Check reserved concurrency limits. Setting reserved concurrency too low causes throttling, and every throttled retry that succeeds may land on a new environment that cold-starts. Inspect ConcurrentExecutions against Throttles for the same time window. Raising reserved concurrency (or removing the limit entirely) often eliminates a chunk of “phantom” cold starts that look like a runtime problem but are really a capacity problem.

Check the deployment artifact for hidden weight. Large CSS, source maps, test fixtures, or .git directories sometimes ship inside Lambda zips by accident. Run unzip -l deployment.zip | sort -nrk 1 | head -20 to find the biggest files (the size is the first column of the unzip -l output). Removing a single 50MB sourcemap from a Node.js bundle can cut cold start by hundreds of milliseconds.
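If you would rather run this check in a build script than in a shell, a sketch using only the standard library is shown below. The deployment.zip path in the usage comment is the same example name used above.

```python
import zipfile

def largest_entries(zip_path, top=20):
    """Return (uncompressed_size_bytes, name) pairs for the biggest files in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        entries = [(info.file_size, info.filename) for info in archive.infolist()]
    return sorted(entries, reverse=True)[:top]

# Usage:
# for size, name in largest_entries("deployment.zip"):
#     print(f"{size / 1_000_000:8.2f} MB  {name}")
```

Sorting on uncompressed size is deliberate: that is what counts toward the unzipped limit and what has to be loaded at cold start.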

Check whether ARM (Graviton2) is faster for your workload. Switching the architecture from x86_64 to arm64 reduces both init and execution time for many runtimes and costs 20% less per ms. The migration is usually a one-line change in the function configuration plus rebuilding any native dependencies for ARM.

For Lambda import errors specifically, see Fix: AWS Lambda Import Module Error, and for Lambda timeouts unrelated to cold starts, see Fix: AWS Lambda Timeout. For SnapStart-specific problems, see Fix: AWS Lambda SnapStart Not Working, and for missing log lines that hide cold-start data, see Fix: AWS CloudWatch Logs Not Appearing.