Fix: Tenacity Not Working — Retries Not Firing, Exception Filters, and Async Support
Quick Answer
How to fix Tenacity errors — retry decorator not retrying, stop_after_attempt vs stop_after_delay, retry_if_exception_type filter, async retry decorator, jitter for backoff, and RetryError unwrap original exception.
The Error
You decorate a function with @retry and it doesn’t actually retry on failure:
from tenacity import retry

@retry
def flaky():
    raise ValueError("fail")

flaky()
# Appears to hang: no output and no exception ever surfaces

Or retries fire on the wrong exception types:
@retry
def fetch():
    raise FileNotFoundError("data.csv")
# Retries forever on a permanent error

Or async functions don’t get retried:
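import asyncio
import httpx
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(3))
async def call_api():
    async with httpx.AsyncClient() as client:
        response = await client.get("...")
        response.raise_for_status()

asyncio.run(call_api())
# Raises on the first failure with no retries if the installed Tenacity
# does not detect coroutine functions

Or you can’t extract the original error from RetryError: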
from tenacity import RetryError

try:
    flaky()
except RetryError as e:
    print(e)
    # RetryError[<Future at 0x... state=finished raised ValueError>]
    # But what was the ValueError message?

Or backoff is too aggressive and hammers the API:
from tenacity import retry, wait_fixed

@retry(wait=wait_fixed(0))
def hit_api():
    raise ConnectionError()
# Retries thousands of times per second, banned in 30 seconds

Tenacity is a widely used Python retry library, common in API clients and wrappers that need backoff. Its defaults are deliberately permissive (retry indefinitely on any exception, with no delay), which causes its own set of problems. This guide covers the common patterns and pitfalls.
Why This Happens
Tenacity’s @retry decorator without arguments uses defaults that surprise newcomers: it retries on any exception forever, with no delay. This is rarely what you want. Production code needs explicit stop=, wait=, and retry= conditions.
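Async support works through the same @retry decorator (Tenacity 8.x+ detects async functions automatically) or through the explicit AsyncRetrying iterator. If the decorator does not recognize a coroutine function, it only wraps the creation of the coroutine object, so the failure raised when the coroutine is awaited is never retried and nothing warns you about it.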
Fix 1: Default Behavior — Why Your Retry “Didn’t Work”
from tenacity import retry

@retry
def flaky():
    raise ValueError("fail")

flaky()

This does retry, but forever and with no delay. The function appears to “do nothing” because it’s stuck in an infinite retry loop. You hit Ctrl+C and see the trace.
Always specify stop and wait:
from tenacity import RetryError, retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(5),  # Max 5 attempts
    wait=wait_exponential(multiplier=1, min=1, max=30),
)
def flaky():
    raise ValueError("fail")

try:
    flaky()
except RetryError:
    print("All retries exhausted")

Common Mistake: Decorating with @retry (no args) and assuming Tenacity has reasonable defaults. It doesn’t: the defaults are “retry forever on any exception with no delay.” Always add stop= to bound the retry count or time.
Fix 2: Stop Conditions
import threading

from tenacity import (
    retry, stop_after_attempt, stop_after_delay,
    stop_when_event_set, stop_never,
)

# Stop after N attempts
@retry(stop=stop_after_attempt(5))
def f(): ...

# Stop after total time elapsed
@retry(stop=stop_after_delay(60))  # 60 seconds total
def g(): ...

# Combine stop conditions (whichever fires first)
@retry(stop=stop_after_attempt(10) | stop_after_delay(60))
def h(): ...

# Stop when an event is set (e.g., cancellation signal)
stop_event = threading.Event()

@retry(stop=stop_when_event_set(stop_event))
def i(): ...

Composed stops use |, which means OR:
# Stop if either 10 attempts OR 5 minutes have passed
@retry(stop=(stop_after_attempt(10) | stop_after_delay(300)))
def fn(): ...

stop_never (the default if no stop= is given) retries forever. It is only sensible for genuinely recoverable transient errors with proper backoff.
Pro Tip: Always pair stop_after_attempt with stop_after_delay. Attempt count alone doesn’t bound time: with uncapped exponential waits starting at 1 second, 10 attempts already spend about 8.5 minutes sleeping, and 15 attempts spend over 4.5 hours. Delay alone doesn’t bound attempts: a fast-failing function could retry hundreds of times. Combining both gives upper bounds on both axes.
Fix 3: Wait Strategies
from tenacity import (
    retry, stop_after_attempt,
    wait_fixed, wait_random, wait_exponential, wait_chain,
    wait_random_exponential,
)

# Fixed delay
@retry(stop=stop_after_attempt(5), wait=wait_fixed(2))  # 2s between retries
def f(): ...

# Random delay
@retry(stop=stop_after_attempt(5), wait=wait_random(min=1, max=5))
def g(): ...

# Exponential backoff
@retry(
    stop=stop_after_attempt(8),
    wait=wait_exponential(multiplier=1, min=1, max=60),
    # 8 attempts means 7 waits: 1s, 2s, 4s, 8s, 16s, 32s, 60s (capped)
)
def h(): ...

# Exponential with jitter (avoid thundering herd)
@retry(stop=stop_after_attempt(5), wait=wait_random_exponential(multiplier=1, max=60))
def i(): ...

# Combined strategy: first 3 retries wait 1s each, then exponential
@retry(wait=wait_chain(*[wait_fixed(1)] * 3 + [wait_exponential(min=2)]))
def j(): ...

Wait strategy comparison:
| Strategy | Pattern | Use for |
|---|---|---|
| wait_fixed(n) | n, n, n, n… | Constant interval polling |
| wait_random(a, b) | Random in [a, b] | Avoiding sync herds |
| wait_exponential(...) | 1, 2, 4, 8, 16… | API rate limit recovery |
| wait_random_exponential(...) | Jittered exponential | Production API clients |
| wait_chain(*ws) | Sequence of strategies | Custom multi-phase backoff |
wait_random_exponential is the standard for API clients — exponential growth bounded by max, plus jitter to avoid thundering herd when many clients fail and retry simultaneously.
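To make the difference between plain and jittered exponential backoff concrete, here is a standard-library sketch. It is not Tenacity's implementation: the helper names exponential_wait and jittered_wait are hypothetical, and the formula (multiplier times 2 to the power of attempt minus 1, capped at a maximum) is an assumption modeled on the 1, 2, 4, 8… pattern in the table above.

```python
import random

def exponential_wait(attempt, multiplier=1.0, max_wait=60.0):
    """Deterministic backoff: multiplier * 2**(attempt - 1), capped at max_wait."""
    return min(multiplier * 2 ** (attempt - 1), max_wait)

def jittered_wait(attempt, multiplier=1.0, max_wait=60.0, rng=random):
    """Full jitter: pick uniformly between 0 and the deterministic backoff."""
    return rng.uniform(0, exponential_wait(attempt, multiplier, max_wait))

rng = random.Random(42)  # seeded only so the demo is repeatable
plain = [exponential_wait(n) for n in range(1, 8)]
jittered = [jittered_wait(n, rng=rng) for n in range(1, 8)]

print(plain)
print([round(w, 2) for w in jittered])
```

Every jittered value stays at or below the deterministic one for the same attempt, so clients that failed at the same moment spread their retries across the window instead of retrying in lockstep.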
Fix 4: Retry on Specific Exceptions
import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt

@retry(
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type(requests.ConnectionError),
)
def fetch(url):
    return requests.get(url, timeout=5).json()

Now only requests.ConnectionError triggers a retry; ValueError, KeyError, etc. propagate immediately.
Multiple exception types:
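@retry(
    # requests raises its own exception classes, which do not inherit from
    # the builtin ConnectionError or TimeoutError
    retry=retry_if_exception_type(
        (requests.ConnectionError, requests.Timeout, requests.HTTPError)
    ),
    stop=stop_after_attempt(5),
)
def fetch(): ...

Retry on HTTP status code: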
import requests
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential

def is_retryable_http_error(exception):
    # Retry only rate limiting and server-side errors
    return (
        isinstance(exception, requests.HTTPError)
        and exception.response.status_code in (429, 500, 502, 503, 504)
    )

@retry(
    stop=stop_after_attempt(5),
    retry=retry_if_exception(is_retryable_http_error),
    wait=wait_exponential(multiplier=2, max=60),
)
def fetch():
    response = requests.get("...")
    response.raise_for_status()
    return response.json()

Retry on result (no exception, just a value indicating retry):
from tenacity import retry, retry_if_result, stop_after_attempt, wait_fixed

@retry(
    stop=stop_after_attempt(10),
    retry=retry_if_result(lambda result: result is None),
    wait=wait_fixed(1),
)
def poll_for_value():
    value = check_external_system()  # check_external_system: your own function
    return value  # Returns None until ready, then real value

result = poll_for_value()

Combine conditions (OR):
from tenacity import retry, retry_if_exception_type, retry_if_result, stop_after_attempt

@retry(
    retry=(retry_if_exception_type(ConnectionError) | retry_if_result(lambda r: r is None)),
    stop=stop_after_attempt(10),
)
def fn(): ...

Common Mistake: Using retry=retry_if_exception_type(Exception) to “retry on all exceptions.” KeyboardInterrupt and SystemExit derive from BaseException, so they still propagate, but every programming error (TypeError, KeyError, a failed AssertionError) gets retried as if it were transient. Always specify the actual exception classes you want to recover from.
Fix 5: Async Support
import asyncio

import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(min=1, max=30),
)
async def call_api():  # Tenacity 8.x+ detects async automatically
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.example.com/data")
        response.raise_for_status()
        return response.json()

asyncio.run(call_api())

For older Tenacity versions (< 8.0), use the explicit AsyncRetrying:
import httpx
from tenacity import AsyncRetrying, RetryError, stop_after_attempt, wait_exponential

async def call_api():
    try:
        async for attempt in AsyncRetrying(
            stop=stop_after_attempt(5),
            wait=wait_exponential(min=1, max=30),
            # No reraise=True here: with it, the original exception would
            # propagate and the RetryError handler below would never run.
        ):
            with attempt:
                async with httpx.AsyncClient() as client:
                    response = await client.get("...")
                    response.raise_for_status()
                    return response.json()
    except RetryError:
        return None

For httpx-specific patterns that benefit from tenacity retries, see httpx not working.
Platform Differences and Runtime Interactions
Tenacity is a thin library, but the runtime it sits inside changes its behavior substantially. The same @retry decorator behaves differently under asyncio vs trio, inside AWS Lambda, and when it stacks on top of an SDK that already retries internally.
Sync vs async — same decorator, different machinery. From Tenacity 8.0 onward, @retry detects whether the wrapped callable is async def and switches to async-aware sleeping (asyncio.sleep) instead of time.sleep. Older versions silently used blocking time.sleep even on coroutines — the event loop froze between attempts. If you’re on Tenacity 7.x or older, upgrade or use AsyncRetrying explicitly. Check with pip show tenacity.
asyncio vs trio. Tenacity’s async support is built around asyncio primitives. Under trio (or anyio in trio mode), the default asyncio.sleep doesn’t cooperate with trio’s cancellation scope. Pass a trio-compatible sleeper:
import trio
from tenacity import AsyncRetrying, stop_after_attempt

async def fetch():
    async for attempt in AsyncRetrying(
        stop=stop_after_attempt(5),
        sleep=trio.sleep,
        reraise=True,
    ):
        with attempt:
            return await do_request()  # do_request: your own coroutine

Anyio-based code usually works because anyio bridges both, but explicit sleepers are safer.
AWS Lambda — retries inside a function with its own retry policy. Lambda invocations themselves retry on async invokes (SNS, S3 events) up to twice by default. If Tenacity inside the handler also retries 5 times, you get 5 × 3 = 15 attempts total — and you pay for every second of execution. Inside Lambda, keep stop_after_delay tighter than the function timeout, and prefer Lambda’s native retry for transient failures of the whole invocation. Use Tenacity only for inner calls to flaky downstream services that finish well within the timeout budget.
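The multiplication above generalizes to any stack of retry layers. As a rough sketch (the function name worst_case_attempts is hypothetical), each layer contributes its total attempt count, meaning the first try plus its retries:

```python
from math import prod

def worst_case_attempts(*attempts_per_layer):
    """Worst-case number of calls to the innermost operation.

    Each argument is the total attempts for one layer (first try + retries).
    """
    return prod(attempts_per_layer)

# Lambda async invoke: 1 try + 2 retries = 3 invocations; Tenacity: 5 attempts each
lambda_and_tenacity = worst_case_attempts(3, 5)
print(lambda_and_tenacity)
```

Adding a third retrying layer multiplies the total again, which is why it helps to decide which single layer owns the retries.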
OpenAI and Anthropic SDK overlap. Both SDKs retry internally on 429 and 5xx by default, typically 2 retries with exponential backoff, which means up to 3 attempts per call. Wrapping their client calls in your own @retry multiplies the layers: 3 SDK attempts × 5 Tenacity attempts = up to 15 actual requests per logical call. Either disable the SDK’s retries (OpenAI(max_retries=0), Anthropic(max_retries=0)) and let Tenacity own it, or trust the SDK and skip Tenacity for those calls. Don’t stack both.
Lambda cold start vs wait_exponential defaults. Lambda cold starts can take 500–2000ms. If wait_exponential(min=1) triggers on a cold-start-induced timeout, you’ve already burned half your retry budget on Lambda warming up. Use wait_random_exponential(multiplier=0.5, max=10) inside Lambda for tighter, more predictable backoff.
Celery and RQ — don’t double-retry. Celery has its own retry mechanism (self.retry(exc=e, countdown=60)) that requeues the task. Tenacity inside a Celery task retries within the same worker process. Either approach works; mixing them creates surprising delays. For transient failures in tight loops (HTTP call inside a task), use Tenacity. For task-level failures that benefit from being put back on the queue and picked up by a different worker, use Celery’s retry.
Logger injection. before_sleep_log(logger, level) writes through Python’s stdlib logging. If your service uses Structlog, the stdlib log line bypasses Structlog’s processor chain and looks different from every other log. Inject a structured callback instead:
import structlog
from tenacity import retry, stop_after_attempt

log = structlog.get_logger()

def log_retry(retry_state):
    log.warning(
        "retry_attempt",
        function=retry_state.fn.__name__,
        attempt=retry_state.attempt_number,
        next_wait=retry_state.next_action.sleep if retry_state.next_action else None,
        exc_type=type(retry_state.outcome.exception()).__name__ if retry_state.outcome else None,
    )

@retry(stop=stop_after_attempt(5), before_sleep=log_retry)
def fetch(): ...

Now retries appear in the same JSON stream as everything else.
Synchronous code inside async handlers. Don’t decorate a sync function with @retry and call it from async code — time.sleep blocks the event loop. Either convert the inner function to async def or run it via asyncio.to_thread() so the blocking sleep happens off-loop.
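The off-loop option can be sketched with the standard library alone. This is an illustration, not Tenacity code: blocking_call_with_retries is a hypothetical stand-in for a sync function whose retry waits use time.sleep, and the ticker coroutine exists only to show that the event loop keeps running. asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
import time

def blocking_call_with_retries():
    """Stand-in for a sync function whose retry waits use time.sleep."""
    for _ in range(3):
        time.sleep(0.1)  # simulates a blocking wait between attempts
    return "done"

async def ticker(counter):
    """Counts event-loop ticks so we can see whether the loop stayed responsive."""
    while True:
        counter["ticks"] += 1
        await asyncio.sleep(0.01)

async def main():
    counter = {"ticks": 0}
    tick_task = asyncio.create_task(ticker(counter))
    # Run the blocking function in a worker thread; the loop keeps ticking.
    result = await asyncio.to_thread(blocking_call_with_retries)
    tick_task.cancel()
    return result, counter["ticks"]

result, ticks = asyncio.run(main())
print(result, ticks)
```

Calling blocking_call_with_retries() directly inside main() would freeze the ticker for the whole duration instead.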
Async generators and streaming — wrap the call that establishes the connection, not the iteration:
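@retry(stop=stop_after_attempt(3))
async def open_stream():
    # client is an existing httpx.AsyncClient; send(..., stream=True) returns
    # the response without reading the body
    request = client.build_request("GET", "https://example.com/sse")
    response = await client.send(request, stream=True)
    try:
        response.raise_for_status()
    except httpx.HTTPStatusError:
        await response.aclose()
        raise
    return response

async def consume():
    response = await open_stream()
    try:
        async for chunk in response.aiter_bytes():
            process(chunk)
    finally:
        await response.aclose()
# Don't retry the entire generator: that would re-open and re-iterate

Fix 6: Extracting the Original Exception from RetryError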
from tenacity import retry, stop_after_attempt, RetryError

@retry(stop=stop_after_attempt(3))
def fail():
    raise ValueError("specific message")

try:
    fail()
except RetryError as e:
    print(e)
    # RetryError[<Future at ... raised ValueError>]

The actual ValueError is hidden inside RetryError. Extract it:
try:
    fail()
except RetryError as e:
    original = e.last_attempt.exception()
    print(type(original))  # <class 'ValueError'>
    print(str(original))   # 'specific message'
    raise original from e  # Re-raise with original type and message

reraise=True is a simpler pattern that propagates the original exception directly:
@retry(stop=stop_after_attempt(3), reraise=True)
def fail():
    raise ValueError("specific")

try:
    fail()
except ValueError as e:
    print(e)  # 'specific': the original exception, not wrapped

Pro Tip: Use reraise=True by default in production code. Callers see the actual exception type (ValueError, ConnectionError, whatever) instead of always handling RetryError. This makes error handling at the call site cleaner and respects exception type expectations.
Fix 7: Callback Hooks for Observability
import logging

from tenacity import retry, stop_after_attempt, before_log, after_log, before_sleep_log

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

@retry(
    stop=stop_after_attempt(5),
    before=before_log(logger, logging.INFO),
    after=after_log(logger, logging.WARNING),
    before_sleep=before_sleep_log(logger, logging.INFO),
)
def fetch():
    ...

Output for a 3-attempt run:
INFO: Starting call to 'fetch', this is the 1st time calling it.
WARNING: Finished call to 'fetch' after 0.123(s), this was the 1st time calling it.
INFO: Retrying fetch in 1.0 seconds as it raised ConnectionError: ...
INFO: Starting call to 'fetch', this is the 2nd time calling it.
...

Custom callback functions:
from tenacity import retry, stop_after_attempt, RetryCallState

def log_attempt(retry_state: RetryCallState):
    fn_name = retry_state.fn.__name__
    attempt = retry_state.attempt_number
    print(f"Attempt {attempt} for {fn_name}")
    if retry_state.outcome and retry_state.outcome.failed:
        print(f"  Exception: {retry_state.outcome.exception()}")

@retry(
    stop=stop_after_attempt(5),
    # Use after=, not before=: the outcome is only set once the attempt has run
    after=log_attempt,
)
def fetch(): ...

Metrics integration (Prometheus, StatsD, etc.):
from prometheus_client import Counter
from tenacity import retry, stop_after_attempt

retry_counter = Counter("api_retries_total", "API retries", ["endpoint", "status"])

def track_retry(state):
    # state.kwargs holds keyword arguments only, so call the function as
    # call(endpoint="...") for the label to be populated
    endpoint = state.kwargs.get("endpoint", "unknown")
    status = "failed" if state.outcome.failed else "success"
    retry_counter.labels(endpoint=endpoint, status=status).inc()

@retry(stop=stop_after_attempt(5), before_sleep=track_retry)
def call(endpoint): ...

Fix 8: Using Retrying Directly (No Decorator)
For dynamic retry logic, build a Retrying object at runtime and call it with the function to retry:
from tenacity import Retrying, stop_after_attempt, wait_fixed

def make_request(url, max_retries):
    retryer = Retrying(
        stop=stop_after_attempt(max_retries),
        wait=wait_fixed(1),
        reraise=True,
    )
    try:
        return retryer(do_request, url)  # do_request: your own function
    except Exception as e:
        return {"error": str(e)}

Iterator pattern:
import httpx
from tenacity import Retrying, stop_after_attempt

for attempt in Retrying(stop=stop_after_attempt(5), reraise=True):
    with attempt:
        # Code inside `with attempt:` is retried
        response = httpx.get("...")
        response.raise_for_status()
        result = response.json()
# No break needed: the loop ends after the first attempt that does not raise.
# A break placed outside the `with` block would also exit after a failed attempt.

This pattern is especially useful when:
- Retry conditions depend on runtime values
- You want explicit control over what’s retried vs not
- You’re inside a function where decoration isn’t practical
Custom sleep function for interruptible waits:

import threading

from tenacity import Retrying, stop_after_attempt, wait_fixed

# Make sleeps interruptible via threading.Event
stop_event = threading.Event()

retryer = Retrying(
    stop=stop_after_attempt(10),
    wait=wait_fixed(5),
    sleep=lambda seconds: stop_event.wait(seconds),
)

# In another thread/signal handler:
# stop_event.set() → wakes up the sleep early
# Pair with stop=stop_when_event_set(stop_event) if setting the event
# should also end the retries

Still Not Working?
Tenacity vs Other Retry Libraries
- Tenacity: Most popular, rich API, async support. Default choice.
- retry (the retry package): Simpler decorator, fewer features. Use for quick scripts.
- backoff: Different API, integrates well with async, also widely used.
- Built-in requests retries via urllib3.util.retry.Retry: limited but no extra deps.
For most production code, Tenacity is worth the dependency.
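For comparison, the built-in requests option from the list above can be sketched as follows, assuming requests and urllib3 are installed. The specific values (5 retries, a 0.5 backoff factor, the status list) are illustrative choices, not recommendations from this guide, and the URL in the comment is hypothetical.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retry_policy = Retry(
    total=5,             # at most 5 retries per request
    backoff_factor=0.5,  # growing delay between retries
    status_forcelist=(429, 500, 502, 503, 504),
)

session = requests.Session()
adapter = HTTPAdapter(max_retries=retry_policy)
session.mount("https://", adapter)
session.mount("http://", adapter)

# session.get("https://api.example.com/data")  # retried transparently by the adapter
```

This retries at the transport level only; it cannot retry on a returned value or run callbacks between attempts, which is where Tenacity earns its place.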
Common Wait Strategy Defaults
For HTTP API clients, a sensible default:
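import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, stop_after_delay, wait_random_exponential

@retry(
    stop=(stop_after_attempt(5) | stop_after_delay(60)),
    wait=wait_random_exponential(multiplier=1, max=30),
    # Exception classes assume the requests library; swap in your client's own
    retry=retry_if_exception_type((requests.ConnectionError, requests.Timeout, requests.HTTPError)),
    reraise=True,
)
def api_call(): ...

Bounded by both attempts and time, jittered exponential backoff, only retries on transient errors, re-raises the original exception.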
Testing Functions with Retries
Mock the time/sleep to avoid actual waits in tests:
import pytest
from tenacity import RetryError, retry, stop_after_attempt, wait_fixed
from unittest.mock import patch

@retry(stop=stop_after_attempt(3), wait=wait_fixed(5))
def fetch():
    raise ConnectionError()

def test_retries_three_times():
    with patch("tenacity.nap.time.sleep") as mock_sleep:
        with pytest.raises(RetryError):
            fetch()
    assert mock_sleep.call_count == 2  # Slept twice between 3 attempts

Or use a tiny wait in tests:
import os

from tenacity import retry, wait_fixed

RETRY_WAIT = 0.001 if os.getenv("TESTING") else 1

@retry(wait=wait_fixed(RETRY_WAIT))
def fn(): ...

For pytest fixture patterns with retry mocking, see pytest fixture not found. For Loguru-based logging of retry attempts that pairs well with Tenacity’s before_sleep hooks, see Loguru not working.
Combining with httpx and Async APIs
import asyncio

import httpx
from tenacity import retry, stop_after_attempt, wait_random_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(5),
    wait=wait_random_exponential(multiplier=1, max=30),
    retry=retry_if_exception_type((httpx.ConnectError, httpx.ReadTimeout)),
    reraise=True,
)
async def fetch(client: httpx.AsyncClient, url: str):
    response = await client.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

async def main():
    # async with is only valid inside a coroutine
    async with httpx.AsyncClient() as client:
        data = await fetch(client, "https://api.example.com/users")

asyncio.run(main())

For OpenAI API integration where retries are essential due to rate limits, see OpenAI API not working.
Don’t Retry Idempotency-Breaking Operations
Some operations aren’t safe to retry:
- POST that creates resources (might create duplicates)
- Payment transactions
- Side-effects with no transaction boundary
For these, retry only on specific errors that guarantee the operation didn’t take effect (e.g., ConnectionError before sending the request body) or use idempotency keys.
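import uuid

import requests
from tenacity import retry, stop_after_attempt

@retry(stop=stop_after_attempt(5))
def create_payment(amount, idempotency_key):
    # The caller creates one key per logical payment, so every retry of the
    # same payment reuses it and different payments never share a key.
    return requests.post(
        "https://api.example.com/payments",
        headers={"Idempotency-Key": idempotency_key},
        json={"amount": amount},
    )

create_payment(100, idempotency_key=str(uuid.uuid4()))

The server uses the key to deduplicate: the first request creates the payment, and retries return the same result without creating duplicates.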
Cancellation and KeyboardInterrupt Inside Retries
Tenacity’s default retry condition, retry_if_exception_type(), matches Exception and its subclasses. KeyboardInterrupt and SystemExit derive from BaseException instead, so Ctrl+C is not retried and most user code is safe. The same holds for asyncio cancellation: asyncio.CancelledError is a BaseException in Python 3.8+, so Tenacity won’t retry it by default, which is correct because cancellation should propagate. If you ever see retries spinning during shutdown, some retry condition or handler is probably matching BaseException. Don’t do that.
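The class hierarchy behind this can be checked with the standard library. This is a sketch of an isinstance-style filter, not Tenacity's code; the helper name matches_exception_filter is hypothetical, and the CancelledError result assumes Python 3.8 or later.

```python
import asyncio

def matches_exception_filter(exc_type, retry_on=Exception):
    """Would an isinstance-style filter on retry_on match this exception type?"""
    return issubclass(exc_type, retry_on)

candidates = (ValueError, ConnectionError, KeyboardInterrupt, SystemExit, asyncio.CancelledError)
for exc_type in candidates:
    print(exc_type.__name__, matches_exception_filter(exc_type))
```

Only a filter on BaseException would match the last three, which is why widening a retry condition that far keeps a process alive through Ctrl+C and task cancellation.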
Reading State From Inside the Retried Function
Sometimes the code needs to know which attempt it’s on (e.g., to choose a fallback endpoint on the second try). Tenacity does not pass any state into a decorated function, so a retry_state parameter in the signature is never filled in. Use the iterator pattern instead, where each attempt exposes its retry_state:

import requests
from tenacity import Retrying, stop_after_attempt

def fetch(url):
    for attempt in Retrying(stop=stop_after_attempt(3), reraise=True):
        with attempt:
            number = attempt.retry_state.attempt_number
            actual_url = url if number == 1 else url.replace("primary", "fallback")
            return requests.get(actual_url).json()

The attempt number starts at 1 and increases by one with every retry.
Timeouts vs Retries — Layer Them Correctly
A common bug: setting wait_exponential(max=60) with stop_after_attempt(10) against an endpoint that itself has a 30-second timeout. Each attempt can take 30s, and the nine waits between attempts grow 1, 2, 4, 8, 16, 32, 60, 60, 60 seconds, so 10 attempts take up to 10 × 30 + 243 = 543 seconds, roughly 9 minutes. Cap total time explicitly:
from tenacity import retry, stop_after_attempt, stop_after_delay, wait_exponential

@retry(
    stop=(stop_after_attempt(10) | stop_after_delay(60)),
    wait=wait_exponential(max=10),
)
def call_with_budget(): ...

stop_after_delay(60) is the budget; wait_exponential(max=10) keeps each gap short. Without both, retries either give up too early or run for minutes.
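The worst case can be estimated before deploying. The sketch below is plain arithmetic, not a Tenacity API: the helper name worst_case_seconds is hypothetical, and it assumes every attempt runs into the full per-attempt timeout while the waits double from 1 second up to the cap.

```python
def worst_case_seconds(attempts, per_attempt_timeout, max_wait, multiplier=1):
    """Upper bound on total time when every attempt times out.

    There is one wait after each attempt except the last.
    """
    waits = [min(multiplier * 2 ** (n - 1), max_wait) for n in range(1, attempts)]
    return attempts * per_attempt_timeout + sum(waits)

uncapped_budget = worst_case_seconds(attempts=10, per_attempt_timeout=30, max_wait=60)
short_waits = worst_case_seconds(attempts=10, per_attempt_timeout=30, max_wait=10)
print(uncapped_budget, short_waits)
```

Shortening the waits alone still leaves ten 30-second attempts, so the attempt count by itself does not enforce a time budget; the delay-based stop condition does that.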
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.