
Fix: Gradio Not Working — Share Link, Queue Timeout, and Component Errors


Quick Answer

How to fix Gradio errors — share link not working, queue timeout, component not updating, Blocks layout mistakes, flagging permission denied, file upload size limit, and HuggingFace Spaces deployment failures.

The Error

You launch a Gradio app and the share link doesn’t work:

Could not create share link. Please check your internet connection or
our status page: https://status.gradio.app

Or the app hangs during inference:

Queue timeout: The request exceeded the maximum time limit of 60 seconds.

Or a component doesn’t update after the function returns:

def predict(image):
    result = model(image)
    return result   # Output component stays empty

demo = gr.Interface(fn=predict, inputs="image", outputs="text")

Or you deploy to HuggingFace Spaces and the app fails to build:

ERROR: Could not install packages due to an EnvironmentError

Gradio is designed to make ML demos easy — one function, one Interface call, a shareable URL. But that simplicity hides an event system, a queue, and layout rules that break in non-obvious ways when your use case goes beyond a basic demo.

Why This Happens

Gradio wraps a FastAPI server behind a Python function interface. Each user interaction triggers a request to the server, which calls your Python function and returns the result to the browser. The share=True feature routes traffic through Gradio’s relay servers — if those servers are unreachable or your function is too slow, the link fails or times out.

Component mismatches — returning the wrong type, the wrong number of outputs, or updating a component that isn’t wired to the function — produce silent failures rather than loud errors, because Gradio tries to gracefully handle type conversion.

Fix 1: Share Link Not Working

Could not create share link. Please check your internet connection.

share=True creates a public URL by tunneling through Gradio’s relay servers. This requires outbound internet access and that Gradio’s servers are online.

Check the basics:

import gradio as gr

# Verify the Gradio version — share-link bugs are often fixed in newer releases
print(gr.__version__)

demo = gr.Interface(fn=lambda x: x, inputs="text", outputs="text")
demo.launch(share=True)   # Requires internet access

If the share link fails:

  1. Check internet connectivity — run curl https://api.gradio.app from the machine running the app
  2. Check Gradio status — visit status.gradio.app
  3. Firewall blocking outbound connections — corporate networks and cloud VMs may block the tunnel. Try share=True from a different network to confirm.
  4. Gradio version too old — update with pip install --upgrade gradio
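Step 1's connectivity check can also be scripted — a minimal sketch using only the standard library (`can_reach` is an illustrative helper name, not a Gradio API):

```python
import urllib.request
import urllib.error

def can_reach(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP request to url succeeds within timeout seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError, ValueError):
        return False

# Usage: run can_reach("https://api.gradio.app") from the affected machine
```

Note that urlopen also raises HTTPError (a URLError subclass) on 4xx/5xx responses, so a reachable-but-erroring endpoint reports False here too.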

Alternative: expose via SSH tunnel when share links are blocked:

# On the server running Gradio (no share=True needed)
python app.py   # Starts on localhost:7860

# On your local machine
ssh -L 7860:localhost:7860 user@server
# Open http://localhost:7860 in your browser

Set a custom server name and port:

demo.launch(
    server_name="0.0.0.0",   # Bind to all interfaces (required for Docker)
    server_port=7860,
    share=False,              # Don't use Gradio relay
)

Authentication:

demo.launch(
    auth=("admin", "password"),   # Basic auth
    share=True,
)

# Or multiple users
demo.launch(
    auth=[("user1", "pass1"), ("user2", "pass2")],
)
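The auth parameter also accepts a callable, which avoids hard-coding credentials in source. A sketch — APP_USER and APP_PASS are assumed environment-variable names, not Gradio conventions:

```python
import os

def check_login(username: str, password: str) -> bool:
    # Compare against environment variables instead of literals in source;
    # the "admin"/"change-me" fallbacks here are placeholders
    return (username == os.getenv("APP_USER", "admin")
            and password == os.getenv("APP_PASS", "change-me"))

# demo.launch(auth=check_login, share=True)
```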

Fix 2: Queue Timeout — Long-Running Inference

Queue timeout: The request exceeded the maximum time limit of 60 seconds.

By default, Gradio queues requests and applies a timeout. If your model takes longer than the timeout, the request is dropped.

Route requests through the queue, which holds a persistent connection to the browser instead of a single short-lived HTTP request:

import gradio as gr

demo = gr.Interface(fn=slow_model, inputs="text", outputs="text")
demo.queue(
    default_concurrency_limit=1,   # Process one request at a time
)
demo.launch()

Configure queue settings:

demo.queue(
    default_concurrency_limit=1,   # How many requests process simultaneously
    max_size=20,                    # Max queued requests (excess rejected)
    api_open=True,                  # Allow API access alongside UI
)
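max_size behaves like a bounded queue — once it is full, additional requests are rejected immediately rather than made to wait. A plain-Python analogy of the semantics (this is an illustration, not Gradio's internals):

```python
import queue

q = queue.Queue(maxsize=2)     # like demo.queue(max_size=2)
q.put_nowait("req-1")
q.put_nowait("req-2")
try:
    q.put_nowait("req-3")      # excess request while the queue is full
    accepted = True
except queue.Full:
    accepted = False           # Gradio shows this user an error instead
```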

For very slow models, add a progress bar so users know the app isn’t frozen:

import gradio as gr
import time

def generate(prompt, progress=gr.Progress()):
    progress(0, desc="Loading model...")
    time.sleep(2)

    progress(0.3, desc="Generating...")
    time.sleep(3)

    progress(0.8, desc="Post-processing...")
    time.sleep(1)

    progress(1.0, desc="Done!")
    return f"Generated text for: {prompt}"

demo = gr.Interface(fn=generate, inputs="text", outputs="text")
demo.queue()
demo.launch()

Streaming output — for LLMs and other token-by-token generators:

import gradio as gr
import time

def stream_response(message):
    response = ""
    for word in message.split():
        response += word + " "
        time.sleep(0.1)
        yield response   # yield sends partial results to the UI

demo = gr.Interface(fn=stream_response, inputs="text", outputs="text")
demo.queue()
demo.launch()

yield instead of return enables streaming — the output component updates incrementally as each chunk arrives.

GPU concurrency on HuggingFace Spaces — GPU Spaces default to processing one request at a time. Set default_concurrency_limit to handle multiple users:

demo.queue(default_concurrency_limit=2)   # 2 concurrent GPU requests

Fix 3: Output Component Not Updating

The function runs without error but the output stays empty or shows the wrong content. Almost always a type mismatch between what the function returns and what the output component expects.

Check the return type matches the output component:

import gradio as gr
from PIL import Image
import numpy as np

# WRONG — multiplying by a float promotes the array to float64;
# gr.Image expects uint8 arrays, PIL Images, or file paths, so this
# may render as noise or fail silently
def process(image):
    arr = np.array(image)
    result = arr * 0.5   # Darken — leaves a float64 array
    return result

# CORRECT — return a type the output component understands
def process(image):
    arr = np.array(image)
    result = (arr * 0.5).astype(np.uint8)
    return Image.fromarray(result)   # PIL Image — works with gr.Image output

demo = gr.Interface(fn=process, inputs="image", outputs="image")
demo.launch()

Multiple outputs — return a tuple matching the number of output components:

import gradio as gr

def analyze(text):
    word_count = len(text.split())
    char_count = len(text)
    # Must return exactly 2 values — one per output component
    return f"Words: {word_count}", f"Characters: {char_count}"

demo = gr.Interface(
    fn=analyze,
    inputs="text",
    outputs=["text", "text"],   # 2 outputs
)
demo.launch()

Common type mappings:

  • gr.Text / "text" — str
  • gr.Image / "image" — PIL.Image, np.ndarray, or file path str
  • gr.Audio / "audio" — (sample_rate, np.ndarray) tuple or file path str
  • gr.Video / "video" — file path str
  • gr.File / "file" — file path str
  • gr.JSON / "json" — dict or list
  • gr.DataFrame / "dataframe" — pd.DataFrame
  • gr.Plot / "plot" — matplotlib.figure.Figure or plotly.graph_objects.Figure
  • gr.Label / "label" — dict mapping label → confidence
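The audio format trips people up most often. A sketch of a function that returns the (sample_rate, np.ndarray) tuple an audio output expects — make_tone is an illustrative name:

```python
import numpy as np

def make_tone(freq_hz: float = 440.0, seconds: float = 1.0):
    # An audio output component accepts a (sample_rate, samples) tuple
    sr = 16000
    t = np.linspace(0.0, seconds, int(sr * seconds), endpoint=False)
    wave = (0.5 * np.sin(2 * np.pi * freq_hz * t)).astype(np.float32)
    return sr, wave
```

Wire it to outputs="audio" and the browser gets a playable waveform.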

gr.Label for classification output:

def classify(image):
    # Return dict of label → confidence
    return {"cat": 0.85, "dog": 0.10, "bird": 0.05}

demo = gr.Interface(fn=classify, inputs="image", outputs="label")

Fix 4: Blocks Layout — Common Mistakes

gr.Blocks is the flexible layout API. Unlike gr.Interface, it doesn’t auto-wire inputs and outputs — you must connect them explicitly with event handlers.

import gradio as gr

# WRONG — no event handler connecting button to function
with gr.Blocks() as demo:
    input_text = gr.Textbox(label="Input")
    output_text = gr.Textbox(label="Output")
    btn = gr.Button("Submit")
    # Button does nothing — no .click() handler

# CORRECT — wire the button click to a function
with gr.Blocks() as demo:
    input_text = gr.Textbox(label="Input")
    output_text = gr.Textbox(label="Output")
    btn = gr.Button("Submit")

    btn.click(
        fn=lambda x: x.upper(),
        inputs=input_text,
        outputs=output_text,
    )

demo.launch()

Rows and columns for layout:

import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("# Image Classifier")

    with gr.Row():
        with gr.Column(scale=1):
            image_input = gr.Image(label="Upload Image")
            btn = gr.Button("Classify", variant="primary")
        with gr.Column(scale=2):
            label_output = gr.Label(label="Prediction")
            json_output = gr.JSON(label="Raw Scores")

    btn.click(
        fn=classify_image,
        inputs=image_input,
        outputs=[label_output, json_output],
    )

Tabs:

with gr.Blocks() as demo:
    with gr.Tab("Text"):
        text_input = gr.Textbox()
        text_output = gr.Textbox()
        gr.Button("Process").click(fn=process_text, inputs=text_input, outputs=text_output)

    with gr.Tab("Image"):
        img_input = gr.Image()
        img_output = gr.Image()
        gr.Button("Process").click(fn=process_image, inputs=img_input, outputs=img_output)

Pro Tip: Components defined inside with gr.Blocks() are just Python variables — you can reference any of them in any event handler, regardless of which Row/Column/Tab they’re inside. The visual layout (Rows/Columns) and the data flow (event handlers) are completely independent.

Fix 5: File Upload Size Limit

413: Request Entity Too Large

Uploads can be rejected as too large by Gradio itself or, more often, by a reverse proxy in front of it. For large model inputs (videos, datasets, high-res images), raise Gradio's limit:

import gradio as gr

demo = gr.Interface(fn=process_video, inputs="video", outputs="video")
demo.launch(
    max_file_size="100mb",   # Gradio 4.0+
)

Older Gradio versions impose no upload limit of their own, so a 413 there usually comes from a reverse proxy sitting in front of the app. For nginx, raise client_max_body_size:

# nginx.conf — inside the http, server, or location block
client_max_body_size 100M;
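Independently of any server-side limit, the handler itself can reject oversized files before doing expensive work (a defensive sketch; check_size is a hypothetical helper, not a Gradio API):

```python
import os

def check_size(path: str, limit_mb: float = 100.0) -> None:
    # Raise early if the uploaded file exceeds the limit
    size_mb = os.path.getsize(path) / 1_000_000
    if size_mb > limit_mb:
        raise ValueError(f"File is {size_mb:.1f} MB; limit is {limit_mb:.0f} MB")
```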

File upload handling in code:

import gradio as gr

def process_file(file):
    # file is a tempfile path (str) in Gradio 4.0+
    # In older versions it was a NamedTemporaryFile object
    print(f"File path: {file}")

    with open(file, 'r') as f:
        content = f.read()

    return f"File has {len(content)} characters"

demo = gr.Interface(
    fn=process_file,
    inputs=gr.File(label="Upload a text file"),
    outputs="text",
)
demo.launch()

Gradio 4.x changed file handling — uploaded files arrive as temp file paths (strings), not file-like objects. If your code calls file.read() and fails with AttributeError: 'str' object has no attribute 'read', update it to open(file, 'r') first.

Fix 6: Flagging Permission Denied

PermissionError: [Errno 13] Permission denied: 'flagged/'

Flagging lets users mark problematic inputs. By default, Gradio saves flagged data to a flagged/ directory in the current working directory.

Change or disable the flagging directory:

import gradio as gr

# Change the directory
demo = gr.Interface(
    fn=predict,
    inputs="text",
    outputs="text",
    flagging_dir="./user_flags",   # Writable location
)

# Disable flagging entirely
demo = gr.Interface(
    fn=predict,
    inputs="text",
    outputs="text",
    allow_flagging="never",   # Options: "never", "auto", "manual"
)

On HuggingFace Spaces, the filesystem is read-only except for specific directories. Use /tmp/flagged or disable flagging:

demo = gr.Interface(
    fn=predict,
    inputs="text",
    outputs="text",
    flagging_dir="/tmp/flagged",   # Writable on Spaces
)
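The flagging directory can also be chosen at runtime. Spaces sets the SPACE_ID environment variable (an assumption worth verifying against current Spaces docs), so a small helper can pick a writable location automatically — pick_flagging_dir is an illustrative name:

```python
import os

def pick_flagging_dir() -> str:
    # /tmp is writable on HuggingFace Spaces; use a local folder elsewhere
    if os.getenv("SPACE_ID"):
        return "/tmp/flagged"
    return "./user_flags"

# gr.Interface(..., flagging_dir=pick_flagging_dir())
```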

Fix 7: HuggingFace Spaces Deployment

The requirements.txt must list all dependencies:

gradio>=4.0.0
torch>=2.0.0
transformers>=4.30.0
Pillow>=9.0.0

app.py must be in the repo root. Spaces looks for app.py as the entry point:

my-space/
├── app.py              ← Entry point
├── requirements.txt
├── model/              ← Model files (or download at runtime)
└── README.md           ← Must include Spaces metadata

README.md must include the Spaces header:

---
title: My ML Demo
emoji: 🔬
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---

GPU Spaces — select a GPU hardware tier in the Space settings, then verify your code runs on GPU:

import torch
import gradio as gr

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")

model = load_model().to(device)

def predict(image):
    tensor = preprocess(image).to(device)
    with torch.no_grad():
        output = model(tensor)
    return postprocess(output)

demo = gr.Interface(fn=predict, inputs="image", outputs="label")
demo.queue()
demo.launch()

Common deployment failures:

  • Missing system dependencies — add an apt.txt file for system packages:
    # apt.txt
    libgl1-mesa-glx
    libglib2.0-0
  • Model too large for disk — use huggingface_hub to download at runtime:
    from huggingface_hub import hf_hub_download
    model_path = hf_hub_download("username/model", "model.safetensors")

For HuggingFace token authentication and model loading patterns, see HuggingFace Transformers not working.

Fix 8: ChatInterface for Conversational AI

Gradio 3.34+ includes gr.ChatInterface — purpose-built for chatbots. If you’re building a chat UI with raw Blocks and struggling with history management, switch to this:

import gradio as gr

def chat(message, history):
    # message: current user message (str)
    # history: list of [user_msg, bot_msg] pairs
    response = f"You said: {message}. History has {len(history)} turns."
    return response

demo = gr.ChatInterface(
    fn=chat,
    title="My Chatbot",
    description="Ask me anything",
    examples=["Hello", "What can you do?", "Tell me a joke"],
    retry_btn="Retry",
    undo_btn="Undo",
    clear_btn="Clear",
)
demo.queue()
demo.launch()

Streaming chat responses:

import gradio as gr
import time

def chat_stream(message, history):
    response = ""
    for word in f"Echo: {message}".split():
        response += word + " "
        time.sleep(0.1)
        yield response   # Streams token by token

demo = gr.ChatInterface(fn=chat_stream)
demo.queue()
demo.launch()

Common Mistake: Forgetting .queue() before .launch() when using streaming. Without the queue, yield doesn’t stream — the entire response is buffered and sent at once, defeating the purpose of streaming output.

Still Not Working?

Gradio 3.x to 4.x Migration

Gradio 4.0 introduced breaking changes. The most impactful:

Gradio 3.x → Gradio 4.x:

  • gr.inputs.Image() → gr.Image()
  • gr.outputs.Label() → gr.Label()
  • File inputs are file-like objects → file inputs are temp path strings
  • concurrency_count → default_concurrency_limit
  • gr.Interface(live=True) → use the .change() event in Blocks

Check your version: python -c "import gradio; print(gradio.__version__)".
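Code that must run on both major versions can shim the file-handling change (a sketch; read_upload is an illustrative name):

```python
import io

def read_upload(file) -> str:
    # Gradio 4.x passes a temp-file path (str); 3.x passed a file-like object
    if isinstance(file, str):
        with open(file, "r") as f:
            return f.read()
    return file.read()

# An in-memory file-like object exercises the 3.x branch:
sample = read_upload(io.StringIO("hello"))
```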

Installation Failures

Gradio requires Python 3.8+ and depends on packages with native extensions, so pip install gradio can fail with build errors on older Python versions or unusual platforms. Try:

pip install --upgrade pip
pip install gradio

For general pip build failures, see pip could not build wheels.

Custom CSS and JavaScript

with gr.Blocks(css=".gradio-container {max-width: 800px; margin: auto;}") as demo:
    gr.Markdown("# Styled App")
    # ...

# Or load from file
with gr.Blocks(css="style.css") as demo:
    ...

Integrating with Streamlit or Jupyter

Gradio apps can be embedded in Jupyter notebooks:

demo.launch(inline=True)   # Renders inside the notebook cell

For Streamlit integration, embed via iframe. For standalone Streamlit apps with similar ML demo patterns, see Streamlit not working. For Jupyter-specific rendering issues, see Jupyter not working.


FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
