Fix: MLflow Not Working — Tracking URI, Artifact Store, and Model Registry Errors
Quick Answer
How to fix MLflow errors — no tracking server, artifact path not accessible, model version not found, experiment not found, MLFLOW_TRACKING_URI not set, autolog not recording metrics, and MLflow UI showing no runs.
The Error
You log metrics but the MLflow UI is empty — no runs, no experiments:
```python
mlflow.log_metric("accuracy", 0.94)
# No error, but nothing appears in the UI
```

Or you try to load a registered model and get a version error:

```text
MlflowException: Registered Model with name='my_model' not found.
```

Or artifact logging fails with a path error:

```text
MlflowException: API request to http://localhost:5000/api/2.0/mlflow/runs/log-artifact failed with exception HTTPConnectionPool: Max retries exceeded
```

Or you run mlflow ui and it starts, but all your runs from the training script are missing.
MLflow separates tracking (metrics, parameters, tags), artifacts (files, models), and the model registry into three layers — each with its own storage backend. Misconfiguring any one of them produces silent failures or confusing errors. This guide covers all three.
Why This Happens
MLflow defaults to storing everything in a local ./mlruns directory relative to where you run your Python script. The MLflow UI, when launched with mlflow ui, looks in ./mlruns relative to where that command runs. If your training script and mlflow ui run from different directories, they use different mlruns folders and never see each other’s data.
The fix for any environment beyond a single local machine is to set MLFLOW_TRACKING_URI explicitly and point everything — training scripts, the UI, and any code that loads models — at the same backend.
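One defensive pattern is to fail fast when the variable is missing instead of silently logging to a local directory nobody looks at. A minimal sketch — the helper name `require_tracking_uri` is ours, not an MLflow API:

```python
import os


def require_tracking_uri() -> str:
    # Fail fast instead of silently writing to ./mlruns relative to
    # wherever the script happened to be launched from
    uri = os.environ.get("MLFLOW_TRACKING_URI")
    if not uri:
        raise RuntimeError(
            "MLFLOW_TRACKING_URI is not set — metrics would go to a local "
            "./mlruns directory that the UI may never see"
        )
    return uri
```

Call it once at the top of a training script before any MLflow logging.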
Fix 1: Runs Not Appearing in the UI
The most common MLflow issue: training runs successfully but the UI shows “No experiments found.”
Root cause: tracking URI mismatch. Check where your script is logging vs. where the UI is reading:
```python
import mlflow

# Where is your script logging?
print(mlflow.get_tracking_uri())
# Default: file:///path/to/your/script/directory/mlruns
```

```bash
# Where is the UI reading from?
mlflow ui
# Reads from ./mlruns in the CURRENT DIRECTORY
```

If you ran python train.py from /home/user/project/ and mlflow ui from /home/user/, they use completely different mlruns directories.
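When you do use a local file store, an absolute file:// URI sidesteps the relative-path trap entirely. A small stdlib sketch — the helper name `local_store_uri` is ours:

```python
from pathlib import Path


def local_store_uri(path: str = "mlruns") -> str:
    # Resolve to an absolute file:// URI so the training script and
    # `mlflow ui` agree on the store no matter where each is launched from
    return Path(path).resolve().as_uri()


print(local_store_uri())  # an absolute URI such as file:///home/user/project/mlruns
```

Pass the result to `mlflow.set_tracking_uri()` or export it as `MLFLOW_TRACKING_URI`.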
Fix — set MLFLOW_TRACKING_URI consistently:
```bash
# Set once in your shell — all MLflow calls in this session use this path
export MLFLOW_TRACKING_URI=file:///home/user/project/mlruns

# Now run training and UI from anywhere — they all point to the same store
python train.py
mlflow ui  # http://localhost:5000
```

Or set it in your training script:
```python
import mlflow

# Set before any logging — absolute path avoids directory confusion
mlflow.set_tracking_uri("file:///home/user/project/mlruns")

with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.94)
```

Or use a tracking server (the right approach for teams):
```bash
# Start the tracking server — stores run data in mlflow.db, serves the UI on port 5000
mlflow server \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root ./mlartifacts \
  --host 0.0.0.0 \
  --port 5000
```

```python
import mlflow

# Point all scripts at the server
mlflow.set_tracking_uri("http://localhost:5000")

with mlflow.start_run():
    mlflow.log_metric("loss", 0.12)
```

Fix 2: Experiment Not Found or Runs Go to Wrong Experiment
```text
mlflow.exceptions.MlflowException: Experiment '0' does not exist.
mlflow.exceptions.MlflowException: Could not find experiment with name 'my_experiment'
```

By default, MLflow logs to an experiment called “Default” (ID 0). If you reference an experiment that doesn’t exist, the call fails. Runs from different scripts mixed into “Default” also make the UI hard to read.
Create and set an experiment before logging:
```python
import mlflow

# Set by name — creates the experiment if it doesn't exist
mlflow.set_experiment("fraud_detection_v2")

# Explicit create-or-get pattern (more control over location and tags)
experiment_name = "fraud_detection_v2"
experiment = mlflow.get_experiment_by_name(experiment_name)
if experiment is None:
    mlflow.create_experiment(
        name=experiment_name,
        artifact_location="s3://my-bucket/mlflow/fraud_detection_v2",  # Optional
        tags={"team": "data-science", "project": "fraud"},
    )
mlflow.set_experiment(experiment_name)

# Now all runs go to this experiment
with mlflow.start_run(run_name="xgboost_baseline"):
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("auc", 0.87)
```

Organize runs with tags for filtering in the UI:
```python
with mlflow.start_run(run_name="experiment_001") as run:
    mlflow.set_tags({
        "model_type": "gradient_boosting",
        "dataset_version": "v3",
        "environment": "dev",
    })
    mlflow.log_params({"n_estimators": 100, "max_depth": 5, "learning_rate": 0.1})
    mlflow.log_metric("accuracy", 0.91)
    mlflow.log_metric("f1_score", 0.88)
    print(f"Run ID: {run.info.run_id}")
```

Log metrics over time (training curves):
```python
with mlflow.start_run():
    for epoch in range(100):
        train_loss = train_one_epoch()
        val_loss = validate()
        # The step parameter creates a time series in the UI
        mlflow.log_metric("train_loss", train_loss, step=epoch)
        mlflow.log_metric("val_loss", val_loss, step=epoch)
```

Fix 3: Artifact Logging Failures
```text
MlflowException: API request to .../log-artifact failed
MlflowException: No such file or directory: '/tmp/mlflow-artifacts/...'
OSError: [Errno 2] No such file or directory
```

MLflow stores artifacts (model files, plots, feature importance charts) in an artifact store that’s separate from the metrics database. Mismatches between where the server expects artifacts and where the client tries to write them cause these errors.
Log files as artifacts:
```python
import json

import matplotlib.pyplot as plt
import mlflow

with mlflow.start_run():
    # Log a single file
    mlflow.log_artifact("feature_importance.csv")

    # Log a file to a subdirectory in the artifact store
    mlflow.log_artifact("confusion_matrix.png", artifact_path="plots")

    # Log an entire directory
    mlflow.log_artifacts("./output/", artifact_path="model_outputs")

    # Log in-memory content by writing a temp file first
    config = {"model": "xgboost", "version": "1.0"}
    with open("/tmp/config.json", "w") as f:
        json.dump(config, f)
    mlflow.log_artifact("/tmp/config.json")

    # Log a matplotlib figure directly
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [0.8, 0.85, 0.9], label="accuracy")
    ax.legend()
    mlflow.log_figure(fig, "training_curve.png")  # MLflow 1.13+
    plt.close(fig)
```

Artifact store configuration for remote backends:
```bash
# S3 backend
mlflow server \
  --backend-store-uri postgresql://user:pass@host/mlflow \
  --default-artifact-root s3://my-mlflow-bucket/artifacts \
  --host 0.0.0.0

# Google Cloud Storage
mlflow server \
  --default-artifact-root gs://my-mlflow-bucket/artifacts

# Azure Blob Storage
mlflow server \
  --default-artifact-root wasbs://[email protected]/artifacts
```

When using S3, the training machine must have AWS credentials with write access to the artifact bucket — not just the MLflow server:
```bash
# The training script may run on a different machine than the MLflow server —
# that machine needs S3 write access too
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
# Or use an IAM instance profile if on EC2
```

Pro Tip: Use mlflow.log_artifact for final artifacts (trained model, evaluation report). Use mlflow.log_metric at every epoch for training curves. Save large intermediate files (preprocessed datasets, checkpoints) directly to S3/GCS — only log the final model through MLflow. The artifact store isn’t a general-purpose file system.
Fix 4: Model Logging and Loading Errors
```text
MlflowException: Run 'abc123' not found.
MlflowException: Model flavor 'sklearn' is not supported.
mlflow.exceptions.MlflowException: Registered Model with name='classifier' not found.
```

MLflow provides flavors for common frameworks. Each flavor knows how to save and load models in a way that preserves the prediction interface.
Log models using the correct flavor:
```python
import mlflow
import mlflow.sklearn
import mlflow.pytorch
import mlflow.tensorflow
import mlflow.xgboost

# scikit-learn
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

with mlflow.start_run() as run:
    mlflow.sklearn.log_model(
        sk_model=model,
        artifact_path="model",
        registered_model_name="fraud_classifier",  # Registers in Model Registry
        input_example=X_train[:5],  # Captures input schema
    )

# PyTorch
with mlflow.start_run():
    mlflow.pytorch.log_model(
        pytorch_model=net,
        artifact_path="model",
        registered_model_name="image_classifier",
    )

# XGBoost
with mlflow.start_run():
    mlflow.xgboost.log_model(
        xgb_model=booster,
        artifact_path="model",
    )
```

Load a model by run ID:
```python
import mlflow

run_id = "abc123def456"

# Load by run ID and artifact path
model = mlflow.sklearn.load_model(f"runs:/{run_id}/model")

# Load as a generic Python function (works for any flavor)
pyfunc_model = mlflow.pyfunc.load_model(f"runs:/{run_id}/model")
predictions = pyfunc_model.predict(X_test)
```

Load a model from the Model Registry:
```python
import mlflow.pyfunc

# Load a specific version
model = mlflow.pyfunc.load_model("models:/fraud_classifier/3")

# Load by stage (Production, Staging, Archived)
model = mlflow.pyfunc.load_model("models:/fraud_classifier/Production")

# MLflow 2.x: use aliases instead of deprecated stages
model = mlflow.pyfunc.load_model("models:/fraud_classifier@champion")
```

MLflow 2.x deprecated model stages (Production, Staging) in favor of aliases. If you’re on MLflow 2.x and getting deprecation warnings:
```python
import mlflow.pyfunc
from mlflow import MlflowClient

client = MlflowClient()

# Old pattern — deprecated in 2.x
client.transition_model_version_stage(
    name="fraud_classifier",
    version="3",
    stage="Production",  # DeprecationWarning
)

# New pattern — set an alias
client.set_registered_model_alias(
    name="fraud_classifier",
    alias="champion",
    version="3",
)

# Load by alias
model = mlflow.pyfunc.load_model("models:/fraud_classifier@champion")
```

Fix 5: autolog Not Recording Metrics
```python
mlflow.sklearn.autolog()
# Model trains, run appears in UI, but params/metrics are empty
```

autolog hooks into framework training methods to automatically capture parameters, metrics, and models. If nothing is recorded, autolog was usually enabled after the training call it should have hooked, or the training entry point isn’t one autolog instruments (for example, a custom training loop that never calls fit()).
Call autolog before any model initialization:
```python
import mlflow

# WRONG — autolog after model creation may miss constructor params
from sklearn.ensemble import GradientBoostingClassifier

model = GradientBoostingClassifier(n_estimators=100)
mlflow.sklearn.autolog()  # Too late — already initialized

# CORRECT — autolog before everything
mlflow.sklearn.autolog()  # Hook in first

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

model = GradientBoostingClassifier(n_estimators=100, max_depth=3)

# fit() triggers the autolog hooks
model.fit(X_train, y_train)
```

Enable autolog for all supported frameworks at once:
```python
import mlflow

mlflow.autolog(
    log_input_examples=True,    # Log sample inputs
    log_model_signatures=True,  # Log input/output schema
    log_models=True,            # Log the model artifact
    disable=False,
    exclusive=False,
    disable_for_unsupported_versions=False,
    silent=False,
)
```

Disable autolog for specific frameworks if it conflicts with your manual logging:
```python
mlflow.sklearn.autolog(disable=True)
```

PyTorch Lightning autolog — must be set before the Trainer:
```python
import mlflow
import pytorch_lightning as pl

mlflow.pytorch.autolog()  # Before Trainer instantiation

trainer = pl.Trainer(max_epochs=10)
trainer.fit(model, train_loader, val_loader)
# MLflow captures train_loss and val_loss per epoch automatically
```

Common Mistake: Using mlflow.autolog() inside a with mlflow.start_run(): block and expecting it to capture the outer run. autolog creates its own nested run for framework calls. For clean logging, set autolog before start_run, or use manual logging inside the with block and skip autolog.
Fix 6: MLflow Server Setup for Teams
Running mlflow ui locally only works for solo development. For teams, you need a tracking server with a real database backend.
Minimal production setup with PostgreSQL + S3:
```bash
# Install MLflow with extras
pip install mlflow[extras] psycopg2-binary boto3

# Start the server
mlflow server \
  --backend-store-uri "postgresql://mlflow_user:password@db-host:5432/mlflow" \
  --default-artifact-root "s3://company-mlflow/artifacts" \
  --host 0.0.0.0 \
  --port 5000 \
  --workers 4
```

Docker Compose setup for local team development:
```yaml
# docker-compose.yml
version: "3.8"

services:
  mlflow-db:
    image: postgres:15
    environment:
      POSTGRES_USER: mlflow
      POSTGRES_PASSWORD: mlflow
      POSTGRES_DB: mlflow
    volumes:
      - mlflow-db-data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "mlflow"]
      interval: 5s
      retries: 5

  mlflow-server:
    image: python:3.12-slim
    depends_on:
      mlflow-db:
        condition: service_healthy
    command: >
      bash -c "pip install mlflow psycopg2-binary &&
      mlflow server
      --backend-store-uri postgresql://mlflow:mlflow@mlflow-db:5432/mlflow
      --default-artifact-root /mlartifacts
      --host 0.0.0.0
      --port 5000"
    ports:
      - "5000:5000"
    volumes:
      - mlflow-artifacts:/mlartifacts

  training:
    build: .
    environment:
      MLFLOW_TRACKING_URI: http://mlflow-server:5000
    depends_on:
      - mlflow-server

volumes:
  mlflow-db-data:
  mlflow-artifacts:
```

For service dependency and health check patterns in this Docker Compose setup, see docker-compose depends_on not working.
Initialize the database schema on first run:
```bash
mlflow db upgrade postgresql://mlflow_user:password@db-host:5432/mlflow
```

Without running db upgrade, the server fails to start with a schema error.
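Once the server is up, training clients can probe it before logging anything — the tracking server exposes a /health endpoint. A minimal stdlib sketch; the function name `server_reachable` is ours:

```python
import urllib.error
import urllib.request


def server_reachable(tracking_uri: str, timeout: float = 5.0) -> bool:
    # The MLflow tracking server answers GET /health with 200 OK
    try:
        url = f"{tracking_uri.rstrip('/')}/health"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


print(server_reachable("http://localhost:5000"))
```

Running this check at startup turns a confusing "Max retries exceeded" mid-training failure into an immediate, obvious one.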
Fix 7: Version Compatibility Issues
MLflow has fast release cycles. Mixing MLflow versions between the training environment and the serving/loading environment causes compatibility errors:
```text
MlflowException: The model's mlflow version (2.8.0) is incompatible with the current version (2.12.0).
mlflow.exceptions.MlflowException: Unsupported format for model artifacts.
```

Check versions across your environments:
```bash
# Training environment
python -c "import mlflow; print(mlflow.__version__)"

# Serving environment
mlflow --version
```

Pin MLflow in requirements.txt to avoid drift:
```text
mlflow==2.14.0
mlflow-skinny==2.14.0  # Lighter install without server dependencies
```

mlflow-skinny is the client-only package — it doesn’t install the server, UI, or heavyweight dependencies. Use it in training environments and containers where you only need to log, not serve:
```bash
# Training container — only needs to log metrics
pip install mlflow-skinny boto3

# Serving container — needs full MLflow
pip install mlflow
```

MLflow model format evolution — models logged with older MLflow versions can usually be loaded by newer versions, but not vice versa. If you must load an old model with new MLflow, check the MLmodel file in the artifact:
```bash
# Inspect the model metadata
cat mlartifacts/0/run_id/artifacts/model/MLmodel
# Shows: mlflow_version: 2.8.0, flavors, signature, etc.
```

Fix 8: Querying Runs Programmatically
The MLflow UI is useful for exploration, but production workflows need to query run history programmatically.
```python
import mlflow
from mlflow import MlflowClient

client = MlflowClient(tracking_uri="http://localhost:5000")

# Search runs in an experiment
experiment = client.get_experiment_by_name("fraud_detection_v2")
runs = client.search_runs(
    experiment_ids=[experiment.experiment_id],
    filter_string="metrics.auc > 0.85 AND params.model_type = 'xgboost'",
    order_by=["metrics.auc DESC"],
    max_results=10,
)

for run in runs:
    print(f"Run {run.info.run_id}: AUC={run.data.metrics['auc']:.4f}")

# Get the best run (results are already sorted by AUC descending)
best_run = runs[0]
print(f"Best run ID: {best_run.info.run_id}")
print(f"Best AUC: {best_run.data.metrics['auc']}")
print(f"Params: {best_run.data.params}")

# Load the best model
best_model = mlflow.sklearn.load_model(f"runs:/{best_run.info.run_id}/model")
```

Download artifacts from a run:
```python
from mlflow import MlflowClient

client = MlflowClient()

# List artifacts in a run
artifacts = client.list_artifacts(run_id, path="model")
for artifact in artifacts:
    print(artifact.path, artifact.is_dir, artifact.file_size)

# Download to a local path
local_path = client.download_artifacts(
    run_id=run_id,
    path="model",
    dst_path="/tmp/downloaded_model",
)
```

Compare runs with pandas:
```python
import mlflow

# Export all runs to a pandas DataFrame for analysis
df = mlflow.search_runs(
    experiment_names=["fraud_detection_v2"],
    filter_string="status = 'FINISHED'",
)

# Column names: params.*, metrics.*, tags.*, start_time, run_id, etc.
print(
    df[["params.n_estimators", "metrics.auc", "metrics.f1_score"]]
    .sort_values("metrics.auc", ascending=False)
)
```

Still Not Working?
MLflow with Remote Tracking and Local Artifacts
If your training runs on a cloud VM but you want artifacts available locally for debugging, log them to the server’s artifact store as usual, then download them after the run:
```python
import mlflow
from mlflow import MlflowClient

with mlflow.start_run() as run:
    # Artifact is stored at the server's artifact root by default
    mlflow.log_artifact("model.pkl", artifact_path="model")

# Download the artifact after the run
client = MlflowClient()
client.download_artifacts(run.info.run_id, "model", "/local/path")
```

Integration with Training Frameworks
For PyTorch training loop patterns that work well with MLflow logging (especially mixed precision and gradient accumulation), see PyTorch not working. For scikit-learn pipelines logged as MLflow models, see scikit-learn not working. For experiment notebooks where you prototype before moving to tracked training runs, see Jupyter not working.
MLflow on Kubernetes
The mlflow server command runs a single-process Flask app — not suitable for high-concurrency production workloads. For Kubernetes, use the official Helm chart or a managed service (Databricks Managed MLflow, AWS SageMaker MLflow Tracking). The server should run behind a load balancer with multiple --workers (Gunicorn workers) for parallel request handling.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.