Fix: Optuna Not Working — Trial Pruned, Storage Errors, and Search Space Problems
Quick Answer
How to fix Optuna errors — TrialPruned stops too early, RDB storage locked or not saving, suggest methods raise ValueError, parallel study workers deadlock, integration callbacks not reporting, and best trial not reproducible.
The Error
Your Optuna study runs but prunes every trial:
[I 2025-03-15 14:22:01] Trial 0 pruned.
[I 2025-03-15 14:22:03] Trial 1 pruned.
[I 2025-03-15 14:22:05] Trial 2 pruned.
...
[I 2025-03-15 14:22:45] Trial 24 pruned. 25/25 trials pruned.
Or the study state isn’t saved between runs:
study = optuna.create_study(storage="sqlite:///optuna.db")
# Runs fine, but next time:
optuna.exceptions.DuplicatedStudyError: Another study with name 'no-name-xxx' already exists.
Or suggest_* methods fail with confusing ranges:
ValueError: The value low=1.0 must be less than high=1.0 for 'learning_rate'.
Or you try to run trials in parallel and the database locks up.
Optuna is deceptively simple — study.optimize(objective, n_trials=100) looks like one line, but the objective function, the sampler, the pruner, and the storage layer all interact in ways that produce silent failures or unexpected behavior. This guide covers the root causes.
Why This Happens
Optuna’s trial-based design means every training run is an independent trial managed by a study. The study records all trial parameters and results to a storage backend (in-memory by default, or an RDB like SQLite/PostgreSQL). The sampler (TPE by default) uses completed trial history to pick better parameters. The pruner (MedianPruner by default) kills underperforming trials early.
When the pruner is too aggressive, the sampler has too few completed trials to learn from — and the study degenerates into pruning everything. When the storage is SQLite and multiple workers write concurrently, database locking kills parallelism. When the search space is misconfigured, suggest_* calls fail silently or produce degenerate distributions.
Fix 1: Every Trial Gets Pruned
Trial 0 pruned. Trial 1 pruned. Trial 2 pruned...
The pruner compares each trial’s intermediate values against those of completed trials. If very few trials have completed (because early ones were also pruned), the pruner has no baseline — it prunes based on almost no data.
Fix 1: Warm up the pruner with n_warmup_steps:
import optuna
study = optuna.create_study(
direction="minimize",
pruner=optuna.pruners.MedianPruner(
n_startup_trials=10, # Don't prune the first 10 trials at all
n_warmup_steps=20, # Don't prune before step 20 in any trial
),
)
n_startup_trials=10 lets the first 10 trials complete fully, giving the pruner a baseline. n_warmup_steps=20 skips pruning in the first 20 epochs of each trial, so early noise doesn’t trigger false pruning.
Fix 2: Report intermediate values correctly in the objective function:
import optuna
def objective(trial):
lr = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
n_layers = trial.suggest_int("n_layers", 1, 5)
model = build_model(lr, n_layers)
for epoch in range(100):
train_loss = train_one_epoch(model)
val_loss = validate(model)
# Report the intermediate value — pruner uses this
trial.report(val_loss, epoch)
# Check if this trial should be pruned
if trial.should_prune():
raise optuna.TrialPruned()
return val_loss # Final value
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
Common mistake — reporting training loss instead of validation loss:
# WRONG — training loss always decreases, pruner can't detect overfitting
trial.report(train_loss, epoch)
# CORRECT — validation loss reveals actual model quality
trial.report(val_loss, epoch)
Fix 3: Disable pruning entirely to verify the search space works:
study = optuna.create_study(
direction="minimize",
pruner=optuna.pruners.NopPruner(), # Never prunes — all trials complete
)
study.optimize(objective, n_trials=20)
# If all 20 trials now complete, the issue was pruner configuration
# Re-enable with more generous settings
Fix 2: Storage Errors — SQLite Locking and Study Persistence
DuplicatedStudyError: Another study with name 'no-name-xxx' already exists.
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
Always name your studies — unnamed studies get random IDs that collide:
import optuna
# WRONG — unnamed study, ID collision on restart
study = optuna.create_study(storage="sqlite:///optuna.db")
# CORRECT — named study, survives restarts
study = optuna.create_study(
study_name="xgboost_tuning_v2",
storage="sqlite:///optuna.db",
direction="maximize",
load_if_exists=True, # Resume previous study instead of failing
)
load_if_exists=True is the key — it loads the existing study when the name matches, instead of raising DuplicatedStudyError.
SQLite locks under concurrent access. If you’re running parallel workers, switch to PostgreSQL or MySQL:
# Single worker — SQLite is fine
study = optuna.create_study(
storage="sqlite:///optuna.db",
study_name="experiment_1",
load_if_exists=True,
)
# Multiple workers — use PostgreSQL
study = optuna.create_study(
storage="postgresql://user:password@localhost:5432/optuna",
study_name="experiment_1",
load_if_exists=True,
)
Run parallel workers against the same PostgreSQL-backed study:
# Terminal 1
python optimize.py --study-name experiment_1
# Terminal 2 (same machine or different machine)
python optimize.py --study-name experiment_1
# Both workers share trials via the database
# optimize.py
import optuna
study = optuna.load_study(
study_name="experiment_1",
storage="postgresql://user:pass@host/optuna",
)
study.optimize(objective, n_trials=50) # Each worker runs 50 trials
Delete a study to start fresh:
optuna.delete_study(study_name="experiment_1", storage="sqlite:///optuna.db")
List all studies in a database:
studies = optuna.get_all_study_summaries(storage="sqlite:///optuna.db")
for s in studies:
print(f"{s.study_name}: {s.n_trials} trials, best={s.best_trial.value if s.best_trial else 'N/A'}")
Fix 3: suggest_* Errors — Search Space Configuration
ValueError: The value low=1.0 must be less than high=1.0 for 'learning_rate'.
ValueError: Cannot suggest a value for parameter 'n_layers' with type int
when the parameter has been suggested with type float.
Low must be strictly less than high:
# WRONG
lr = trial.suggest_float("lr", 0.1, 0.1) # low == high → error
# CORRECT
lr = trial.suggest_float("lr", 0.01, 0.1)
Log-scale for parameters that span orders of magnitude:
# WRONG — uniform sampling across 0.0001 to 0.1 puts ~90% of samples in the top decade [0.01, 0.1]
lr = trial.suggest_float("learning_rate", 1e-4, 1e-1)
# CORRECT — log-uniform samples evenly across orders of magnitude
lr = trial.suggest_float("learning_rate", 1e-4, 1e-1, log=True)
Consistent parameter types across trials — once a parameter name is used with a type, it can’t change:
# Trial 1: suggest_int
n_layers = trial.suggest_int("n_layers", 1, 5)
# Trial 2: suggest_float with same name — ERROR
n_layers = trial.suggest_float("n_layers", 1.0, 5.0) # Type conflict!
Conditional parameters — parameters that only exist for certain configurations:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from xgboost import XGBClassifier

def objective(trial):
model_type = trial.suggest_categorical("model", ["svm", "rf", "xgb"])
if model_type == "svm":
C = trial.suggest_float("svm_C", 1e-3, 1e3, log=True)
kernel = trial.suggest_categorical("svm_kernel", ["rbf", "linear"])
model = SVC(C=C, kernel=kernel)
elif model_type == "rf":
n_estimators = trial.suggest_int("rf_n_estimators", 50, 500)
max_depth = trial.suggest_int("rf_max_depth", 3, 15)
model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth)
elif model_type == "xgb":
lr = trial.suggest_float("xgb_lr", 0.01, 0.3, log=True)
n_estimators = trial.suggest_int("xgb_n_estimators", 50, 1000)
model = XGBClassifier(learning_rate=lr, n_estimators=n_estimators)
# Prefix parameter names per model to avoid type conflicts
score = cross_val_score(model, X, y, cv=5).mean()
return score
Pro Tip: Always prefix conditional parameter names with the model type (e.g., svm_C, rf_max_depth). This prevents type conflicts and makes the parameter importance visualization readable.
Fix 4: Framework Integration — XGBoost, LightGBM, PyTorch
Optuna provides built-in integration callbacks for popular ML frameworks (in recent Optuna releases these ship in the separate optuna-integration package: pip install optuna-integration). These handle pruning automatically — no manual trial.report() and trial.should_prune() needed.
XGBoost integration:
import optuna
import xgboost as xgb
from sklearn.metrics import log_loss
def objective(trial):
params = {
'objective': 'binary:logistic',
'eval_metric': 'logloss',
'max_depth': trial.suggest_int('max_depth', 3, 10),
'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
'subsample': trial.suggest_float('subsample', 0.5, 1.0),
'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
'device': 'cpu',
}
dtrain = xgb.DMatrix(X_train, label=y_train)
dval = xgb.DMatrix(X_val, label=y_val)
# Optuna callback handles pruning automatically
model = xgb.train(
params, dtrain,
num_boost_round=1000,
evals=[(dval, 'validation')],
callbacks=[
optuna.integration.XGBoostPruningCallback(trial, 'validation-logloss'),
],
verbose_eval=False,
)
preds = model.predict(dval)
return log_loss(y_val, preds)
LightGBM integration:
import optuna
import lightgbm as lgb
def objective(trial):
params = {
'objective': 'binary',
'metric': 'binary_logloss',
'num_leaves': trial.suggest_int('num_leaves', 15, 127),
'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
'subsample': trial.suggest_float('subsample', 0.5, 1.0),
'verbose': -1,
}
dtrain = lgb.Dataset(X_train, label=y_train)
dval = lgb.Dataset(X_val, label=y_val, reference=dtrain)
model = lgb.train(
params, dtrain,
num_boost_round=1000,
valid_sets=[dval],
callbacks=[
optuna.integration.LightGBMPruningCallback(trial, 'binary_logloss'),
lgb.early_stopping(50),
lgb.log_evaluation(-1),
],
)
return model.best_score['valid_0']['binary_logloss']
PyTorch integration:
import optuna
import torch
import torch.nn as nn
def objective(trial):
lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
hidden_size = trial.suggest_int("hidden_size", 32, 256)
dropout = trial.suggest_float("dropout", 0.1, 0.5)
model = nn.Sequential(
nn.Linear(input_dim, hidden_size),
nn.ReLU(),
nn.Dropout(dropout),
nn.Linear(hidden_size, num_classes),
)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
for epoch in range(100):
train(model, optimizer, train_loader)
val_loss = evaluate(model, val_loader)
trial.report(val_loss, epoch)
if trial.should_prune():
raise optuna.TrialPruned()
return val_loss
For PyTorch-specific training issues like CUDA OOM and gradient clipping that interact with Optuna’s trial-level training, see PyTorch not working. For XGBoost training patterns, see XGBoost not working. For LightGBM’s num_leaves tuning and early stopping callback changes, see LightGBM not working.
Fix 5: Best Trial Not Reproducible
You find optimal parameters but re-training with them gives different results.
Set random seeds everywhere:
import optuna
import numpy as np
import random
def objective(trial):
seed = 42 # Fixed seed for reproducibility
np.random.seed(seed)
random.seed(seed)
lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
model = train_model(lr=lr, seed=seed)
return evaluate(model)
# Use a fixed sampler seed
study = optuna.create_study(
direction="minimize",
sampler=optuna.samplers.TPESampler(seed=42),
)
study.optimize(objective, n_trials=100)
# Reproduce the best trial
best = study.best_trial
print(f"Best params: {best.params}")
print(f"Best value: {best.value}")
# Re-train with best params
final_model = train_model(**best.params, seed=42)
Extract and use best parameters:
# Get the best trial's parameters
best_params = study.best_params
print(best_params)
# {'learning_rate': 0.0342, 'max_depth': 7, 'subsample': 0.85}
# Re-train with best params
# In recent XGBoost versions, early stopping is configured on the estimator itself
model = XGBClassifier(**best_params, n_estimators=1000, early_stopping_rounds=50)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
Fix 6: Visualization and Analysis
Optuna’s built-in visualization requires plotly:
pip install optuna[visualization]
# or
pip install plotly
import optuna.visualization as vis
# Optimization history
fig = vis.plot_optimization_history(study)
fig.show()
# Parameter importance (which params matter most)
fig = vis.plot_param_importances(study)
fig.show()
# Parameter relationships (contour plot)
fig = vis.plot_contour(study, params=["learning_rate", "max_depth"])
fig.show()
# Parallel coordinate plot
fig = vis.plot_parallel_coordinate(study)
fig.show()
# Slice plot — one parameter vs objective value
fig = vis.plot_slice(study, params=["learning_rate"])
fig.show()
Export study results to DataFrame:
import optuna
df = study.trials_dataframe()
print(df[['number', 'value', 'params_learning_rate', 'params_max_depth', 'state']]
.sort_values('value')
.head(10))
Common Mistake: Running plot_param_importances with fewer than 10 completed trials. The importance calculation needs enough data to be meaningful — run at least 20–50 trials before analyzing parameter importance.
Fix 7: Custom Samplers and Multi-Objective Optimization
Multi-objective optimization — optimize multiple metrics simultaneously:
import optuna
def objective(trial):
lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
model = train_model(lr=lr)
accuracy = evaluate_accuracy(model)
latency = evaluate_latency(model)
return accuracy, latency # Return tuple of objectives
study = optuna.create_study(
directions=["maximize", "minimize"], # Maximize accuracy, minimize latency
sampler=optuna.samplers.NSGAIISampler(seed=42),
)
study.optimize(objective, n_trials=100)
# Get the Pareto front — all non-dominated trials
pareto_trials = study.best_trials
for trial in pareto_trials:
print(f"Accuracy: {trial.values[0]:.4f}, Latency: {trial.values[1]:.2f}ms")
print(f"Params: {trial.params}")
Grid search (exhaustive, not smart sampling):
search_space = {
"max_depth": [3, 5, 7, 9],
"learning_rate": [0.01, 0.05, 0.1],
}
study = optuna.create_study(
direction="minimize",
sampler=optuna.samplers.GridSampler(search_space),
)
study.optimize(objective, n_trials=12) # 4 × 3 = 12 combinations; GridSampler stops once the grid is exhausted
Still Not Working?
Optuna Dashboard (Web UI)
pip install optuna-dashboard
# Launch dashboard pointing at your study database
optuna-dashboard sqlite:///optuna.db
# Opens http://localhost:8080 with interactive plots
Integration with MLflow
Log Optuna trials as MLflow runs for persistent tracking:
import optuna
import mlflow
def objective(trial):
with mlflow.start_run(nested=True):
lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
mlflow.log_param("lr", lr)
model = train_model(lr=lr)
score = evaluate(model)
mlflow.log_metric("score", score)
return score
with mlflow.start_run():
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
mlflow.log_params(study.best_params)
Timeout and Trial Limits
# Stop after 1 hour regardless of trial count
study.optimize(objective, timeout=3600)
# Stop after 100 trials
study.optimize(objective, n_trials=100)
# Stop when the objective reaches a target — use a callback that calls study.stop()
def stop_on_target(study, trial):
    if trial.value is not None and trial.value < 0.01:
        study.stop()
study.optimize(objective, n_trials=500, callbacks=[stop_on_target])
sklearn Integration
For quick hyperparameter tuning of scikit-learn models without writing an objective function, use OptunaSearchCV:
import optuna
from optuna.integration import OptunaSearchCV
from sklearn.ensemble import RandomForestClassifier
clf = OptunaSearchCV(
RandomForestClassifier(),
{
'n_estimators': optuna.distributions.IntDistribution(50, 500),
'max_depth': optuna.distributions.IntDistribution(3, 15),
},
cv=5,
n_trials=50,
scoring='roc_auc',
)
clf.fit(X_train, y_train)
print(clf.best_params_)
For scikit-learn Pipeline and cross-validation patterns, see scikit-learn not working.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.