Fix: MySQL Slow Query — Diagnosis and Optimization with EXPLAIN

FixDevs

Quick Answer

Enable the slow query log to find the worst offenders, read EXPLAIN output to see how MySQL executes them, then fix the root cause: add missing indexes, batch N+1 query patterns, and optimize JOINs, ORDER BY, and pagination.

The Problem

A MySQL query runs much slower than expected:

SELECT * FROM orders WHERE customer_id = 42;
-- Takes 12 seconds on a table with 5 million rows

Or an API endpoint shows the database query is the bottleneck:

GET /api/orders?customer=42  — 12,340ms
  DB query: SELECT * FROM orders WHERE customer_id = 42  — 11,980ms

Or a query was fast in development but slow in production with real data:

Works fine locally (100 rows) → 15 seconds in production (5M rows)

Why This Happens

Slow MySQL queries almost always come down to a few root causes:

  • Full table scan (no index) — MySQL reads every row to find matching records. A table with 5 million rows scanned for a single value is inherently slow.
  • Wrong index or no index on JOIN/WHERE columns — a column appears in a WHERE, JOIN ON, or ORDER BY clause but has no index, forcing a sequential scan.
  • N+1 query pattern — the application makes one query to fetch N records, then N more queries to fetch related data for each, resulting in hundreds or thousands of round trips.
  • Selecting more data than needed — SELECT * fetches all columns, including large TEXT/BLOB columns the query never uses. Fetching unnecessary columns increases I/O.
  • Filesort on large result sets — ORDER BY on a column without an index forces MySQL to sort in memory (or on disk), which is slow for large datasets.
  • Outdated table statistics — MySQL’s query optimizer uses statistics to choose query plans. Stale statistics lead to poor plan choices like full table scans when an index would be faster.
  • Implicit type conversion — comparing a VARCHAR column to an integer causes MySQL to cast every value and ignore the index.

Step 1: Enable and Check the Slow Query Log

The slow query log records queries that exceed a time threshold. Enable it to find the worst offenders:

-- Check current slow query log settings
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';

-- Enable slow query log (global settings; long_query_time applies to new connections)
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- Log queries taking > 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

-- Also log queries that don't use indexes
SET GLOBAL log_queries_not_using_indexes = 'ON';

Or enable permanently in my.cnf:

[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1

Analyze the slow query log with mysqldumpslow:

# Top 10 slowest queries
mysqldumpslow -s t -t 10 /var/log/mysql/slow.log

# Top 10 most frequent slow queries
mysqldumpslow -s c -t 10 /var/log/mysql/slow.log

# Top 10 queries by rows sent
mysqldumpslow -s r -t 10 /var/log/mysql/slow.log

Use pt-query-digest (Percona Toolkit) for richer analysis:

pt-query-digest /var/log/mysql/slow.log | head -100
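If neither tool is installed, a rough aggregation can be scripted by hand. The sketch below parses the standard slow-log text format and groups entries by a crudely normalized query shape (numbers and quoted strings replaced with ?); it is a stand-in for mysqldumpslow, not a replacement.

```python
# Group slow-log entries by normalized query shape and total their Query_time.
# Assumes the standard MySQL slow-log text format; normalization is crude.
import re
from collections import defaultdict

def summarize_slow_log(text):
    totals = defaultdict(lambda: {"calls": 0, "total_time": 0.0})
    query_time = None
    for line in text.splitlines():
        m = re.match(r"# Query_time:\s+([\d.]+)", line)
        if m:
            query_time = float(m.group(1))
        elif query_time is not None and line and not line.startswith("#"):
            if line.upper().startswith("SET TIMESTAMP"):
                continue  # metadata line emitted before each statement
            # Replace literals so identical queries with different values group
            shape = re.sub(r"'[^']*'|\b\d+\b", "?", line.strip())
            totals[shape]["calls"] += 1
            totals[shape]["total_time"] += query_time
            query_time = None
    # Worst offenders first, like `mysqldumpslow -s t`
    return sorted(totals.items(), key=lambda kv: -kv[1]["total_time"])

sample = """\
# Query_time: 12.0  Lock_time: 0.0 Rows_sent: 1  Rows_examined: 4987234
SET timestamp=1696161600;
SELECT * FROM orders WHERE customer_id = 42;
# Query_time: 11.5  Lock_time: 0.0 Rows_sent: 3  Rows_examined: 4987234
SET timestamp=1696161700;
SELECT * FROM orders WHERE customer_id = 99;
"""
for shape, stats in summarize_slow_log(sample):
    print(shape, stats["calls"], round(stats["total_time"], 1))
```

Both sample entries collapse into one shape (`customer_id = ?`), which is exactly the grouping mysqldumpslow performs.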

Fix 1: Read EXPLAIN Output

EXPLAIN shows how MySQL executes a query — which indexes it uses, how many rows it scans, and which join type it applies:

EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

Sample output (bad — full table scan):

+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | orders | ALL  | NULL          | NULL | NULL    | NULL | 4987234 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+---------+-------------+

Key columns to interpret:

  • type — ALL = full table scan (bad). ref, eq_ref, range = index used (good). const = single-row lookup (best).
  • key — the index MySQL chose. NULL means no index was used.
  • rows — estimated rows scanned. A high count with few result rows signals inefficiency.
  • Extra — Using filesort = slow sort without an index. Using temporary = temp table created. Using index = index-only scan (fast).
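These checks are mechanical enough to script. A small, illustrative helper that takes one EXPLAIN row as a dict (keys as MySQL returns them) and flags the common red flags; the messages and thresholds are invented, not official MySQL guidance:

```python
# Flag the common warning signs in a single EXPLAIN row.
def explain_warnings(row):
    warnings = []
    if row.get("type") == "ALL":
        warnings.append("full table scan (type=ALL)")
    if row.get("key") is None:
        warnings.append("no index used (key=NULL)")
    extra = row.get("Extra") or ""
    if "Using filesort" in extra:
        warnings.append("filesort: sorting without an index")
    if "Using temporary" in extra:
        warnings.append("temporary table created")
    return warnings

# The two EXPLAIN results shown in this section, as dicts
bad = {"type": "ALL", "key": None, "rows": 4987234, "Extra": "Using where"}
good = {"type": "ref", "key": "idx_orders_customer_id", "rows": 150, "Extra": None}
print(explain_warnings(bad))   # two warnings: full scan, no index
print(explain_warnings(good))  # no warnings
```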

After adding an index:

CREATE INDEX idx_orders_customer_id ON orders (customer_id);
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
+----+-------------+--------+------+------------------------+------------------------+---------+-------+------+-------+
| id | select_type | table  | type | possible_keys          | key                    | key_len | ref   | rows | Extra |
+----+-------------+--------+------+------------------------+------------------------+---------+-------+------+-------+
|  1 | SIMPLE      | orders | ref  | idx_orders_customer_id | idx_orders_customer_id | 4       | const |  150 | NULL  |
+----+-------------+--------+------+------------------------+------------------------+---------+-------+------+-------+

type: ref and rows: 150 — MySQL now scans 150 rows instead of 5 million.

Use EXPLAIN FORMAT=JSON for detailed cost estimates:

EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE customer_id = 42\G

Use EXPLAIN ANALYZE (MySQL 8.0+) for actual execution times alongside estimates:

EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- Shows actual rows, actual loops, and actual time per operation

Fix 2: Add the Right Indexes

Different query patterns need different index strategies:

Equality lookup — single column:

-- Slow: full scan
SELECT * FROM users WHERE email = 'user@example.com';

-- Add index on the filter column
CREATE INDEX idx_users_email ON users (email);
-- Or for unique values:
CREATE UNIQUE INDEX idx_users_email ON users (email);
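The before/after effect is easy to reproduce in miniature. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the plan vocabulary differs (SCAN vs SEARCH rather than ALL vs ref), but the index effect is the same. Table and data are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
con.executemany("INSERT INTO users (email) VALUES (?)",
                [(f"user{i}@example.com",) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows end with a human-readable "detail" column
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM users WHERE email = 'user500@example.com'"
before = plan(query)  # full scan: the plan detail says SCAN
con.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query)   # index lookup: the plan detail says SEARCH ... USING INDEX
print(before)
print(after)
```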

Composite index — multiple filter columns:

-- Query filters on status AND created_at
SELECT * FROM orders WHERE status = 'pending' AND created_at > NOW() - INTERVAL 7 DAY;

-- Order: equality columns first, range columns last
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
-- MySQL can use this index for both the equality filter and the range filter

Common Mistake: Putting the range column first in a composite index. MySQL can use a composite index for prefix columns, but stops using it at the first range condition. (status, created_at) works for WHERE status = 'x' AND created_at > y. (created_at, status) only uses the index for created_at, not the equality filter on status.

Covering index — all needed columns in the index:

-- Query only needs id, status, total
SELECT id, status, total FROM orders WHERE customer_id = 42;

-- Covering index includes all selected columns
CREATE INDEX idx_orders_customer_covering ON orders (customer_id, status, total);
-- EXPLAIN Extra: "Using index" — no table row access at all

Prefix index (for long TEXT/VARCHAR columns; MySQL's term — Postgres "partial indexes" are a different feature):

-- Index only the first 20 characters of a long URL column
CREATE INDEX idx_posts_url ON posts (url(20));
-- Useful when full-column indexing is too large

Remove redundant indexes. Multiple overlapping indexes waste write performance:

-- Find duplicate/redundant indexes
SELECT * FROM sys.schema_redundant_indexes;

-- Find unused indexes (sys schema, MySQL 5.7+; requires performance_schema)
SELECT * FROM sys.schema_unused_indexes;

Fix 3: Fix N+1 Query Problems

N+1 is the most common application-level performance issue. It’s invisible in EXPLAIN because each individual query is fast — the problem is volume:

# N+1 in Python — 1 query + 100 separate queries = 101 total
customers = db.execute("SELECT * FROM customers LIMIT 100").fetchall()
for customer in customers:
    orders = db.execute(
        "SELECT * FROM orders WHERE customer_id = %s", (customer['id'],)
    ).fetchall()
    customer['orders'] = orders

Fix — batch the second query:

# 2 queries total — fetch all related records at once
customers = db.execute("SELECT * FROM customers LIMIT 100").fetchall()
customer_ids = [c['id'] for c in customers]

# Use IN to fetch all orders in one query
placeholders = ','.join(['%s'] * len(customer_ids))
orders = db.execute(
    f"SELECT * FROM orders WHERE customer_id IN ({placeholders})",
    customer_ids
).fetchall()

# Group orders by customer_id in Python
from collections import defaultdict
orders_by_customer = defaultdict(list)
for order in orders:
    orders_by_customer[order['customer_id']].append(order)

for customer in customers:
    customer['orders'] = orders_by_customer[customer['id']]

Or use a JOIN:

SELECT c.id, c.name, o.id AS order_id, o.total, o.status
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
LIMIT 100;

In SQLAlchemy:

# N+1 — accessing relationship without eager loading
customers = session.query(Customer).limit(100).all()
for c in customers:
    print(c.orders)  # Each access fires a new query

# Fix — eager load with joinedload
from sqlalchemy.orm import joinedload
customers = (
    session.query(Customer)
    .options(joinedload(Customer.orders))
    .limit(100)
    .all()
)
# One query with JOIN — no extra queries when accessing c.orders

Detect N+1 with query logging. Enable MySQL query log and look for repeated queries with different parameter values:

SET GLOBAL general_log = 'ON';
SET GLOBAL general_log_file = '/var/log/mysql/general.log';
-- Run your application, then check the log for repeated patterns
SET GLOBAL general_log = 'OFF';
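The round-trip difference can also be demonstrated offline. A sqlite3 sketch (schema invented, statements counted via a small wrapper) that issues both versions of the access pattern:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
con.executemany("INSERT INTO customers (id) VALUES (?)", [(i,) for i in range(100)])
con.executemany("INSERT INTO orders (customer_id) VALUES (?)",
                [(i % 100,) for i in range(300)])

query_count = 0
def run(sql, params=()):
    """Execute a statement and count it, standing in for a DB round trip."""
    global query_count
    query_count += 1
    return con.execute(sql, params).fetchall()

# N+1: one query for the customers, then one more per customer
query_count = 0
customers = run("SELECT * FROM customers")
for c in customers:
    run("SELECT * FROM orders WHERE customer_id = ?", (c["id"],))
n_plus_1 = query_count
print("N+1 queries issued:", n_plus_1)        # 101

# Batched: two queries total, using IN (...)
query_count = 0
customers = run("SELECT * FROM customers")
ids = [c["id"] for c in customers]
placeholders = ",".join("?" * len(ids))
run(f"SELECT * FROM orders WHERE customer_id IN ({placeholders})", ids)
batched = query_count
print("batched queries issued:", batched)     # 2
```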

Fix 4: Fix Slow ORDER BY and GROUP BY

Sorting without an index causes a filesort — MySQL sorts all matching rows in memory or on disk:

EXPLAIN SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;
-- Extra: Using filesort  ← slow for large tables

Add an index that matches the sort order:

CREATE INDEX idx_posts_created_at ON posts (created_at DESC);
-- Descending indexes are honored in MySQL 8.0+; older versions parse DESC
-- but build an ascending index (which MySQL can still scan in reverse)

EXPLAIN SELECT * FROM posts ORDER BY created_at DESC LIMIT 20;
-- Extra: NULL  ← index used for ordering, no filesort
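SQLite reports the same condition as USE TEMP B-TREE FOR ORDER BY, which makes it a convenient sandbox for checking whether an index removes the sort step. A small sketch with an invented posts table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, created_at TEXT)")
con.executemany("INSERT INTO posts (created_at) VALUES (?)",
                [(f"2024-01-{(i % 28) + 1:02d}",) for i in range(500)])

def plan(sql):
    # Collect the human-readable plan details into one string
    return " ".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM posts ORDER BY created_at DESC LIMIT 20"
before = plan(query)  # separate sort step: USE TEMP B-TREE FOR ORDER BY
con.execute("CREATE INDEX idx_posts_created_at ON posts (created_at DESC)")
after = plan(query)   # the index delivers rows already ordered; no temp b-tree
print(before)
print(after)
```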

For filtered + sorted queries, the index must cover both:

-- Query: filter on status, sort by created_at
SELECT * FROM posts WHERE status = 'published' ORDER BY created_at DESC LIMIT 20;

-- Index must include both columns in the right order
CREATE INDEX idx_posts_status_created ON posts (status, created_at DESC);
-- Now MySQL uses the index for both the WHERE and ORDER BY

Avoid ORDER BY RAND() — it’s extremely slow on large tables because it assigns a random value to every row, then sorts all of them:

-- SLOW — scans entire table, assigns random value, sorts
SELECT * FROM posts ORDER BY RAND() LIMIT 5;

-- FASTER — jumps to a random starting id via the primary key index
-- Caveat: returns consecutive rows from a random point, and gaps in id
-- values bias the sampling; fine for "show something random", not statistics
SELECT * FROM posts
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM posts)))
ORDER BY id
LIMIT 5;

For GROUP BY queries, ensure the grouped column is indexed:

-- Slow GROUP BY — no index on category
SELECT category, COUNT(*) FROM products GROUP BY category;

-- Add index
CREATE INDEX idx_products_category ON products (category);

Fix 5: Optimize Pagination — Avoid Large OFFSET

OFFSET for pagination forces MySQL to read and discard all rows before the offset position:

-- SLOW — must read and discard 100,000 rows
SELECT * FROM posts ORDER BY id DESC LIMIT 20 OFFSET 100000;

Use keyset (cursor-based) pagination instead:

-- First page
SELECT * FROM posts ORDER BY id DESC LIMIT 20;
-- Returns rows with IDs e.g. 5000 down to 4981

-- Next page — use the last seen ID as cursor
SELECT * FROM posts WHERE id < 4981 ORDER BY id DESC LIMIT 20;
-- MySQL uses the index on id — no rows discarded

For multi-column sort, include all sort columns in the cursor:

-- Sort by created_at DESC, then id DESC for tie-breaking
SELECT * FROM posts
WHERE (created_at, id) < ($last_created_at, $last_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Index: (created_at DESC, id DESC)
CREATE INDEX idx_posts_cursor ON posts (created_at DESC, id DESC);
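A toy version of keyset pagination, using sqlite3 and an invented posts table, shows the cursor mechanics end to end:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO posts (id) VALUES (?)", [(i,) for i in range(1, 101)])

def fetch_page(last_id=None, size=20):
    # The WHERE clause jumps straight to the cursor position via the
    # primary-key index; no rows are read and discarded as with OFFSET.
    if last_id is None:
        rows = con.execute("SELECT id FROM posts ORDER BY id DESC LIMIT ?",
                           (size,))
    else:
        rows = con.execute(
            "SELECT id FROM posts WHERE id < ? ORDER BY id DESC LIMIT ?",
            (last_id, size))
    return [r[0] for r in rows]

page1 = fetch_page()                   # ids 100 down to 81
page2 = fetch_page(last_id=page1[-1])  # ids 80 down to 61
print(page1[0], page1[-1], page2[0], page2[-1])
```

The caller passes the last id it saw as the cursor for the next page, so every page costs the same regardless of depth.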

Real-world scenario: An e-commerce site with 500,000 products used OFFSET pagination. Page 1000 (OFFSET 20000) took 2 seconds. Switching to keyset pagination reduced it to 8ms — the index jump is instantaneous regardless of page depth.

Fix 6: Fix Implicit Type Conversion

When the data type of a query parameter doesn’t match the column type, MySQL casts every row and ignores the index:

-- Column: user_id VARCHAR(20), indexed
-- Query passes an integer — type mismatch!
SELECT * FROM sessions WHERE user_id = 12345;
-- EXPLAIN type: ALL — index ignored due to type conversion

-- Fix: match the column type
SELECT * FROM sessions WHERE user_id = '12345';
-- EXPLAIN type: ref — index used

Check column types before writing queries:

DESCRIBE sessions;
-- user_id: varchar(20)
-- → Always pass strings for varchar columns

With ORMs, ensure parameter types match:

# SQLAlchemy — Python int vs VARCHAR column
# Wrong
session.query(Session).filter(Session.user_id == 12345)

# Correct
session.query(Session).filter(Session.user_id == str(12345))
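A defensive application-side option is to validate parameter types before the query runs. The schema map and helper below are invented for illustration; in practice the map would mirror DESCRIBE output (varchar columns expect str, integer columns expect int):

```python
# Hand-maintained map of column -> expected Python type, mirroring the
# database schema. Purely illustrative.
SESSIONS_SCHEMA = {"user_id": str, "created_at": str, "hits": int}

def check_param_types(schema, params):
    """Return (column, actual_type) pairs for parameters whose Python type
    doesn't match the column type — catching implicit-conversion bugs early."""
    return [(col, type(val).__name__)
            for col, val in params.items()
            if col in schema and not isinstance(val, schema[col])]

# user_id is VARCHAR in the database, so an int here would defeat the index
print(check_param_types(SESSIONS_SCHEMA, {"user_id": 12345}))    # mismatch
print(check_param_types(SESSIONS_SCHEMA, {"user_id": "12345"}))  # clean
```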

Fix 7: Use performance_schema for Real-Time Monitoring

MySQL’s performance_schema shows which queries are consuming the most resources right now:

-- Find the slowest queries currently running
SELECT * FROM information_schema.PROCESSLIST
WHERE TIME > 5
ORDER BY TIME DESC;

-- Kill a specific slow query
KILL QUERY <process_id>;

-- Find queries with most total execution time (performance_schema)
SELECT
  SCHEMA_NAME,
  DIGEST_TEXT,
  COUNT_STAR AS calls,
  ROUND(SUM_TIMER_WAIT / 1e12, 2) AS total_seconds,
  ROUND(AVG_TIMER_WAIT / 1e9, 2) AS avg_ms
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 20;
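The divisors in the query above come from performance_schema reporting every timer column in picoseconds. A tiny helper makes the unit conversion explicit (the sample digest values are invented):

```python
PICOSECONDS_PER_SECOND = 1_000_000_000_000  # performance_schema timers are in ps
PICOSECONDS_PER_MS = 1_000_000_000

def ps_to_seconds(ps):
    return ps / PICOSECONDS_PER_SECOND

def ps_to_ms(ps):
    return ps / PICOSECONDS_PER_MS

# e.g. a digest row with SUM_TIMER_WAIT = 7.5e12 ps over COUNT_STAR = 300 calls
total_s = ps_to_seconds(7.5e12)   # 7.5 seconds total
avg_ms = ps_to_ms(7.5e12 / 300)   # 25.0 ms per call
print(total_s, avg_ms)
```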

Check table I/O hotspots:

SELECT
  OBJECT_SCHEMA,
  OBJECT_NAME,
  COUNT_READ,
  COUNT_WRITE,
  ROUND(SUM_TIMER_READ / 1e12, 2) AS read_seconds,
  ROUND(SUM_TIMER_WRITE / 1e12, 2) AS write_seconds
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_READ + SUM_TIMER_WRITE DESC
LIMIT 10;

Use sys schema shortcuts (MySQL 5.7+):

-- Top 10 slowest statements
SELECT * FROM sys.statement_analysis ORDER BY avg_latency DESC LIMIT 10;

-- Tables with most full table scans
SELECT * FROM sys.schema_tables_with_full_table_scans LIMIT 10;

-- Index usage stats
SELECT * FROM sys.schema_index_statistics ORDER BY rows_selected DESC LIMIT 20;

Still Not Working?

Force MySQL to use a specific index to test whether the index is actually helping:

-- Force index use
SELECT * FROM orders USE INDEX (idx_orders_customer_id) WHERE customer_id = 42;

-- Ignore a specific index (test without it)
SELECT * FROM orders IGNORE INDEX (idx_orders_customer_id) WHERE customer_id = 42;

Update table statistics if the query planner makes poor choices:

ANALYZE TABLE orders;
-- Forces MySQL to recompute statistics for better plan decisions

Check if the table needs optimization after many deletes/updates (fragmentation):

OPTIMIZE TABLE orders;
-- Rebuilds the table, reclaims space, updates statistics
-- Note: locks the table during optimization on MyISAM; InnoDB uses online rebuild

Increase the InnoDB buffer pool for frequently accessed data. If the buffer pool is too small, MySQL reads from disk on every query:

# my.cnf — set to 70-80% of available RAM on a dedicated database server
[mysqld]
innodb_buffer_pool_size = 4G

Check the buffer pool hit rate; it should stay above 99%:

SELECT
  (1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)) * 100
  AS buffer_pool_hit_rate
FROM (
  SELECT
    SUM(VARIABLE_VALUE) AS Innodb_buffer_pool_reads
  FROM performance_schema.global_status
  WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads'
) reads,
(
  SELECT
    SUM(VARIABLE_VALUE) AS Innodb_buffer_pool_read_requests
  FROM performance_schema.global_status
  WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests'
) read_requests;
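The SQL above computes 1 - (disk reads / logical read requests); the same arithmetic in Python, with invented counter values:

```python
def buffer_pool_hit_rate(disk_reads, read_requests):
    """Percentage of logical reads served from the buffer pool.
    Innodb_buffer_pool_reads counts reads that missed the pool and hit disk;
    Innodb_buffer_pool_read_requests counts all logical reads."""
    if read_requests == 0:
        return None
    return (1 - disk_reads / read_requests) * 100

print(buffer_pool_hit_rate(1_000, 500_000))   # 99.8 — healthy
print(buffer_pool_hit_rate(50_000, 500_000))  # 90.0 — pool likely too small
```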

Check for lock contention if queries are fast alone but slow under concurrent load:

-- Show current locks and waiting queries (MySQL 5.7 and earlier; these
-- information_schema tables were removed in 8.0)
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

-- MySQL 8.0+
SELECT * FROM performance_schema.data_locks;
SELECT * FROM performance_schema.data_lock_waits;

For related database issues, see Fix: PostgreSQL Slow Query, Fix: PostgreSQL Deadlock Detected, and Fix: PostgreSQL Max Connections Exceeded.


FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
