MySQL Visual Explain

Understanding MySQL EXPLAIN Output

It doesn't matter how good a software developer you are. At some point, you and I have written slow queries. Sometimes, you've noticed them being slow immediately, while other times, they were fast enough until your data grew a lot. That's when you had to reach for MySQL's EXPLAIN command to understand what was happening and how to fix it. Because trying to fix a problem without knowing the root cause would be insane. We've never done that before. Riiiight?

But the EXPLAIN output is so complicated and incomprehensible for ordinary mortals. You could try to get an understandable explanation of the EXPLAIN output by your favorite AI. Yet they all have a common problem with database stuff and produce bad output, although they are improving with insane speeds in many areas: They are trained on available content and that is the problem: The existing content they use to train an AI is bad! The MySQL documentation is overly technical and describes everything in a way only understandable to database professionals. Likewise, all blog articles just replicate the technical language of the MySQL documentation without translating it into more developer-friendly descriptions.

The Mechanics

You can prepend the EXPLAIN clause to any SELECT , DELETE , INSERT , REPLACE and UPDATE query and run it. Such a query is not really executed - contrary to common belief! It is an diagnostics command with the primary function to provide insights into how MySQL would execute your (slow) query. So MySQL stops its internal workflow right before accessing your indexes and tables to report information like:

EXPLAIN SELECT * FROM table_name WHERE conditions

Constraints and Limitations

The EXPLAIN command exposes a marvelous treasure of information if you know how to read it. But it is also quite limited in how it does things and what it can do:

The Output

Lets run a simple query and go through the different information of the EXPLAIN output.

EXPLAIN SELECT first_name, last_name
FROM actor
WHERE actor_id IN (
    SELECT actor_id
    FROM film_actor
    WHERE film_id IN (
        SELECT film_id FROM film WHERE title like '%alone%'
    )
);
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE actor ALL PRIMARY 200 100.00
1 SIMPLE <subquery2> eq_ref <auto_distinct_key> <auto_distinct_key> 2 sakila.actor.actor_id 1 100.00
2 MATERIALIZED film index PRIMARY idx_title 514 1000 11.11 Using where; Using index
2 MATERIALIZED film_actor ref PRIMARY,idx_fk_film_id idx_fk_film_id 2 sakila.film.film_id 5 100.00 Using index

id

The id column is a sequential counter of the SELECT used within your query. It starts with one and is incremented for every further SELECT of a subquery. The idea is to link, e.g., a table accessed many times by different subqueries in the EXPLAIN output with the exact position in your query.

However, the value is only correct when compared with the transformed query MySQL uses internally. The one used in the example may look quite similar to this:

/* select#1 */
EXPLAIN SELECT actor.first_name AS first_name, actor.last_name AS last_name
FROM actor
JOIN (
    /* select#2 */
    SELECT DISTINCT actor_id
    FROM film_actor
    JOIN film ON (film.film_id = film_actor.film_id)
    WHERE film.title LIKE '%alone%'
) AS `<subquery2>` USING(actor_id)

So the id value is useless for the intended purpose as the query you've written differs from the one executed. But it also reveals another critical information about the order in which tables are accessed:

select_type

The select_type is always SIMPLE if your query is only using e.g. a few joins and conditions and no subqueries or unions. However, you start seeing different values when using these advanced features. The outermost part of your query will always have the PRIMARY value, and the subqueries will have different values based on how they are executed.

/* PRIMARY */
SELECT *
FROM actor
JOIN film_actor USING(actor_id)
WHERE film_id IN(
    /* SUBQUERY */
    SELECT film_id FROM film WHERE title LIKE '%DINOSAUR%'
)

For subqueries, you can see these select types:

When using unions, you will encounter these select types:

table

The table column is self-explanatory: It contains the name of the table being accessed or the table alias used in your query. But there are also some special values:

partitions

This column is only filled when a table is partitioned into smaller chunks. It lists all the partitions that will be accessed to find the rows matching the query's criteria.

type

The type column is essential as it describes how a table was accessed. You'll find information about whether an index is used and which way it is used. Most of these values are just technical information exported in the EXPLAIN output and are not interesting. However, some of them hint at inefficient index usage:

possible_keys

One of the most important ones: MySQL tells you which indexes it could use to get the rows from the table. Is the index you expected to use missing from this list? The most common reason is that you transformed an indexed column and believe the index will still be used - it won't!

SELECT * FROM invoices WHERE YEAR(created_at) = 2024

You have to create a specialized index for this transformation:

CREATE INDEX year ON invoices ((YEAR(created_at)))

key

The key column tells you which index is used from the possible ones to load the rows - or NULL if none is used. A common thing also listed is automatically generated keys (e.g. <auto_distinct_key>) when MySQL stores a subquery's result in a temporary table and wants to access if fast in further execution steps. And rarely you see multiple keys listed when MySQL decides that one would not be efficient enough (see index_merge access type).

key_len

If you really want to dig deep down into your EXPLAIN output, you can analyze the key_len column: It lists the bytes used from the indexes column definition. For multi-column indexes, you could infer whether all columns in the index had been used to find the rows or the subset of columns used.

ref

Theoretically, the ref column should show the columns or values compared to the indexed columns. But you will most often see const or similar labels there. They are not of any interest to you.

rows

MySQL shares its estimation of rows that will be loaded from the table with the rows column. Remember that this is only an estimate and not the actual value! However, this estimate is also the foundation of MySQL's decision-making process regarding which index to use. With estimates far off the actual value, MySQL may choose the wrong index or even use none.

The displayed number is also not the real estimate. Its a per-step estimate: For e.g. a join the number of loaded rows for each iteration of the loop is reported. So, with multiple joins or subqueries, the value may need to be multiplicated many times.

filtered

The filtered column indicates the percentage of rows loaded from table (rows column) left after executing all filter conditions. So 1900 rows remain when 2000 rows have been loaded and the filter value is 95.00. This value should be near the optimum of 100.00. A more fitting index should be created whenever you see a low filter value.

Extra

The Extra column has many different values (35+) that export more low-level information about the query processing. But these values will most likely not be of any interest to you as they only tell which internal optimizations have been applied—they are meant for database professionals.

Conclusion

You get all the diagnostic information about a slow query with MySQL's EXPLAIN command to create a better index. It is the most important information source and hard to understand at first but it gets easier with more experience.

However, you shouldn't have to learn all these database internals! With mysqlexplain.com, you can get a visual representation of your query that is much easier to understand when you need to optimize a query's performance.