MySQL Query Performance: Not Just Indexes

By   | January 30, 2018 |

MySQL Query PerformanceIn this blog post, I’ll look at whether optimizing indexing is always the key to improving MySQL query performance (spoiler, it isn’t).

As we look at MySQL query performance, our first concern is often whether a query is using the right indexes to retrieve the data. This is based on the assumption that finding the data is the most expensive operation – and the one you should focus on for MySQL query optimization. However, this is not always the case.

Let’s look at this query for illustration:

Looking at this query, many might assume the main problem is that this query is doing a full table scan. One could wonder then, “Why does the MySQL optimizer not use index (k)?”  (It is because the clause is not selective enough, by the way.) This thought might cause someone to force using the index, and get even worse performance:

Or someone might extend the index on (k) to (k,g) to be a covering index for this query. This won’t improve performance either:

This wasted effort is all due to focusing on the wrong thing: figuring out how can we find all the rows that match k<1000000  as soon as possible. This is not the problem in this case. In fact, the query that touches all the same columns but doesn’t use GROUP BY runs 10 times as fast:

For this particular query, whether or not it is using the index for lookup should not be the main question. Instead, we should look at how to optimize GROUP BY – which is responsible for some 90% of the query response time.

In my next blog post, I will write about how MySQL performs the GROUP BY operation to provide further help on optimizing these queries.

Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master’s Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Sponsored by Percona

You may also like...