MySQL Query Performance: Not Just Indexes

In this blog post, I’ll look at whether optimizing indexing is always the key to improving MySQL query performance (spoiler, it isn’t).

As we look at MySQL query performance, our first concern is often whether a query is using the right indexes to retrieve the data. This is based on the assumption that finding the data is the most expensive operation – and the one you should focus on for MySQL query optimization. However, this is not always the case.

Let’s look at this query for illustration:

mysql> show create table tbl G

*************************** 1. row ***************************

Table: tbl

Create Table: CREATE TABLE `tbl` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`k` int(11) NOT NULL DEFAULT ‘0’,

`g` int(10) unsigned NOT NULL,

PRIMARY KEY (`id`),

KEY `k_1` (`k`)

) ENGINE=InnoDB AUTO_INCREMENT=2340933 DEFAULT CHARSET=latin1

1 row in set (0.00 sec)

mysql> explain select g,count(*) c from tbl where k<1000000 group by g having c>7 G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: tbl

partitions: NULL

type: ALL

possible_keys: k_1

key: NULL

key_len: NULL

ref: NULL

rows: 998490

filtered: 50.00

Extra: Using where; Using temporary; Using filesort

1 row in set, 1 warning (0.00 sec)

mysql> select g,count(*) c from tbl where k<1000000 group by g having c>7;

+————+——+

| g | c |

+————+——+

| 28846 | 8 |

| 139660 | 8 |

| 153286 | 8 |

...

| 934984 | 8 |

+————+——+

22 rows in set (6.80 sec)

Looking at this query, many might assume the main problem is that this query is doing a full table scan. One could wonder then, “Why does the MySQL optimizer not use index (k)?” (It is because the clause is not selective enough, by the way.) This thought might cause someone to force using the index, and get even worse performance:

mysql> select g,count(*) c from tbl force index(k) where k<1000000 group by g having c>7;

+————+——+

| g | c |

+————+——+

| 28846 | 8 |

| 139660 | 8 |

...

| 934984 | 8 |

+————+——+

22 rows in set (9.37 sec)

Or someone might extend the index on (k) to (k,g) to be a covering index for this query. This won’t improve performance either:

mysql> alter table tbl drop key k_1, add key(k,g);

Query OK, 0 rows affected (5.35 sec)

Records: 0 Duplicates: 0 Warnings: 0

mysql> explain select g,count(*) c from tbl where k<1000000 group by g having c>7 G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: tbl

partitions: NULL

type: range

possible_keys: k

key: k

key_len: 4

ref: NULL

rows: 499245

filtered: 100.00

Extra: Using where; Using index; Using temporary; Using filesort

1 row in set, 1 warning (0.00 sec)

mysql> select g,count(*) c from tbl where k<1000000 group by g having c>7;

+————+——+

| g | c |

+————+——+

| 28846 | 8 |

| 139660 | 8 |

...

| 915436 | 8 |

| 934984 | 8 |

+————+——+

22 rows in set (6.80 sec)

This wasted effort is all due to focusing on the wrong thing: figuring out how can we find all the rows that match k<1000000 as soon as possible. This is not the problem in this case. In fact, the query that touches all the same columns but doesn’t use GROUP BY runs 10 times as fast:

mysql> select sum(g) from tbl where k<1000000;

+———————+

| sum(g) |

+———————+

| 500383719481 |

+———————+

1 row in set (0.68 sec)

For this particular query, whether or not it is using the index for lookup should not be the main question. Instead, we should look at how to optimize GROUP BY – which is responsible for some 90% of the query response time.

In my next blog post, I will write about how MySQL performs the GROUP BY operation to provide further help on optimizing these queries.

Thoughts on MySQL 8.0 Invisible Indexes October 27, 2016 In “MySQL”

MySQL 5.6.10 Optimizer Limitations: Index Condition Pushdown March 14, 2013 In “Events and Announcements”

MySQL 8.0: Descending Indexes Can Speed Up Your Queries October 20, 2016 In “MySQL”

Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master’s Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Sponsored by Percona

MySQL Query Performance: Not Just Indexes

Related

You may also like...

Resources

Search

News

Events

Archives

Sponsored By

HPCC Systems from LexisNexis Risk Solutions

KX

InterSystems

MySQL/Oracle

SingleStore

Supporters

McObject

NEXTGRES

Progress

Raima

Scality

Volt Active Data