有一张流水表目前的数据量为900W+,分页查询使用到了limit,优化之前的查询耗时16 s 左右,优化SQL后,耗时347 ms 

查询条件放到子查询中,子查询只查主键ID,然后使用子查询中确定的主键关联查询其他的属性字段
原理: 减少回表操作
-- 优化前
SELECT 各种字段
FROM `TABLE_NAME`
WHERE 各种条件
LIMIT 0,10;
-- 优化后
SELECT 各种字段
FROM `TABLE_NAME` MAIN_TALE
RIGHT JOIN 
(SELECT 子查询只查主键
FROM `TABLE_NAME`
WHERE 各种条件
LIMIT 0,10;) TEMP_TABLE ON TEMP_TABLE.主键 = MAIN_TABLE.主键

建表进行测试分析一下

说明一下MySQL的版本:

mysql> SELECT VERSION();
+-----------+
| version() |
+-----------+
| 5.7.17    |
+-----------+
1 row in set (0.00 sec)

新建一个TEST表,表结构:

mysql> DESC TEST;
+--------+---------------------+------+-----+---------+----------------+
| Field  | Type                | Null | Key | Default | Extra          |
+--------+---------------------+------+-----+---------+----------------+
| id     | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| val    | int(10) unsigned    | NO   | MUL | 0       |                |
| source | int(10) unsigned    | NO   |     | 0       |                |
+--------+---------------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

id为自增主键,val为非唯一索引。

灌入大量数据,共500万:

mysql> SELECT COUNT(*) FROM TEST;
+----------+
| count(*) |
+----------+
|  5242882 |
+----------+
1 row in set (4.25 sec)

当limit offset rows中的offset很大时,会出现效率问题:

mysql> SELECT * FROM TEST WHERE VAL=4 LIMIT 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+
| 3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (15.98 sec)

改写成如下语句:

mysql> SELECT * FROM TEST A RIGHT JOIN (SELECT ID FROM TEST WHERE VAL=4 LIMIT 300000,5) B ON A.ID=B.ID;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.38 sec)

时间差距非常明显

我们看一下SELECT * FROM TEST WHERE VAL=4 LIMIT 300000,5的查询过程:

查询到索引叶子节点数据根据叶子节点上的主键值去聚簇索引上查询需要的全部字段值。

类似于下面这张图:

像上面这样,需要查询300005次索引节点,查询300005次聚簇索引的数据,最后将结果过滤掉前300000条,取出最后5条,这样就耗费了大量随机I/O

InnoDB中有buffer pool。里面存有最近访问过的数据页,包括数据页和索引页。所以我们需要运行两个sql,来比较buffer pool中的数据页的数量

SELECT * FROM TEST WHERE VAL=4 LIMIT 300000,5
mysql> SELECT INDEX_NAME,COUNT(*)
    -> FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE
    -> WHERE INDEX_NAME IN('VAL','PRIMARY')
    -> AND TABLE_NAME LIKE '%TEST%'
    -> GROUP BY INDEX_NAME;
Empty set (0.04 sec)

可以看出,目前buffer pool中没有关于test表的数据页

运行一下原始sql

mysql> SELECT * FROM TEST WHERE VAL=4 LIMIT 300000,5;
+---------+-----+--------+
| id      | val | source |
+---------+-----+--------+|
3327622 |   4 |      4 |
| 3327632 |   4 |      4 |
| 3327642 |   4 |      4 |
| 3327652 |   4 |      4 |
| 3327662 |   4 |      4 |
+---------+-----+--------+
5 rows in set (26.19 sec)
mysql> SELECT INDEX_NAME,COUNT(*)
    -> FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE
    -> WHERE INDEX_NAME IN('VAL','PRIMARY')
    -> AND TABLE_NAME LIKE '%TEST%'
    -> GROUP BY INDEX_NAME;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |     4098 |
| val        |      208 |
+------------+----------+2 rows in set (0.04 sec)

此时buffer pool中关于TEST表有4098个数据页,208个索引页

现在需要清空buffer pool,重启mysql。

mysql> SELECT INDEX_NAME,COUNT(*)
    -> FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE
    -> WHERE INDEX_NAME IN('VAL','PRIMARY')
    -> AND TABLE_NAME LIKE '%TEST%'
    -> GROUP BY INDEX_NAME;
Empty set (0.03 sec)

运行优化后的sql

mysql> SELECT * FROM TEST A INNER JOIN (SELECT ID FROM TEST WHERE VAL=4 LIMIT 300000,5) B ON A.ID=B.ID;
+---------+-----+--------+---------+
| id      | val | source | id      |
+---------+-----+--------+---------+
| 3327622 |   4 |      4 | 3327622 |
| 3327632 |   4 |      4 | 3327632 |
| 3327642 |   4 |      4 | 3327642 |
| 3327652 |   4 |      4 | 3327652 |
| 3327662 |   4 |      4 | 3327662 |
+---------+-----+--------+---------+
5 rows in set (0.09 sec)
mysql> select index_name,count(*) from information_schema.INNODB_BUFFER_PAGE where INDEX_NAME in('val','primary') and TABLE_NAME like '%test%' group by index_name;
+------------+----------+
| index_name | count(*) |
+------------+----------+
| PRIMARY    |        5 |
| val        |      390 |
+------------+----------+
2 rows in set (0.03 sec)

可以明显的看出差距了:

第一个sql加载了4098个数据页到buffer pool,而第二个sql只加载了5个数据页到buffer pool

本文转载自:https://www.jianshu.com/p/0768ebc4e28d

作者:Muscleape

=

一沙一世界,一花一天堂。君掌盛无边,刹那成永恒。