indexing - MySql - Self Join - Full Table Scan (Cannot Scan Index)

Question

Welcome To Ask or Share your Answers For Others

indexing - MySql - Self Join - Full Table Scan (Cannot Scan Index)

posted Jan 31, 2022 in Technique[技术] by 深蓝 (71.8m points)

indexing - MySql - Self Join - Full Table Scan (Cannot Scan Index)

I have the following self-join query:

SELECT A.id
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt)

The query is quite slow, and after looking at the execution plan the cause appears to be a full table scan in the JOIN. The table has only 500 rows, and suspecting this to be the issue I increased it to 100,000 rows in order to see if it made a difference to the optimizer's selection. It did not, with 100k rows it was still doing a full table scan.

My next step was to try and force indexes with the following query, but the same situation arises, a full table scan:

SELECT A.id
FROM categories_nested_set      AS A
LEFT JOIN categories_nested_set AS B 
FORCE INDEX (idx_lft, idx_rgt)
ON (A.lft BETWEEN B.lft AND B.rgt)

All columns (id, lft, rgt) are integers, all are indexed.

Why is MySql doing a full table scan here?

How can I change my query to use indexes instead of a full table scan?

CREATE TABLE mytbl ( lft int(11) NOT NULL DEFAULT '0', 
 rgt int(11) DEFAULT NULL, 
 id int(11) DEFAULT NULL,
 category varchar(128) DEFAULT NULL,
  PRIMARY KEY (lft), 
  UNIQUE KEY id (id), 
  UNIQUE KEY rgt (rgt), 
  KEY idx_lft (lft), 
  KEY idx_rgt (rgt) ) ENGINE=InnoDB DEFAULT CHARSET=utf8

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2022-01-31T07:23:25+0000

You have lot's of indexes, some of them are redundant. Let's begin by clearing up some of them. Too many indexes slows down inserts and updates.

PRIMARY KEY (lft),
KEY idx_lft (lft),

Since you already have a primary key defined on lft, there is no need what so ever for another index on lft. Similarly with a unique index on rgt there is not need for the second index listed below.

UNIQUE KEY rgt (rgt), 
KEY idx_rgt (rgt)

Now let's look at your query.

SELECT A.id
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt)

This is very unlikely to be a query that will be encountered in the wild. With 500 rows, this query may produce even 5000 rows? Do you really need the entire key created in one go? The reason that this query is slow is because mysql can only optimize range comparisions for constants. It's more likely that your actually query will look something like this:

SELECT B.*
FROM mytbl      AS A
LEFT JOIN mytbl AS B 
ON (A.lft BETWEEN B.lft AND B.rgt) 
WHERE a.id = N;

Where you create the node for a particular id. This will use indexes and will be really fast. What's the point of optimizing for a query that you will not use much if at all?

Categories

indexing - MySql - Self Join - Full Table Scan (Cannot Scan Index)

indexing - MySql - Self Join - Full Table Scan (Cannot Scan Index)

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags