Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexFullScan with higher cost is chosen to avoid sorting for window function #46177

Closed
pcqz opened this issue Aug 17, 2023 · 4 comments · Fixed by #48186
Closed

IndexFullScan with higher cost is chosen to avoid sorting for window function #46177

pcqz opened this issue Aug 17, 2023 · 4 comments · Fixed by #48186
Assignees
Labels

Comments

@pcqz
Copy link

pcqz commented Aug 17, 2023

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

mysql> explain format='verbose'  select row_number() over(order by a.k), a.* from (select * from sbtest1 where id<10) a;
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
| id                                 | estRows    | estCost      | task      | access object               | operator info                                                                                    |
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
| Projection_8                       | 2.85       | 16898882.30  | root      |                             | Column#6, sbtest1.sbtest1.id, sbtest1.sbtest1.k, sbtest1.sbtest1.c, sbtest1.sbtest1.pad          |
| └─Window_9                         | 2.85       | 16898880.88  | root      |                             | row_number()->Column#6 over(order by sbtest1.sbtest1.k rows between current row and current row) |
|   └─IndexLookUp_13                 | 2.85       | 16898880.88  | root      |                             |                                                                                                  |
|     ├─Selection_12(Build)          | 2.85       | 253400000.00 | cop[tikv] |                             | lt(sbtest1.sbtest1.id, 10)                                                                       |
|     │ └─IndexFullScan_10           | 1000000.00 | 203500000.00 | cop[tikv] | table:sbtest1, index:k_1(k) | keep order:true                                                                                  |
|     └─TableRowIDScan_11(Probe)     | 2.85       | 898.48       | cop[tikv] | table:sbtest1               | keep order:false                                                                                 |
+------------------------------------+------------+--------------+-----------+-----------------------------+--------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

2. What did you expect to see? (Required)

mysql> explain format='verbose'  select row_number() over(order by a.k), a.* from (select * from sbtest1 use index(primary) where id<10) a;
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
| id                            | estRows | estCost | task      | access object | operator info                                                                                    |
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
| Projection_8                  | 2.85    | 561.13  | root      |               | Column#6, sbtest1.sbtest1.id, sbtest1.sbtest1.k, sbtest1.sbtest1.c, sbtest1.sbtest1.pad          |
| └─Window_9                    | 2.85    | 559.71  | root      |               | row_number()->Column#6 over(order by sbtest1.sbtest1.k rows between current row and current row) |
|   └─Sort_12                   | 2.85    | 559.71  | root      |               | sbtest1.sbtest1.k                                                                                |
|     └─TableReader_11          | 2.85    | 222.33  | root      |               | data:TableRangeScan_10                                                                           |
|       └─TableRangeScan_10     | 2.85    | 898.48  | cop[tikv] | table:sbtest1 | range:[0,10), keep order:false                                                                   |
+-------------------------------+---------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

3. What did you see instead (Required)

IndexFullScan with higher cost is chosen.

4. What is your TiDB version? (Required)

v7.1.0

@pcqz pcqz added the type/bug This issue is a bug. label Aug 17, 2023
@ti-chi-bot ti-chi-bot bot added may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Aug 18, 2023
@seiya-annie
Copy link

/found gs

@ti-chi-bot ti-chi-bot bot added the found/gs found by gs label Sep 20, 2023
@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

A minimum reproducible case:

 CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */,
  KEY `k` (`k`)
);

mysql> explain format='verbose'  select row_number() over(order by a.k) from (select * from sbtest where id<10) a;
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
| id                           | estRows  | estCost    | task      | access object            | operator info                                                                                |
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
| Projection_9                 | 10.00    | 169018.81  | root      |                          | Column#6->Column#7                                                                           |
| └─Window_10                  | 10.00    | 169017.81  | root      |                          | row_number()->Column#6 over(order by test.sbtest.k rows between current row and current row) |
|   └─IndexReader_13           | 10.00    | 169017.81  | root      |                          | index:Selection_12                                                                           |
|     └─Selection_12           | 10.00    | 2534000.00 | cop[tikv] |                          | lt(test.sbtest.id, 10)                                                                       |
|       └─IndexFullScan_11     | 10000.00 | 2035000.00 | cop[tikv] | table:sbtest, index:k(k) | keep order:true, stats:pseudo                                                                |
+------------------------------+----------+------------+-----------+--------------------------+----------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)


mysql> explain format='verbose'  select row_number() over(order by a.k) from (select * from sbtest use index(primary) where id<10) a;
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| id                            | estRows | estCost | task      | access object | operator info                                                                                |
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
| Projection_9                  | 10.00   | 2008.16 | root      |               | Column#6->Column#7                                                                           |
| └─Window_10                   | 10.00   | 2007.16 | root      |               | row_number()->Column#6 over(order by test.sbtest.k rows between current row and current row) |
|   └─Sort_13                   | 10.00   | 2007.16 | root      |               | test.sbtest.k                                                                                |
|     └─TableReader_12          | 10.00   | 285.52  | root      |               | data:TableRangeScan_11                                                                       |
|       └─TableRangeScan_11     | 10.00   | 3015.62 | cop[tikv] | table:sbtest  | range:[0,10), keep order:false, stats:pseudo                                                 |
+-------------------------------+---------+---------+-----------+---------------+----------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

When generating Scan operators, if the upper operator needs ordered data, there are 2 choices:
Iterate just some of paths that can keep data in this order, generate plans.
Iterate other remaining paths, add an explicit Sort upon their plans to keep data in order.
Currently, the optimizer will try 1 first, if it can find any valid plan, it'll return and finish the process directly; otherwise, it'll try 2.
Now, the problem is plans generated from 2 may be better than 1 sometimes, but the optimzier finishes the process too early:
img_v2_eb82fca8-bc67-4542-8603-11bb6e0fcb1g

@qw4990
Copy link
Contributor

qw4990 commented Oct 13, 2023

Window and StreamAgg have this problem.
Another case about StreamAgg:
55d8f14a-407a-4e7f-88b3-05175b48f6f0

ti-chi-bot bot pushed a commit that referenced this issue Nov 24, 2023
@ti-chi-bot ti-chi-bot added the affects-5.4 This bug affects 5.4.x versions. label Nov 30, 2023
ti-chi-bot bot pushed a commit that referenced this issue Dec 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants