postgresql - サブクエリで個別に使用すると PostgreSQL のパフォーマンスが低下するのはなぜですか?

Question

usersフィールドidとを持つテーブルがありますemail。idは主キーであり、emailインデックスも作成されます。

database> \d users
+-----------------------------+-----------------------------+-----------------------------------------------------+
| Column                      | Type                        | Modifiers                                           |
|-----------------------------+-----------------------------+-----------------------------------------------------|
| id                          | integer                     |  not null default nextval('users_id_seq'::regclass) |
| email                       | character varying           |                                                     |
+-----------------------------+-----------------------------+-----------------------------------------------------+
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "index_users_on_email" UNIQUE, btree (email)

サブクエリで句を使用してテーブルをクエリすると、distinct on (email)パフォーマンスが大幅に低下します。

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (email)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Subquery Scan on t  (cost=8898.69..10077.84 rows=337 width=4) (actual time=221.133..250.782 rows=1 loops=1)                 |
|   Filter: (t.id = 123)                                                                                                      |
|   Rows Removed by Filter: 67379                                                                                             |
|   Buffers: shared hit=2824, temp read=288 written=289                                                                       |
|   ->  Unique  (cost=8898.69..9235.59 rows=67380 width=24) (actual time=221.121..247.582 rows=67380 loops=1)                 |
|         Buffers: shared hit=2824, temp read=288 written=289                                                                 |
|         ->  Sort  (cost=8898.69..9067.14 rows=67380 width=24) (actual time=221.120..239.573 rows=67380 loops=1)             |
|               Sort Key: users.email                                                                                         |
|               Sort Method: external merge  Disk: 2304kB                                                                     |
|               Buffers: shared hit=2824, temp read=288 written=289                                                           |
|               ->  Seq Scan on users  (cost=0.00..3494.80 rows=67380 width=24) (actual time=0.009..9.714 rows=67380 loops=1) |
|                     Buffers: shared hit=2821                                                                                |
| Planning Time: 0.243 ms                                                                                                     |
| Execution Time: 251.258 ms                                                                                                  |
+-----------------------------------------------------------------------------------------------------------------------------+

これをdistinct on (id)、コストが前のクエリの 1000 分の 1 未満のクエリと比較します。

database> explain (analyze, buffers)
   select
     id
   from (
     select distinct on (id)
       id
     from
       users
   ) as t
   where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN                                                                                                                  |
|-----------------------------------------------------------------------------------------------------------------------------|
| Unique  (cost=0.29..8.31 rows=1 width=4) (actual time=0.021..0.022 rows=1 loops=1)                                          |
|   Buffers: shared hit=3                                                                                                     |
|   ->  Index Only Scan using users_pkey on users  (cost=0.29..8.31 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1) |
|         Index Cond: (id = 123)                                                                                              |
|         Heap Fetches: 1                                                                                                     |
|         Buffers: shared hit=3                                                                                               |
| Planning Time: 0.090 ms                                                                                                     |
| Execution Time: 0.034 ms                                                                                                    |
+-----------------------------------------------------------------------------------------------------------------------------+

どうしてこれなの？

私が抱えている本当の問題はdistinct on、一意ではないインデックス付きの列を実行するビューを作成しようとしていて、パフォーマンスが非常に悪いことです。

postgresql - サブクエリで個別に使用すると PostgreSQL のパフォーマンスが低下するのはなぜですか?

1 に答える 1

Related

Reference