users
フィールドid
とを持つテーブルがありますemail
。id
は主キーであり、email
インデックスも作成されます。
database> \d users
+-----------------------------+-----------------------------+-----------------------------------------------------+
| Column | Type | Modifiers |
|-----------------------------+-----------------------------+-----------------------------------------------------|
| id | integer | not null default nextval('users_id_seq'::regclass) |
| email | character varying | |
+-----------------------------+-----------------------------+-----------------------------------------------------+
Indexes:
"users_pkey" PRIMARY KEY, btree (id)
"index_users_on_email" UNIQUE, btree (email)
サブクエリで句を使用してテーブルをクエリすると、distinct on (email)
パフォーマンスが大幅に低下します。
database> explain (analyze, buffers)
select
id
from (
select distinct on (email)
id
from
users
) as t
where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-----------------------------------------------------------------------------------------------------------------------------|
| Subquery Scan on t (cost=8898.69..10077.84 rows=337 width=4) (actual time=221.133..250.782 rows=1 loops=1) |
| Filter: (t.id = 123) |
| Rows Removed by Filter: 67379 |
| Buffers: shared hit=2824, temp read=288 written=289 |
| -> Unique (cost=8898.69..9235.59 rows=67380 width=24) (actual time=221.121..247.582 rows=67380 loops=1) |
| Buffers: shared hit=2824, temp read=288 written=289 |
| -> Sort (cost=8898.69..9067.14 rows=67380 width=24) (actual time=221.120..239.573 rows=67380 loops=1) |
| Sort Key: users.email |
| Sort Method: external merge Disk: 2304kB |
| Buffers: shared hit=2824, temp read=288 written=289 |
| -> Seq Scan on users (cost=0.00..3494.80 rows=67380 width=24) (actual time=0.009..9.714 rows=67380 loops=1) |
| Buffers: shared hit=2821 |
| Planning Time: 0.243 ms |
| Execution Time: 251.258 ms |
+-----------------------------------------------------------------------------------------------------------------------------+
これをdistinct on (id)
、コストが前のクエリの 1000 分の 1 未満のクエリと比較します。
database> explain (analyze, buffers)
select
id
from (
select distinct on (id)
id
from
users
) as t
where id = 123
+-----------------------------------------------------------------------------------------------------------------------------+
| QUERY PLAN |
|-----------------------------------------------------------------------------------------------------------------------------|
| Unique (cost=0.29..8.31 rows=1 width=4) (actual time=0.021..0.022 rows=1 loops=1) |
| Buffers: shared hit=3 |
| -> Index Only Scan using users_pkey on users (cost=0.29..8.31 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1) |
| Index Cond: (id = 123) |
| Heap Fetches: 1 |
| Buffers: shared hit=3 |
| Planning Time: 0.090 ms |
| Execution Time: 0.034 ms |
+-----------------------------------------------------------------------------------------------------------------------------+
どうしてこれなの?
私が抱えている本当の問題はdistinct on
、一意ではないインデックス付きの列を実行するビューを作成しようとしていて、パフォーマンスが非常に悪いことです。