I have time-series data and I am trying to make the DB structure and queries as efficient as possible.
I have indexed the table on id and on datetime (descending).
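The indexes were created roughly like this (a sketch only, reconstructed from the index names shown in the \d output further down; "table" stands in for the real table name):

-- Reconstructed index DDL (sketch); "table" is the anonymized table name
CREATE INDEX table_datetime_idx ON table (datetime DESC);
CREATE INDEX table_id_datetime_idx ON table (id, datetime DESC);
CREATE INDEX mapping_id_idx ON mapping (id);

The query I am running: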
SELECT
    table.id,
    To_char(Time_bucket('2 hours', datetime) at time zone 'utc', 'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS time,
    Avg(value) AS value,
    mapping.description
FROM
    table
JOIN
    mapping ON table.id = mapping.id
WHERE
    table.id IN (10000, 10004, 1001, 10005)
    AND datetime BETWEEN '2019-09-25' AND '2019-09-30'
GROUP BY
    time,
    table.id,
    mapping.description
ORDER BY
    time DESC;
The table structure looks like this:
Table "public.table"
Column | Type | Collation | Nullable | Default
----------+-----------------------------+-----------+----------+---------
datetime | timestamp without time zone | | not null |
id | integer | | not null |
value | double precision | | |
Indexes:
"table_datetime_idx" btree (datetime DESC)
"table_id_datetime_idx" btree (id, datetime DESC)
And the mapping table:
Table "public.mapping"
Column | Type | Collation | Nullable | Default
-------------+-------------------+-----------+----------+---------
id | integer | | not null |
tagname | character varying | | |
description | character varying | | |
unit | character varying | | |
mineu | double precision | | |
maxeu | double precision | | |
Indexes:
"mapping_id_idx" btree (id)
There are no errors, but I suspect this is neither clean nor efficient enough: it takes about 14 seconds to run. What is the simplest way to optimize this query?
Below is the output of EXPLAIN ANALYZE:
GroupAggregate (cost=250964.79..265699.28 rows=453369 width=73) (actual time=10247.641..11501.894 rows=60 loops=1)
  Group Key: (to_char(timezone('utc'::text, time_bucket('02:00:00'::interval, _hyper_1_4_chunk.datetime)), 'YYYY-MM-DD"T"HH24:MI:SS"Z"'::text)), _hyper_1_4_chunk.id, mapping.description
  ->  Sort (cost=250964.79..252098.21 rows=453369 width=73) (actual time=10237.177..10481.057 rows=421712 loops=1)
        Sort Key: (to_char(timezone('utc'::text, time_bucket('02:00:00'::interval, _hyper_1_4_chunk.datetime)), 'YYYY-MM-DD"T"HH24:MI:SS"Z"'::text)) DESC, _hyper_1_4_chunk.id, mapping.description
        Sort Method: external merge Disk: 33816kB
        ->  Hash Join (cost=7228.67..196570.23 rows=453369 width=73) (actual time=81.488..5779.432 rows=421712 loops=1)
              Hash Cond: (_hyper_1_4_chunk.id = mapping.id)
              ->  Append (cost=7215.89..186363.19 rows=452059 width=20) (actual time=81.299..3680.949 rows=421712 loops=1)
                    ->  Bitmap Heap Scan on _hyper_1_4_chunk (cost=7215.89..129006.87 rows=363549 width=20) (actual time=81.298..3350.870 rows=336860 loops=1)
                          Recheck Cond: ((id = ANY ('{10000,10004,1001,10005}'::integer[])) AND (datetime >= '2019-09-25 00:00:00'::timestamp without time zone) AND (datetime <= '2019-09-30 00:00:00'::timestamp without time zone))
                          Heap Blocks: exact=61125
                          ->  Bitmap Index Scan on _hyper_1_4_chunk_table_id_datetime_idx (cost=0.00..7125.00 rows=363549 width=0) (actual time=69.006..69.006 rows=336860 loops=1)
                                Index Cond: ((id = ANY ('{10000,10004,1001,10005}'::integer[])) AND (datetime >= '2019-09-25 00:00:00'::timestamp without time zone) AND (datetime <= '2019-09-30 00:00:00'::timestamp without time zone))
                    ->  Bitmap Heap Scan on _hyper_1_3_chunk (cost=1766.52..57356.32 rows=88510 width=20) (actual time=20.876..311.867 rows=84852 loops=1)
                          Recheck Cond: ((id = ANY ('{10000,10004,1001,10005}'::integer[])) AND (datetime >= '2019-09-25 00:00:00'::timestamp without time zone) AND (datetime <= '2019-09-30 00:00:00'::timestamp without time zone))
                          Heap Blocks: exact=16352
                          ->  Bitmap Index Scan on _hyper_1_3_chunk_table_id_datetime_idx (cost=0.00..1744.39 rows=88510 width=0) (actual time=17.291..17.291 rows=84852 loops=1)
                                Index Cond: ((id = ANY ('{10000,10004,1001,10005}'::integer[])) AND (datetime >= '2019-09-25 00:00:00'::timestamp without time zone) AND (datetime <= '2019-09-30 00:00:00'::timestamp without time zone))
              ->  Hash (cost=8.46..8.46 rows=346 width=33) (actual time=0.163..0.163 rows=346 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 31kB
                    ->  Seq Scan on mapping (cost=0.00..8.46 rows=346 width=33) (actual time=0.019..0.097 rows=346 loops=1)
Planning time: 1.008 ms
Execution time: 11507.606 ms
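One direction I have been wondering about, since the Sort spills to disk on the to_char() result: grouping and sorting on the raw time_bucket() value and formatting only at the end. This is just an untested sketch of that idea, not something I have benchmarked:

-- Sketch: do the GROUP BY / ORDER BY on the raw bucket, format only in the outer query
SELECT
    To_char(bucket at time zone 'utc', 'YYYY-MM-DD"T"HH24:MI:SS"Z"') AS time,
    id,
    value,
    description
FROM (
    SELECT
        table.id,
        Time_bucket('2 hours', datetime) AS bucket,
        Avg(value) AS value,
        mapping.description
    FROM
        table
    JOIN
        mapping ON table.id = mapping.id
    WHERE
        table.id IN (10000, 10004, 1001, 10005)
        AND datetime BETWEEN '2019-09-25' AND '2019-09-30'
    GROUP BY
        bucket,
        table.id,
        mapping.description
) AS bucketed
ORDER BY
    bucket DESC;

Is something along those lines likely to help, or is there a simpler fix?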