実行速度が遅すぎるクエリがあります。
select c.vm_name,
round(sum(bytes_sent)*1.8/power(10,9)) gb_sent,
round(sum(bytes_received)*1.8/power(10,9)) gb_received
from groups b,
vms c,
vm_ip_address_histories d,
ip_address_usage_histories e
where b.group_id = c.group_id
and c.vm_id = d.vm_id
and d.ip_address = e.ip_address
and e.datetime >= firstday()
and d.allocation_date <= last_day(sysdate()) and (d.deallocation_date is null or d.deallocation_date > last_day(sysdate()))
and b.customer_id = 29
group by c.vm_name
order by 1;
この関数sysdate()
は、タイム ゾーンなしで現在のシステム タイムスタンプをlast_day()
返し、月の最終日を表すタイムスタンプを返します。これらを作成したのは、Hibernate が Postgres のキャスト表記を好まないためです。
問題は、プランナーがインデックスが配置されている場所で全テーブル スキャンを実行していることです。上記のクエリの実行計画は次のとおりです。
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=1326387.13..1326391.38 rows=1698 width=24) (actual time=13221.041..13221.042 rows=7 loops=1)
Sort Key: c.vm_name
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=1326236.61..1326296.04 rows=1698 width=24) (actual time=13221.008..13221.026 rows=7 loops=1)
Group Key: c.vm_name
-> Hash Join (cost=1309056.97..1325972.10 rows=35268 width=24) (actual time=13131.323..13211.612 rows=13631 loops=1)
Hash Cond: (d.ip_address = e.ip_address)
-> Nested Loop (cost=2.97..6942.24 rows=79 width=15) (actual time=0.249..56.904 rows=192 loops=1)
-> Hash Join (cost=2.69..41.02 rows=98 width=16) (actual time=0.066..0.638 rows=61 loops=1)
Hash Cond: (c.group_id = b.group_id)
-> Seq Scan on vms c (cost=0.00..30.98 rows=1698 width=24) (actual time=0.009..0.281 rows=1698 loops=1)
-> Hash (cost=2.65..2.65 rows=3 width=8) (actual time=0.014..0.014 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on groups b (cost=0.00..2.65 rows=3 width=8) (actual time=0.004..0.011 rows=4 loops=1)
Filter: (customer_id = 29)
Rows Removed by Filter: 49
-> Index Scan using xif1vm_ip_address_histories on vm_ip_address_histories d (cost=0.29..70.34 rows=8 width=15) (actual time=0.011..0.921 rows=3 loops=61)
Index Cond: (vm_id = c.vm_id)
Filter: ((allocation_date <= last_day(sysdate())) AND ((deallocation_date IS NULL) OR (deallocation_date > last_day(sysdate()))))
Rows Removed by Filter: 84
-> Hash (cost=1280129.06..1280129.06 rows=1575435 width=23) (actual time=13130.223..13130.223 rows=203702 loops=1)
Buckets: 8192 Batches: 32 Memory Usage: 422kB
-> Seq Scan on ip_address_usage_histories e (cost=0.00..1280129.06 rows=1575435 width=23) (actual time=0.205..13002.776 rows=203702 loops=1)
Filter: (datetime >= firstday())
Rows Removed by Filter: 4522813
Planning time: 0.804 ms
Execution time: 13221.155 ms
(27 rows)
プランナは、最大のテーブルで非常にコストのかかる全テーブル スキャンを実行することを選択していることに注意してip_address_usage_histories
くださいvm_ip_address_histories
。構成パラメーターenable_seqscan
をオフに変更してみましたが、問題が悪化し、合計実行時間は 63 秒になりました。
前述のテーブルの説明は次のとおりです。
Table "ip_address_usage_histories"
Column | Type | Modifiers
-----------------------------+-----------------------------+-----------
ip_address_usage_history_id | bigint | not null
datetime | timestamp without time zone | not null
ip_address | inet | not null
bytes_sent | bigint | not null
bytes_received | bigint | not null
Indexes:
"ip_address_usage_histories_pkey" PRIMARY KEY, btree (ip_address_usage_history_id)
"ip_address_usage_histories_datetime_ip_address_key" UNIQUE CONSTRAINT, btree (datetime, ip_address)
"uk_mit6tbiu8k62vdae4tmtnwb3f" UNIQUE CONSTRAINT, btree (datetime, ip_address)
Table "vm_ip_address_histories"
Column | Type | Modifiers
--------------------------+-----------------------------+--------------------------------------------------------------------------------------------
vm_ip_address_history_id | bigint | not null default nextval('vm_ip_address_histories_vm_ip_address_history_id_seq'::regclass)
ip_address | inet | not null
allocation_date | timestamp without time zone | not null
deallocation_date | timestamp without time zone |
vm_id | bigint | not null
Indexes:
"vm_ip_address_histories_pkey" PRIMARY KEY, btree (vm_ip_address_history_id)
"xie1vm_ip_address_histories" btree (replicate_date)
"xif1vm_ip_address_histories" btree (vm_id)
Foreign-key constraints:
"vm_ip_address_histories_vm_id_fkey" FOREIGN KEY (vm_id) REFERENCES vms(vm_id) ON DELETE RESTRICT
Postgres には、プランナーに指示するためのクエリ ヒントがないようです。from 句の構文も試しましたinner join ... on ...
が、改善されませんでした。
更新 1
create or replace function firstday() returns timestamp without time zone as $$
begin
return date_trunc('month',now()::timestamp without time zone)::timestamp without time zone;
end; $$
language plpgsql;
Postgres には月の初日を返す関数がないため、この関数を標準関数に置き換えようとはしていません。