sql - Optimizing a postgres query

Question

                                 QUERY PLAN                                   
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=32164.87..32164.89 rows=1 width=44) (actual time=221552.831..221552.831 rows=0 loops=1)
   ->  Sort  (cost=32164.87..32164.87 rows=1 width=44) (actual time=221552.827..221552.827 rows=0 loops=1)
         Sort Key: t.date_effective, t.acct_account_transaction_id, p.method, t.amount, c.business_name, t.amount
         ->  Nested Loop  (cost=22871.67..32164.86 rows=1 width=44) (actual time=221552.808..221552.808 rows=0 loops=1)
               ->  Nested Loop  (cost=22871.67..32160.37 rows=1 width=52) (actual time=221431.071..221546.619 rows=670 loops=1)
                     ->  Nested Loop  (cost=22871.67..32157.33 rows=1 width=43) (actual time=221421.218..221525.056 rows=2571 loops=1)
                           ->  Hash Join  (cost=22871.67..32152.80 rows=1 width=16) (actual time=221307.382..221491.019 rows=2593 loops=1)
                                 Hash Cond: ("outer".acct_account_id = "inner".acct_account_fk)
                                 ->  Seq Scan on acct_account a  (cost=0.00..7456.08 rows=365008 width=8) (actual time=0.032..118.369 rows=61295 loops=1)
                                 ->  Hash  (cost=22871.67..22871.67 rows=1 width=16) (actual time=221286.733..221286.733 rows=2593 loops=1)
                                       ->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
                                             Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
                                             Filter: ("inner".link_type IS NULL)
                                             ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                                                   Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective < '2012-04-01 00:00:00'::timestamp without time zone))
                                             ->  Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                                                   Filter: ((link_type)::text ~~ 'return_%'::text)
                           ->  Index Scan using contact_pk on contact c  (cost=0.00..4.52 rows=1 width=27) (actual time=0.007..0.008 rows=1 loops=2593)
                                 Index Cond: (c.contact_id = "outer".contact_fk)
                     ->  Index Scan using acct_payment_transaction_fk on acct_payment p  (cost=0.00..3.02 rows=1 width=13) (actual time=0.005..0.005 rows=0 loops=2571)
                           Index Cond: (p.acct_account_transaction_fk = "outer".acct_account_transaction_id)
                           Filter: ((method)::text <> 'trade'::text)
               ->  Index Scan using contact_role_pk on contact_role  (cost=0.00..4.48 rows=1 width=4) (actual time=0.007..0.007 rows=0 loops=670)
                     Index Cond: ("outer".contact_id = contact_role.contact_fk)
                     Filter: (exchange_fk = 74)
Total runtime: 221553.019 ms

score 4 · Accepted Answer

Your problem is here:

->  Nested Loop Left Join  (cost=0.00..22871.67 rows=1 width=16) (actual time=1025.396..221266.357 rows=2593 loops=1)
    Join Filter: ("inner".orig_acct_payment_fk = "outer".acct_account_transaction_id)
    Filter: ("inner".link_type IS NULL)
        ->  Seq Scan on acct_account_transaction t  (cost=0.00..18222.98 rows=1 width=16) (actual time=949.081..976.432 rows=2596 loops=1)
                Filter: ((("type")::text = 'debit'::text) AND ((transaction_status)::text = 'active'::text) AND (date_effective >= '2012-03-01'::date) AND (date_effective   
            Seq Scan on acct_payment_link l  (cost=0.00..4648.68 rows=1 width=15) (actual time=1.073..84.610 rows=169 loops=2596)
                Filter: ((link_type)::text ~~ 'return_%'::text)

The planner expects to find 1 row in acct_account_transaction, but 2596 are actually found, and the same kind of misestimate shows up for the other table.
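To look at that misestimate in isolation, one option is to run EXPLAIN ANALYZE on just the filter from the plan and compare the estimated `rows=` with the actual count. This is only a sketch built from the columns visible in the plan's filter, and note that EXPLAIN ANALYZE really executes the query:

```sql
-- Compare estimated vs. actual rows for the acct_account_transaction filter alone.
EXPLAIN ANALYZE
SELECT t.acct_account_transaction_id
FROM   acct_account_transaction t
WHERE  t.type = 'debit'
AND    t.transaction_status = 'active'
AND    t.date_effective >= date '2012-03-01'
AND    t.date_effective <  date '2012-04-01';
```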

You didn't mention your postgres version (could you?), but this should do the trick:

SELECT DISTINCT
    t.date_effective,
    t.acct_account_transaction_id,
    p.method,
    t.amount,
    c.business_name,
    t.amount
FROM
    contact c inner join contact_role on (c.contact_id=contact_role.contact_fk and contact_role.exchange_fk=74),
    acct_account a, acct_payment p,
    acct_account_transaction t
WHERE
    p.acct_account_transaction_fk=t.acct_account_transaction_id
    and t.type = 'debit'
    and transaction_status = 'active'
    and p.method != 'trade'
    and t.date_effective >= '2012-03-01'
    and t.date_effective < (date '2012-03-01' + interval '1 month')
    and c.contact_id=a.contact_fk and a.acct_account_id = t.acct_account_fk
    and not exists(
         select * from acct_payment_link l 
           where l.orig_acct_payment_fk = t.acct_account_transaction_id 
           and l.link_type like 'return_%'
    )
ORDER BY
    t.date_effective DESC

Also, try setting appropriate statistics targets for the relevant columns. Link to the friendly manual: http://www.postgresql.org/docs/current/static/sql-altertable.html
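A minimal sketch of what that could look like. The target of 1000 is just an illustrative assumption, and so is the choice of columns (taken from the filters shown in the plan); larger targets make ANALYZE slower, so the right value depends on your data:

```sql
-- Raise the per-column statistics target (1000 is an assumed example value).
ALTER TABLE acct_account_transaction ALTER COLUMN date_effective SET STATISTICS 1000;
ALTER TABLE acct_account_transaction ALTER COLUMN "type"         SET STATISTICS 1000;
ALTER TABLE acct_payment_link        ALTER COLUMN link_type      SET STATISTICS 1000;

-- The new targets only take effect once the tables are analyzed again.
ANALYZE acct_account_transaction;
ANALYZE acct_payment_link;
```

Afterwards, re-run EXPLAIN ANALYZE and check whether the estimated row counts moved closer to the actual ones.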

score 0 · Accepted Answer

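I'm removing my first suggestion, since it changes the nature of the query.

You can see that too much time is spent in the LEFT JOIN.

First, I would try to get the acct_payment_link table scanned only once. Try rewriting the query like this: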
```
... LEFT JOIN (SELECT * FROM acct_payment_link
               WHERE link_type LIKE 'return_%') AS l ...
```
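There is a difference between the planned and the returned row counts, so you should check your statistics.
You haven't included the table and index definitions, so it would be worth looking at those as well.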
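Both checks can be done from the system views. The following is a rough sketch; the column names in the `attname` list are the ones appearing in the plan's filters, and the table list is my assumption about which tables matter most here:

```sql
-- What the planner currently believes about the filtered columns.
SELECT tablename, attname, null_frac, n_distinct, most_common_vals
FROM   pg_stats
WHERE  tablename IN ('acct_account_transaction', 'acct_payment_link')
AND    attname   IN ('type', 'transaction_status', 'date_effective', 'link_type');

-- Existing index definitions on the same two tables.
SELECT tablename, indexname, indexdef
FROM   pg_indexes
WHERE  tablename IN ('acct_account_transaction', 'acct_payment_link');
```

If pg_stats returns no rows for a table, it has probably never been analyzed.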
You could also use the contrib/pg_trgm extension to create an index on acct_payment_link.link_type, but I would keep that as the last option to try.
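If you do get to that last option, a trigram index might look roughly like this. This sketch assumes PostgreSQL 9.1 or later (where CREATE EXTENSION is available and trigram indexes can be used for LIKE); the index name is made up for illustration:

```sql
-- Requires the pg_trgm contrib module to be installed on the server.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the operator class shipped with pg_trgm.
CREATE INDEX acct_payment_link_link_type_trgm
    ON acct_payment_link USING gin (link_type gin_trgm_ops);
```

Whether the planner actually picks it depends on how selective `link_type LIKE 'return_%'` is, so check the plan again afterwards.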

By the way, which PostgreSQL version are you using?

score 0 · Accepted Answer

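What are your indexes, and have you analyzed recently? It is doing a table scan on acct_account_transaction even though there are several criteria on that table:

type
date_effective

If there are no indexes on these columns, a composite index on (type, date_effective) might help (assuming there are many rows that do not meet the criteria on these columns).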
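A sketch of that composite index, with an index name I made up; the equality column goes first and the range column second:

```sql
-- Hypothetical index name; "type" is quoted the same way the plan prints it.
CREATE INDEX acct_account_transaction_type_date_idx
    ON acct_account_transaction ("type", date_effective);

-- Refresh statistics so the planner can cost the new index sensibly.
ANALYZE acct_account_transaction;
```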

score 0 · Accepted Answer

I rewrote and formatted your statement:

SELECT DISTINCT
       t.date_effective,
       t.acct_account_transaction_id,
       p.method,
       t.amount,
       c.business_name,
       t.amount
FROM   contact                  c
JOIN   contact_role            cr ON cr.contact_fk = c.contact_id
JOIN   acct_account             a ON a.contact_fk = c.contact_id 
JOIN   acct_account_transaction t ON t.acct_account_fk = a.acct_account_id 
JOIN   acct_payment             p ON p.acct_account_transaction_fk
                                   = t.acct_account_transaction_id
LEFT   JOIN acct_payment_link   l ON orig_acct_payment_fk
                                   = acct_account_transaction_id
                                        -- missing table-qualification!
                                 AND link_type like 'return_%'
                                        -- missing table-qualification!
WHERE  transaction_status = 'active'    -- missing table-qualification!
AND    cr.exchange_fk = 74
AND    t.type = 'debit'
AND    t.date_effective >= '2012-03-01'
AND    t.date_effective <  (date '2012-03-01' + interval '1 month')
AND    p.method != 'trade'
AND    l.link_type IS NULL
ORDER  BY t.date_effective DESC;

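Explicit JOIN statements are preferable. I reordered the tables according to your JOIN logic.
Why (date '2012-03-01' + interval '1 month') instead of date '2012-04-01'?
Some table qualifications are missing. In a complex statement like this, that is bad style, and it may be hiding a mistake.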
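On the date question: adding an interval to a date yields a timestamp, which is why the plan shows the upper bound as `'2012-04-01 00:00:00'::timestamp without time zone`. A quick way to see the difference, assuming a version that has pg_typeof (8.4 or later):

```sql
-- The first column should report "timestamp without time zone", the second "date".
SELECT pg_typeof(date '2012-03-01' + interval '1 month') AS computed_bound,
       pg_typeof(date '2012-04-01')                       AS literal_bound;
```

Using the plain date literal keeps both bounds the same type as the lower bound in the query.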

The keys to performance are indexes where appropriate, a proper PostgreSQL configuration, and accurate statistics.

General advice on performance tuning can be found in the PostgreSQL wiki.
