sql - Get the order at which the sum of orders reaches 1000

Question

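I have an Orders table, and each row has a column called price. Each order also has a column, created_at, indicating when the order was created.

What would be a good way to find the order at which the total sum of prices exceeds $1000?

So, say there are three orders like the following: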

Order 1: price: $800 - created_at: 2013/07/11 

Order 2: price: $100 - created_at: 2013/07/13 

Order 3: price: $300 - created_at: 2013/07/14

Adding $800 + $100 + $300, I am interested in Order 3, because its $300 is exactly what pushed the running total past $1000.

What query could I run to find that?

score 0 · Accepted Answer

For this you need a cumulative sum, which Postgres offers as a window function.

select o.*
from (select o.*,
             sum(o.price) over (order by created_at) as cumsum
      from orders o
     ) o
where 1000 > cumsum - price and 1000 <= cumsum;

The where clause just finds the row whose added price first pushes the running total to $1000 or more.
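As a rough check of that where clause, here is a minimal sketch that runs the query against the three orders from the question. It assumes an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for Postgres, dates stored as ISO text, and a hypothetical order_id column that the question does not mention.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Postgres; order_id is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, price INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 800, "2013-07-11"), (2, 100, "2013-07-13"), (3, 300, "2013-07-14")],
)

# Running total per row, in created_at order.
running = conn.execute("""
    SELECT order_id, price, sum(price) OVER (ORDER BY created_at) AS cumsum
    FROM orders
    ORDER BY created_at
""").fetchall()

# Keep the row whose own price lifts the total from below 1000 to 1000 or more.
crossing = conn.execute("""
    SELECT t.order_id, t.price, t.cumsum
    FROM (SELECT o.*, sum(o.price) OVER (ORDER BY o.created_at) AS cumsum
          FROM orders o) t
    WHERE 1000 > t.cumsum - t.price AND 1000 <= t.cumsum
""").fetchall()

print(running)   # running totals 800, 900, 1200
print(crossing)  # only order 3 satisfies both conditions
```

For order 3 the total before it is 1200 - 300 = 900, which is below 1000, while the total including it is 1200, so it is the only row that satisfies both conditions.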

score 0 · Accepted Answer

After calculating a running sum with the window aggregate function sum(), just pick the first row, ordered by created_at, whose running sum reaches 1000:

SELECT *
FROM (
   SELECT order_id, created_at
        , sum(price) OVER (ORDER BY created_at) AS sum_price
   FROM   orders
   ) sub
WHERE  sum_price >= 1000
ORDER  BY created_at 
LIMIT  1;

This should be faster than @Gordon's version, because picking the first according to the same order that's already used in the window function is a lot cheaper than calculating a value for every row, which is not sargable.

I use sum_price >= 1000, so reaching 1000 exactly qualifies, too. If only exceeding 1000 should qualify, use > instead of >=.
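The difference between the two operators only shows up when the running sum lands on 1000 exactly. The sketch below illustrates that with made-up rows (600, 400, 200), again assuming SQLite 3.25+ as a stand-in for Postgres; the rows and the QUERY template are illustrative, not part of the original answer.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Postgres; the rows are invented
# so that the running sum hits exactly 1000 at order 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, price INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 600, "2013-07-01"), (2, 400, "2013-07-02"), (3, 200, "2013-07-03")],
)

# The query from the answer, with the comparison operator left as a placeholder.
QUERY = """
SELECT order_id, sum_price
FROM (
   SELECT order_id, created_at,
          sum(price) OVER (ORDER BY created_at) AS sum_price
   FROM   orders
   ) sub
WHERE  sum_price {op} 1000
ORDER  BY created_at
LIMIT  1
"""

reaching = conn.execute(QUERY.format(op=">=")).fetchone()
exceeding = conn.execute(QUERY.format(op=">")).fetchone()

print(reaching)   # order 2, where the sum is exactly 1000
print(exceeding)  # order 3, the first row strictly above 1000
```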

The manual on window functions states:

In addition to these functions, any built-in or user-defined aggregate function can be used as a window function

Note that this query returns at most one row, as opposed to @Gordon's query. If multiple rows with identical created_at cross the 1000 barrier, all of them would qualify in Gordon's answer (or it would fail, see below), while only one is picked in mine. It will be an arbitrary one, as long as you don't add more items to ORDER BY as a tiebreaker. Like:

ORDER BY created_at, order_id

There are two instances of ORDER BY in this query, and it just so happens that you could modify either or both to make it work. Modify both so that the sort order is identical; this should be fastest.
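A minimal sketch of the query with the tiebreaker added to both ORDER BY clauses, assuming SQLite 3.25+ as a stand-in for Postgres and invented rows in which two orders share a created_at value:

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Postgres; orders 2 and 3 are
# invented rows that share the same created_at.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, price INTEGER, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 700, "2013-07-01"), (2, 200, "2013-07-02"), (3, 200, "2013-07-02")],
)

# Same sort order (created_at, order_id) in the window and in the outer query.
picked = conn.execute("""
    SELECT order_id, sum_price
    FROM (
       SELECT order_id, created_at,
              sum(price) OVER (ORDER BY created_at, order_id) AS sum_price
       FROM   orders
       ) sub
    WHERE  sum_price >= 1000
    ORDER  BY created_at, order_id
    LIMIT  1
""").fetchone()

print(picked)  # order 3: running sums are 700, 900, 1100
```

With order_id as the tiebreaker, the running sums are 700, 900 and 1100, so the pick no longer depends on how rows with equal dates happen to be sorted.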

Actually, Gordon's version would fail completely for this test case:

CREATE TEMP TABLE orders(order_id int, price int, created_at date);

INSERT INTO orders VALUES
  (1, 500, '2013-07-01')
 ,(2, 400, '2013-07-02')
 ,(3, 100, '2013-07-03')
 ,(4, 100, '2013-07-03')
 ,(5, 100, '2013-07-03');

You could fix it by making the sort order in the window function unique, as demonstrated above.

Or you could change the frame definition for the window function to:

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
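To see why the test case breaks Gordon's query and how the two fixes behave, the sketch below runs three variants of it (with the window sum taken over o.price) against the test rows above. It assumes SQLite 3.25+ as a stand-in for Postgres, with dates stored as ISO text; crossing_rows is a hypothetical helper, not part of either answer.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Postgres; same rows as the test case.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, price INTEGER, created_at TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 500, "2013-07-01"), (2, 400, "2013-07-02"),
    (3, 100, "2013-07-03"), (4, 100, "2013-07-03"), (5, 100, "2013-07-03"),
])

def crossing_rows(window):
    """Run Gordon's query with the given window definition (hypothetical helper)."""
    sql = f"""
        SELECT t.order_id, t.cumsum
        FROM (SELECT o.*, sum(o.price) OVER ({window}) AS cumsum
              FROM orders o) t
        WHERE 1000 > t.cumsum - t.price AND 1000 <= t.cumsum
    """
    return conn.execute(sql).fetchall()

# Default frame: the three rows dated 2013-07-03 are peers and all get
# cumsum 1200, so cumsum - price is 1100 for each and none qualifies.
default_frame = crossing_rows("ORDER BY created_at")

# Fix 1: unique sort order, so each row gets its own running sum.
unique_order = crossing_rows("ORDER BY created_at, order_id")

# Fix 2: row-based frame; one of the tied rows qualifies, which one is arbitrary.
rows_frame = crossing_rows(
    "ORDER BY created_at ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"
)

print(default_frame, unique_order, rows_frame)
```

With the default frame the query returns no rows at all. The unique sort order returns order 3, and the ROWS frame returns exactly one of orders 3 to 5 with a running sum of 1000.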

Read the fine print in the manual.

But it's slower either way.

-> SQLfiddle
