sql - Get sequential row numbers (rank) within a partition without using the ROW_NUMBER() OVER function

Question

I need to rank the rows within each partition (or group). The source table is:

NAME PRICE
---- -----
AAA  1.59
AAA  2.00
AAA  0.75
BBB  3.48
BBB  2.19
BBB  0.99
BBB  2.50

I would like to get this target table:

RANK NAME PRICE
---- ---- -----
1    AAA  0.75
2    AAA  1.59
3    AAA  2.00
1    BBB  0.99
2    BBB  2.19
3    BBB  2.50
4    BBB  3.48

Normally I would use the ROW_NUMBER() OVER function, so in Apache Hive it would be:

select
  row_number() over (partition by NAME order by PRICE) as RANK,
  NAME,
  PRICE
from
  MY_TABLE
;

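Unfortunately, Cloudera Impala does not support the ROW_NUMBER() OVER function (at the moment), so I'm looking for a workaround. I would prefer not to use a UDAF, because it will be politically difficult to convince people to deploy it on the server.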

score 3 · Accepted Answer

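If you can't do this with a correlated subquery, you can still do it with a join:

select t1.name, t1.price,
       count(t2.name) + 1 as rnk   -- rows with a lower price in the same name, plus one
from my_table t1 left join         -- left join keeps the cheapest row of each name
     my_table t2
     on t2.name = t1.name and
        t2.price < t1.price
group by t1.name, t1.price         -- required for the count to be per row
order by t1.name, t1.price;

Note that this only reproduces row_number() when all prices are distinct for a given name. Counting the rows with a strictly lower price is really the definition of rank(); with duplicate prices the grouped join also merges the tied rows and inflates their count, so treat it as reliable only for distinct prices.

For row_number(), you need a unique row identifier.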
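
A minimal sketch of that idea, assuming a hypothetical unique column id on my_table (the question's table has no such column). The identifier breaks ties between equal prices and also keeps every source row in its own group:

```sql
-- Assumption: my_table has a unique column `id` (hypothetical; not in the question).
select t1.name, t1.price,
       count(t2.id) + 1 as row_num     -- rows sorted before this one, plus one
from my_table t1 left join
     my_table t2
     on t2.name = t1.name and
        (t2.price < t1.price or
         (t2.price = t1.price and t2.id < t1.id))   -- id breaks ties
group by t1.id, t1.name, t1.price      -- one group per source row
order by t1.name, t1.price, t1.id;
```

Tied prices are numbered in id order here, whereas row_number() itself leaves the order of ties unspecified. Whether a given engine accepts the OR inside the join condition should be checked before relying on this.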

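By the way, the following gives the dense_rank() value for each distinct name and price:

select t1.name, t1.price,
       count(distinct t2.price) + 1 as dense_rnk   -- distinct lower prices, plus one
from my_table t1 left join
     my_table t2
     on t2.name = t1.name and
        t2.price < t1.price
group by t1.name, t1.price
order by t1.name, t1.price;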

score 2 · Accepted Answer

The usual workaround for systems that do not support window functions is something like this:

select name, 
       price,
       (select count(*) 
        from my_table t2 
        where t2.name = t1.name  -- this is the "partition by" replacement
        and t2.price < t1.price) + 1 as row_num  -- + 1 so numbering starts at 1
from my_table t1
order by name, price;

SQLFiddle example: http://sqlfiddle.com/#!2/3b027/2
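
With duplicate prices, a count of strictly lower prices numbers rows like rank() rather than row_number(). The sketch below illustrates this on hypothetical sample data (the table tie_demo and its rows are made up for illustration), next to a dense_rank()-style variant that counts distinct lower prices. It assumes an engine that accepts correlated subqueries in the select list.

```sql
-- Hypothetical sample data with a tie in partition AAA (not from the question).
create table tie_demo (name varchar(10), price decimal(10,2));
insert into tie_demo values ('AAA', 0.75), ('AAA', 1.59), ('AAA', 1.59), ('AAA', 2.00);

select t1.name,
       t1.price,
       (select count(*)
        from tie_demo t2
        where t2.name = t1.name
        and t2.price < t1.price) + 1 as rank_like,        -- all rows with a lower price
       (select count(distinct t2.price)
        from tie_demo t2
        where t2.name = t1.name
        and t2.price < t1.price) + 1 as dense_rank_like   -- distinct lower prices only
from tie_demo t1
order by t1.name, t1.price;

-- name  price  rank_like  dense_rank_like
-- AAA   0.75   1          1
-- AAA   1.59   2          2
-- AAA   1.59   2          2
-- AAA   2.00   4          3
```

The two tied rows share a number in both columns; only rank_like skips ahead to 4 afterwards. Getting 1, 2, 3, 4 instead requires a unique tie-breaker column.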
