sql - Find the next row in a SQL query and delete only if the previous row matches

Question

I have a table like this:

|-DT--------- |-ID------|
|5/30 12:00pm |10       |
|5/30 01:00pm |30       |
|5/30 02:30pm |30       |
|5/30 03:00pm |50       |
|5/30 04:30pm |10       |
|5/30 05:00pm |10       |
|5/30 06:30pm |10       |
|5/30 07:30pm |10       |
|5/30 08:00pm |50       |
|5/30 09:30pm |10       |

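I want to remove duplicate rows only when a row has the same ID as the row that follows it, and I want to keep the duplicate with the latest datetime. For example, the table above would become: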

|-DT--------- |-ID------|
|5/30 12:00pm |10       |
|5/30 02:30pm |30       |
|5/30 03:00pm |50       |
|5/30 07:30pm |10       |
|5/30 08:00pm |50       |
|5/30 09:30pm |10       |

Can anyone give me some tips on how to do this?

score 3 · Accepted Answer

with C as
(
  select ID,
         row_number() over(order by DT) as rn
  from YourTable
)
delete C1
from C as C1
  inner join C as C2
    on C1.rn = C2.rn-1 and
       C1.ID = C2.ID

SE-Data
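To try the idea without a SQL Server instance, here is a minimal sketch that ports the query above to SQLite through Python's sqlite3 module. It is an adaptation, not the answer's exact statement: SQLite cannot delete through a CTE alias, so the CTE only collects the rowid of every row whose immediate successor has the same ID. The ISO-style DT strings and the helper name delete_earlier_duplicates are assumptions made for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; DT is stored as sortable text (an assumption).
ROWS = [
    ("2012-05-30 12:00", 10),
    ("2012-05-30 13:00", 30),
    ("2012-05-30 14:30", 30),
    ("2012-05-30 15:00", 50),
    ("2012-05-30 16:30", 10),
    ("2012-05-30 17:00", 10),
    ("2012-05-30 18:30", 10),
    ("2012-05-30 19:30", 10),
    ("2012-05-30 20:00", 50),
    ("2012-05-30 21:30", 10),
]


def delete_earlier_duplicates(conn):
    """Delete every row whose next row (by DT) has the same ID."""
    conn.execute("""
        WITH C AS (
            SELECT rowid AS rid, ID,
                   row_number() OVER (ORDER BY DT) AS rn
            FROM YourTable
        )
        DELETE FROM YourTable
        WHERE rowid IN (
            SELECT C1.rid
            FROM C AS C1
            JOIN C AS C2
              ON C1.rn = C2.rn - 1 AND C1.ID = C2.ID
        )
    """)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (DT TEXT, ID INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", ROWS)
delete_earlier_duplicates(conn)
remaining = conn.execute("SELECT DT, ID FROM YourTable ORDER BY DT").fetchall()
for row in remaining:
    print(row)
```

With the question's sample data, the rows left over should be the last row of each consecutive run of equal IDs, which is what the question asks for.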

score 2 · Accepted Answer

Do it in three steps: http://www.sqlfiddle.com/#!3/b58b9/19

First, number the rows sequentially:

with a as
(
  select dt, id, row_number() over(order by dt) as rn
  from tbl
)
select * from a;

Output:

|                         DT | ID | RN |
----------------------------------------
| May, 30 2012 12:00:00-0700 | 10 |  1 |
| May, 30 2012 13:00:00-0700 | 30 |  2 |
| May, 30 2012 14:30:00-0700 | 30 |  3 |
| May, 30 2012 15:00:00-0700 | 50 |  4 |
| May, 30 2012 16:30:00-0700 | 10 |  5 |
| May, 30 2012 17:00:00-0700 | 10 |  6 |
| May, 30 2012 18:30:00-0700 | 10 |  7 |
| May, 30 2012 19:30:00-0700 | 10 |  8 |
| May, 30 2012 20:00:00-0700 | 50 |  9 |
| May, 30 2012 21:30:00-0700 | 10 | 10 |

Second, using the sequential numbers, we can find which rows are at the bottom (and, for that matter, which rows are not):

with a as
(
  select dt, id, row_number() over(order by dt) as rn
  from tbl
)
select below.*, 
    case when above.id <> below.id or above.id is null then 
        1 
    else 
        0 
    end as is_at_bottom
from a below
left join a above on above.rn + 1 = below.rn;

Output:

|                         DT | ID | RN | IS_AT_BOTTOM |
-------------------------------------------------------
| May, 30 2012 12:00:00-0700 | 10 |  1 |            1 |
| May, 30 2012 13:00:00-0700 | 30 |  2 |            1 |
| May, 30 2012 14:30:00-0700 | 30 |  3 |            0 |
| May, 30 2012 15:00:00-0700 | 50 |  4 |            1 |
| May, 30 2012 16:30:00-0700 | 10 |  5 |            1 |
| May, 30 2012 17:00:00-0700 | 10 |  6 |            0 |
| May, 30 2012 18:30:00-0700 | 10 |  7 |            0 |
| May, 30 2012 19:30:00-0700 | 10 |  8 |            0 |
| May, 30 2012 20:00:00-0700 | 50 |  9 |            1 |
| May, 30 2012 21:30:00-0700 | 10 | 10 |            1 |
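The flag logic in the query above can be sketched outside SQL as a single pass that compares each row with the one before it. This is only an illustration in Python; the function name flag_run_starts is hypothetical, and it assumes the IDs are already in DT order and are never NULL.

```python
# IDs from the question's table, already sorted by DT.
ids = [10, 30, 30, 50, 10, 10, 10, 10, 50, 10]


def flag_run_starts(ids):
    """Return 1 where a row has no previous row or the previous ID differs, else 0."""
    flags = []
    previous = None  # stands in for the NULL the left join yields for the first row
    for current in ids:
        flags.append(1 if previous is None or previous != current else 0)
        previous = current
    return flags


flags = flag_run_starts(ids)
print(flags)
```

The resulting list lines up row by row with the IS_AT_BOTTOM column in the output above.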

Third, delete every row that is not at the bottom:

with a as
(
  select dt, id, row_number() over(order by dt) as rn
  from tbl
)
,b as 
(
  select below.*, 
       case when above.id <> below.id or above.id is null then 
           1 
       else 
           0 
       end as is_at_bottom
  from a below
  left join a above on above.rn + 1 = below.rn
)
delete a
from a
inner join b on b.rn = a.rn
where b.is_at_bottom = 0;

Verify:

select * from tbl order by dt;

Output:

|                         DT | ID |
-----------------------------------
| May, 30 2012 12:00:00-0700 | 10 |
| May, 30 2012 13:00:00-0700 | 30 |
| May, 30 2012 15:00:00-0700 | 50 |
| May, 30 2012 16:30:00-0700 | 10 |
| May, 30 2012 20:00:00-0700 | 50 |
| May, 30 2012 21:30:00-0700 | 10 |
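Note that the verified output keeps the earliest row of each consecutive run (13:00 for ID 30, 16:30 for ID 10), whereas the expected result in the question keeps the latest one (02:30pm and 07:30pm). The difference is easy to see in a small Python sketch that groups consecutive rows by ID; the two function names are hypothetical and the time strings are shortened for readability.

```python
from itertools import groupby

# (time, id) pairs from the question, already in DT order.
rows = [
    ("12:00", 10), ("13:00", 30), ("14:30", 30), ("15:00", 50), ("16:30", 10),
    ("17:00", 10), ("18:30", 10), ("19:30", 10), ("20:00", 50), ("21:30", 10),
]


def keep_first_of_each_run(rows):
    """Keep the earliest row of every run of consecutive equal IDs."""
    return [next(group) for _, group in groupby(rows, key=lambda r: r[1])]


def keep_last_of_each_run(rows):
    """Keep the latest row of every run of consecutive equal IDs."""
    return [list(group)[-1] for _, group in groupby(rows, key=lambda r: r[1])]


first_kept = keep_first_of_each_run(rows)
last_kept = keep_last_of_each_run(rows)
print(first_kept)
print(last_kept)
```

Here first_kept reproduces the verified output above, while last_kept reproduces the table the question asks for. Deleting the row aliased above instead of the flagged row, as the join-based deletes in this thread do, gives the keep-last behaviour.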

You can also simplify the delete to this: http://www.sqlfiddle.com/#!3/b58b9/20

with a as
(
  select dt, id, row_number() over(order by dt, id) as rn
  from tbl
)
delete above
from a below
left join a above on above.rn + 1 = below.rn
where case when above.id <> below.id or above.id is null then 1 else 0 end = 0;

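Mikael Eriksson's answer is the best, though. If I simplify my simplified query once more, it ends up looking like his ツ For that, I +1'd his answer. I would make his query a bit more readable, though, by swapping the join order and giving the aliases meaningful names.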

with a as
(
  select *, row_number() over(order by dt, id) as rn
  from tbl
)
delete above
from a below
join a above on above.rn + 1 = below.rn and above.id = below.id;

Live test: http://www.sqlfiddle.com/#!3/b58b9/24

score 0 · Accepted Answer

Here you go; just replace [Table] with the name of your table.

SELECT * 
FROM [dbo].[Table]
WHERE [Ident] NOT IN 
(
    SELECT Extent.[Ident]
    FROM 
    (
        SELECT  TOP 100 PERCENT T1.[DT], 
                T1.[ID],
                T1.[Ident],
                (
                    SELECT TOP 1 Previous.ID
                    FROM [dbo].[Table] AS Previous
                    WHERE Previous.[Ident] > T1.Ident -- this is where the identity seed is important
                    ORDER BY [Ident] ASC
                ) AS 'PreviousId'
        FROM [dbo].[Table] AS T1
        ORDER BY T1.[Ident] DESC
    ) AS Extent
    WHERE [Id] = [PreviousId]
)

Note: the table must have an identity column (Ident in the query above). If you cannot change the structure of the table, use a CTE instead.
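As a rough illustration of the CTE fallback mentioned in the note, the sketch below numbers the rows with row_number() instead of relying on an identity column, then selects the rows to keep. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the table name events, the ISO-style DT strings, and the function name rows_to_keep are all assumptions made for the example.

```python
import sqlite3

ROWS = [
    ("2012-05-30 12:00", 10), ("2012-05-30 13:00", 30), ("2012-05-30 14:30", 30),
    ("2012-05-30 15:00", 50), ("2012-05-30 16:30", 10), ("2012-05-30 17:00", 10),
    ("2012-05-30 18:30", 10), ("2012-05-30 19:30", 10), ("2012-05-30 20:00", 50),
    ("2012-05-30 21:30", 10),
]


def rows_to_keep(conn):
    """Return rows whose next row (by DT) is missing or has a different ID."""
    return conn.execute("""
        WITH numbered AS (
            SELECT DT, ID, row_number() OVER (ORDER BY DT) AS rn
            FROM events
        )
        SELECT cur.DT, cur.ID
        FROM numbered AS cur
        LEFT JOIN numbered AS nxt ON nxt.rn = cur.rn + 1
        WHERE nxt.rn IS NULL OR nxt.ID <> cur.ID
        ORDER BY cur.rn
    """).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (DT TEXT, ID INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", ROWS)
kept = rows_to_keep(conn)
for row in kept:
    print(row)
```

The generated rn column plays the role of Ident here, so no change to the table structure is needed.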

score 0 · Accepted Answer

You can try the following query...

select *
from (
    select *, RANK() OVER (ORDER BY dt, id) AS Rank
    from test
) as a
where 0 = (
    select count(id)
    from (
        select id, RANK() OVER (ORDER BY dt, id) AS Rank
        from test
    ) as b
    where b.id = a.id and b.Rank = a.Rank + 1
)
order by dt

Thanks, Mahesh
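One thing to keep in mind with the query above: RANK() gives tied rows the same number and then skips ahead, so a lookup for Rank + 1 can find nothing when two rows share the same dt and id. The question's sample data has no such ties, so this is only a caveat. The sketch below shows the numbering difference on SQLite through Python's sqlite3 module; the three sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (dt TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("2012-05-30 16:30", 10),
        ("2012-05-30 16:30", 10),  # exact duplicate of the row above (a tie)
        ("2012-05-30 17:00", 10),
    ],
)

# Compute both numberings over the same ordering.
numbering = conn.execute("""
    SELECT rank()       OVER (ORDER BY dt, id) AS rnk,
           row_number() OVER (ORDER BY dt, id) AS rn
    FROM test
    ORDER BY rn
""").fetchall()

ranks = [rnk for rnk, _ in numbering]
row_numbers = [rn for _, rn in numbering]
print(ranks)        # tied rows share a rank, and the next rank is skipped
print(row_numbers)  # always consecutive
```

If ties are possible in the real data, row_number() (as used in the other answers) keeps the numbers consecutive.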
