1

I consider myself rather proficient with T-SQL and I'm usually able to optimize a query pretty good without loosing readability. In short: I like my SQL short, descriptive, declarative and elegant.

While the following code works, i have two problems with it:

  1. I am using cursors and I can't shake the feeling I have in the back of my head that it could have been done more efficiently using CTEs. Also cursors don't work in views, so I can't manipulate results/ranges on the client-side or in dependent SQL.
  2. The code is implemented in a Stored Procedure, which leads to the same problem as above. Especially with LInQ to SQL and auto-paging.

So given the following SP, does anyone see any obvious way to convert this to a plain select using recursive CTEs? I've tried, failed and thought I'd see what the stack overflow community might be able to come up with.

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROC [dbo].[usp_GetLastReferers]
(
 @Limit int = NULL
)
AS
BEGIN
 SET NOCOUNT ON

 CREATE TABLE #Referer
 (
  ID   int,
  Url   nvarchar(500),
  Referer  nvarchar(500)
 )

 DECLARE @ID   int
 DECLARE @Url  nvarchar(500)
 DECLARE @Referer nvarchar(500)
 DECLARE @Count  int
 SET @Count = 0

 DECLARE LogCursor CURSOR FORWARD_ONLY READ_ONLY FOR
 SELECT ID, Url, Referer FROM Log WHERE Referer <> '' ORDER BY ID DESC

 OPEN LogCursor

 FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer

 WHILE @@FETCH_STATUS = 0 AND (@Count < @Limit OR @Limit IS NULL)
 BEGIN
  DECLARE @Hits int
  SELECT @Hits = COUNT(*)
  FROM #Referer
  WHERE Referer = @Referer

  DECLARE @IsLocal bit
  SELECT @IsLocal = dbo.IsLocalSite(@Referer)

  IF (@Hits = 0 OR @Hits IS NULL) AND @IsLocal = 0
  BEGIN
   INSERT INTO #Referer(ID,Url,Referer) VALUES (@ID,@Url,@Referer)
   SET @Count = @Count + 1
  END

  FETCH NEXT FROM LogCursor INTO @ID, @Url, @Referer  
 END

 CLOSE LogCursor
 DEALLOCATE LogCursor

 SELECT *
 FROM #Referer

 DROP TABLE #Referer

 SET NOCOUNT OFF
END

Since it may not be totally obvious, what I'm trying to do here is akin tothe following quasi SQL

SELECT DISTINCT TOP(@Limit) ID, Url, Referer
FROM Log 
ORDER BY ID DESC

Basically to get the last unique refers (not unique rows), which often contain duplicates, and in descending order. This is definitely where it gets tricky.

The data is pretty simple HTTP logs. ID field is just a unique row-identifier, Url is the full url requesten, and Referer is the HTTP referer for that request. None of the values can be null, but referer can be empty (ie ''). IsSiteLocal is just a simple filtering function to exclude referers originating from my own sites.

If anyone wants some sample-data to full around, I can upload a small DB-backup so you have something to fool around with.

Sample-data can be found here: http://svada.kjonigsen.net/files/IISLogsDBBackup.zip

4

3 に答える 3

1

なぜ再帰 CTE に変換するのですか? プレーンな選択として実行できない理由はありません。

dbo.IsLocalSite 関数が欠落しているテスト データベースをダウンロードしたので、テスト用に独自の同じ名前の関数を作成し、常に 0 を返すと想定しました。

このコードを実行すると、上記のストアド プロシージャとまったく同じ結果が得られました。

SELECT TOP (@Limit) ID, Url, Referer
FROM (
    SELECT ID, Url, Referer, RANK() OVER (PARTITION BY Referer ORDER BY ID DESC) _RANK_
    FROM LOG
    WHERE dbo.IsLocalSite(Referer) = 0
    AND Referer != ''
) TT
WHERE _RANK_ = 1
ORDER BY ID DESC;
于 2012-11-30T13:45:12.403 に答える
0

これを試して:

 ;with Referers as (
  SELECT 
  row_number() over (order by id desc) rn 
  ,ID, Url, Referer FROM Log 
  WHERE dbo.IsLocalSite(Referer) = 0 
) 
select * from Referers
where rn <= @limit
于 2010-09-22T11:47:59.017 に答える
0

dbo.IsLocalSite(@Referer) = 0 である限り、各 Url,Referer の max(ID) が必要ですか? Url,Referer でグループ化して max(ID) を取得し、WHERE 句で関数を適用できますか?

于 2010-09-22T21:41:31.180 に答える