
I'm puzzled by the difference in efficiency between the following two scripts.

DateKey is of type INT.

1.

DECLARE @StartDate  INT = 20130101,
        @EndDate    INT = 20130201

SELECT  UserAccountKey,
        income_LT = SUM(ISNULL(income,0.0)) 
INTO    #x
FROM    WH.dbo.xxx x
WHERE   x.DateKey >= @StartDate
        AND x.DateKey < @EndDate
GROUP BY    UserAccountKey

The execution plan for the above is as follows:

[execution plan screenshot]

2.

SELECT  UserAccountKey,
        income_LT = SUM(ISNULL(income,0.0)) 
INTO    #x
FROM    WH.dbo.xxx x
WHERE   x.DateKey >= 20130101
        AND x.DateKey < 20130201
GROUP BY    UserAccountKey

The execution plan for number 2 is as follows:

[execution plan screenshot]

Query 1 is much faster (2 seconds compared to 80 seconds). Is this expected, and why?


1 Answer


The first query uses variables. Their values are not known at compilation time, so SQL Server produces a plan based on generic estimates. The second query compiles a plan based on the actual literal values.
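If you want the variable form to be compiled against the actual values instead, `OPTION (RECOMPILE)` tells the optimizer to sniff the variable values at each execution. A sketch, reusing the table and column names from the question:

```sql
DECLARE @StartDate INT = 20130101,
        @EndDate   INT = 20130201;

SELECT  UserAccountKey,
        income_LT = SUM(ISNULL(income, 0.0))
INTO    #x
FROM    WH.dbo.xxx x
WHERE   x.DateKey >= @StartDate
        AND x.DateKey < @EndDate
GROUP BY UserAccountKey
-- Recompile on every execution, using the current variable values
-- for cardinality estimation (trades plan reuse for estimate quality)
OPTION (RECOMPILE);
```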

The fact that the generic guesses work out better than the plan where it knows the specific values indicates that your statistics probably need updating.

Likely, the last time they were updated, few if any rows matched the `WHERE DateKey >= 20130101 AND DateKey < 20130201` predicate, but now many do.

This is a common issue with ascending date columns.
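If stale statistics on the ascending DateKey column are the cause, refreshing them gives the optimizer a histogram that covers the newest key values. A sketch, using the table name from the question:

```sql
-- Rebuild all statistics on the fact table with a full scan,
-- so the histogram's upper boundary includes recent DateKey values
UPDATE STATISTICS WH.dbo.xxx WITH FULLSCAN;
```

On older versions of SQL Server, trace flags 2389/2390 were also used to help the optimizer estimate predicates beyond the highest recorded value of an ascending key.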

See also this question and its answers on the DBA site.

Edit: This can be seen in the plan here:

[execution plan screenshot]

The thickness of the lines indicates the number of rows. The very thin line to the left of the compute scalar shows the estimated number of rows (actual row counts often aren't shown for compute scalars, for the reasons here). The very thick lines into the compute scalar and out of the sort represent the actual number of rows. The two are clearly very different.

As well as choosing an inappropriate plan (serial, with a nested loops join), this poor estimate also means that the sort spilled to disk, because the query requested an insufficient memory grant (shown by the warning triangle).
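The requested versus granted memory for a running query can be inspected with a standard DMV. A sketch (run in a separate session while the slow query executes):

```sql
-- Compare how much workspace memory each active query asked for,
-- was granted, and is actually using; a sort/hash that needs more
-- than granted_memory_kb will spill to tempdb
SELECT session_id,
       requested_memory_kb,
       granted_memory_kb,
       used_memory_kb
FROM sys.dm_exec_query_memory_grants;
```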

Answered 2013-02-08T13:06:53.643