I have a table, ItemValue, full of data on a SQL 2005 server running in 2000 compatibility mode. It looks something like this (it's a user-defined values table):
ID ItemNumber FieldID Value
-- ---------- ------- ------
1 abc123 1 D
2 abc123 2 287.23
4 xyz789 1 A
5 xyz789 2 3782.23
6 xyz789 3 23
7 mno456 1 W
9 mno456 3 45
... and so on.
The FieldID comes from the ItemField table:
ID FieldNumber DataFormatID Description ...
-- ----------- ------------ -----------
1 1 1 Weight class
2 2 4 Cost
3 3 3 Another made up description
. . x xxx
. . x xxx
. . x xxx
x 91 (we have 91 user-defined fields)
Because PIVOT is not available in 2000 mode, we're stuck building an ugly query with CASEs and GROUP BY to get the data into the shape some legacy apps expect:
ItemNumber Field1 Field2 Field3 .... Field51
---------- ------ ------- ------
abc123 D 287.23 NULL
xyz789 A 3782.23 23
mno456 W NULL 45
You can see that this table only needs to show values up to the 51st UDF. Here's the query:
SELECT
iv.ItemNumber
,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
...
,MAX(CASE WHEN f.FieldNumber = 51 THEN iv.[Value] ELSE NULL END) [Field51]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 51
GROUP BY iv.ItemNumber
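For anyone who wants to reproduce the shape of the result, here is a minimal, self-contained sketch of the same CASE/GROUP BY technique run against the sample rows above. The table variables `@ItemField` and `@ItemValue`, the column types, and the reduced column list are assumptions made for illustration only, and the item column is named ItemNumber as in the query. It avoids multi-row VALUES so it should run on SQL 2005 in 2000 compatibility mode.

```sql
-- Hypothetical stand-ins for the real tables; column types are assumed.
DECLARE @ItemField TABLE (ID int, FieldNumber int, DataFormatID int);
DECLARE @ItemValue TABLE (ID int, ItemNumber varchar(20), FieldID int, [Value] varchar(50));

-- Sample rows copied from the listings above.
INSERT INTO @ItemField (ID, FieldNumber, DataFormatID)
SELECT 1, 1, 1 UNION ALL
SELECT 2, 2, 4 UNION ALL
SELECT 3, 3, 3;

INSERT INTO @ItemValue (ID, ItemNumber, FieldID, [Value])
SELECT 1, 'abc123', 1, 'D'       UNION ALL
SELECT 2, 'abc123', 2, '287.23'  UNION ALL
SELECT 4, 'xyz789', 1, 'A'       UNION ALL
SELECT 5, 'xyz789', 2, '3782.23' UNION ALL
SELECT 6, 'xyz789', 3, '23'      UNION ALL
SELECT 7, 'mno456', 1, 'W'       UNION ALL
SELECT 9, 'mno456', 3, '45';

-- One MAX(CASE ...) per field: within each ItemNumber group, only the row
-- whose FieldNumber matches contributes a non-NULL value, so MAX returns it.
-- Items with no row for a field get NULL in that column.
SELECT
     iv.ItemNumber
    ,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) AS [Field1]
    ,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) AS [Field2]
    ,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) AS [Field3]
FROM @ItemField f
LEFT JOIN @ItemValue iv ON f.ID = iv.FieldID
WHERE f.FieldNumber <= 3
GROUP BY iv.ItemNumber;
```

With these seven rows the result should have one row per item, matching the three sample rows in the desired output above (row order is not guaranteed without an ORDER BY).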
When the FieldNumber constraint is <= 51, the execution plan looks like this:
SELECT <== Compute Scalar <== Stream Aggregate <== Sort (Cost: 70%) <== Hash Match <== (Clustered Index Seek && Table Scan)
And it's fast! I can get back 100,000+ records in about a second, which fits our needs.
However, if we had more UDFs and I changed the constraint to anything above 66 (yes, I tested them one by one), or if I removed it completely, I would lose the Sort in the execution plan. It gets replaced with a whole bunch of Parallelism blocks that gather, repartition, and distribute streams, and the entire thing is slow (30 seconds for even just 1 record).
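One diagnostic that could be tried here is to force a serial plan with the `MAXDOP` query hint, to see whether the Parallelism operators themselves account for the slowdown. This is only a sketch under that assumption: it is not something reported above, and whether it brings back the fast Sort-based plan is untested. Only three of the CASE columns are written out; the real query would list all of them.

```sql
-- Diagnostic sketch: same query shape with the FieldNumber filter removed,
-- restricted to a serial plan for this one statement.
SELECT
     iv.ItemNumber
    ,MAX(CASE WHEN f.FieldNumber = 1 THEN iv.[Value] ELSE NULL END) [Field1]
    ,MAX(CASE WHEN f.FieldNumber = 2 THEN iv.[Value] ELSE NULL END) [Field2]
    ,MAX(CASE WHEN f.FieldNumber = 3 THEN iv.[Value] ELSE NULL END) [Field3]
FROM ItemField f
LEFT JOIN ItemValue iv ON f.ID = iv.FieldID
GROUP BY iv.ItemNumber
OPTION (MAXDOP 1);  -- query hint: no parallel operators for this statement
```

Comparing the plan and timing with and without the hint would at least show whether the optimizer's switch to parallelism is the expensive part, or whether something else changes when the filter goes past 66.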
FieldNumber has a clustered, unique index, and is part of a composite primary key with the ID column (non-clustered index) in the ItemField table. The ItemValue table's ID and ItemNumber columns make up its PK, and there is an extra non-clustered index on the ItemNumber column.
What is the reasoning behind this? Why does changing my simple integer constraint change the entire execution plan?
And if you're up to it... what would you do differently? There's a SQL upgrade planned for a couple months from now but I need to get this problem fixed before that.