sql - Trying to accomplish without dynamic SQL (SQL Server)

Question

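All,

I'm trying to perform an insert from one table into another without using dynamic SQL. However, the only solution I can think of at the moment uses dynamic SQL. It has been hard to search for similar scenarios.

Here are the details.

My starting point is the following legacy table: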

CREATE TABLE [dbo].[_Combinations](
[AttributeID] [int] NULL,
[Value] [varchar](50) NULL
) ON [PRIMARY]
GO
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (16, N'1')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (16, N'2')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Red')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Orange')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Yellow')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Green')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Blue')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Indigo')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'Violet')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'A')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'B')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'C')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'D')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'E')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'F')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'G')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'H')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'I')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'J')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'K')

SELECT * FROM _Combinations

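The _Combinations table contains the keys for different types of attributes (AttributeID) and the possible values for each attribute (Value).

In this case there are three different attributes with multiple possible values, but there could be more (up to 10).

Next, I need to create every possible combination of the values and store them normalized, because other data will be stored for each possible combination. I need to store both the attribute key and the value that make up each combination, so it is not just a simple cross join to display each combination. The target table that stores each combination of attributes is: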

CREATE TABLE [dbo].[_CombinedAttributes](
[GroupKey] [int] NULL,
[AttributeID] [int] NULL,
[Value] [varchar](50) NULL
) ON [PRIMARY]

So, using the data above, the attribute combination records would look like this in the target table:

GroupKey    AttributeID Value
1               8         A
1               16        1
1               28        Red
2               8         B
2               16        1
2               28        Red

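This gives me what I need: each group has an identifier, and I can track the attribute IDs and values that make up each group. I'm using two scripts to get from the _Combinations table to the format of the _CombinedAttributes table: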

-- SCRIPT #1
SELECT Identity(int) AS RowNumber, * INTO #Test
FROM (
SELECT AttributeID AS Attribute1, Value AS Value1 FROM _Combinations WHERE AttributeID = 8) C1
CROSS JOIN 
(
SELECT AttributeID AS Attribute2, Value AS Value2 FROM _Combinations WHERE AttributeID = 16) C2
CROSS JOIN
(
SELECT AttributeID AS Attribute3, Value AS Value3 FROM _Combinations WHERE AttributeID = 28) C3

-- SCRIPT #2

INSERT INTO _CombinedAttributes
SELECT RowNumber AS GroupKey, Attribute1, Value1 
FROM #Test
UNION ALL
SELECT RowNumber, Attribute2, Value2 
FROM #Test
UNION ALL
SELECT RowNumber, Attribute3, Value3
FROM #Test
ORDER BY RowNumber, Attribute1

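The two scripts above work, but they obviously have some drawbacks. Namely, I need to know how many attributes I'm dealing with, and the IDs are hard-coded, so I can't generate this on the fly. The solution I came up with is to loop over the attributes in the _Combinations table and build the strings for Script 1 and Script 2, which produces a long, messy string to execute; I can post it if needed. Can anyone see a way to produce the format for the final insert without dynamic SQL?

This routine won't run very often, but it runs often enough that I'd like to use straight SQL rather than string building.

Thanks in advance.

UPDATE:

With a second dataset, Gordon's code no longer returns the correct results. It creates groups at the end that have only one attribute. With this second dataset, however, Nathan's routine gives the correct row count (the final result should have 396 rows). But as I mentioned in the comments, with the first dataset I get the opposite: Gordon's code returns correctly, while Nathan's code produces duplicates. I'm at a loss. Here is the second dataset: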

DROP TABLE [dbo].[_Combinations]
GO

CREATE TABLE [dbo].[_Combinations](
[AttributeID] [int] NULL,
[Value] [varchar](50) NULL
) ON [PRIMARY]
GO

INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (16, N'1')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (16, N'2')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'<=39')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'40-44')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'45-49')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'50-54')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'55-64')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (28, N'65+')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'AA')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'JJ')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'CC')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'DD')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'EE')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'KK')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'BB')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'FF')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'GG')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'HH')
INSERT [dbo].[_Combinations] ([AttributeID], [Value]) VALUES (8, N'II')

score 6 · Accepted Answer

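I think this solves the problem.

Here is the approach. First, observe that the final data contains one group per combination, which is the product of the value counts of each attribute: 2*7*11 = 154. Next, observe that each value occurs a fixed number of times. For AttributeId = 16 there are two values, so each value occurs 154 / 2 = 77 times.

So the idea is to calculate how many times each value appears, then generate the list of all the values. The final challenge is to assign group numbers to them. For this, I use row_number() partitioned by the attribute ID. To be honest, I can't say the grouping assignment is 100% correct (but it makes sense and it passed the eyeball test).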
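
The counting step can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the `counts` dictionary is hand-copied from the first sample dataset, and the names `total` and `numtimes` simply mirror the `tot` and `numtimes` columns of the query.

```python
from math import prod

# Value counts per AttributeID in the first sample dataset
# (16 has 2 values, 28 has 7, 8 has 11). Hand-copied, not queried.
counts = {16: 2, 28: 7, 8: 11}

# Number of combinations (groups) is the product of the counts.
total = prod(counts.values())

# Each value of an attribute appears total / cnt times in the final data.
numtimes = {attr: total // cnt for attr, cnt in counts.items()}

print(total)     # 154
print(numtimes)  # {16: 77, 28: 22, 8: 14}
```

Using an integer product here sidesteps the floating-point exp(sum(log(...))) trick that the SQL needs because it has no product aggregate.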

Here is the query:

with attributecount1 as (
      select c.AttributeId, count(*) as cnt
      from _Combinations c
      group by c.AttributeId
     ),
     const as (
      select exp(sum(log(cnt))) as tot, count(*) as numattr
      from attributecount1
     ),
     attributecount as (
       select a.*,
              (tot / a.cnt) as numtimes
       from attributecount1 a cross join const
     ),
     thevalues as (
      select c.AttributeId, c.Value, ac.numtimes, 1 as seqnum
      from AttributeCount ac join
           _Combinations c
           on ac.AttributeId = c.AttributeId
      union all
      select v.AttributeId, v.Value, v.numtimes, v.seqnum + 1
      from thevalues v
      where v.seqnum + 1 <= v.numtimes
     )
select row_number() over (partition by AttributeId order by seqnum, Value) as groupnum,
      *
from thevalues
order by 1, 2

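The SQL Fiddle is here.

EDIT:

Unfortunately, I don't have access to SQL Server today, and SQL Fiddle is not working.

The problem is solvable. The solution above works, but, as mentioned in my comment, only when the dimensions are pairwise coprime. The problem is the assignment of group numbers to values. It turns out this is a number theory problem.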
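
One way to see the coprimality condition is the sketch below. It rests on an assumption that is an interpretation of the first query, not something the answer states: that numbering rows with row_number() per attribute, ordered by seqnum and then Value, effectively gives group g the value at index g % cnt within each attribute. The function name `distinct_combinations` is made up for this illustration.

```python
from math import prod

def distinct_combinations(counts):
    """Count distinct tuples when group g takes value index g % cnt per attribute."""
    total = prod(counts)
    return len({tuple(g % cnt for cnt in counts) for g in range(total)})

# First dataset: 2, 7 and 11 are pairwise coprime, so every combination appears.
print(distinct_combinations([2, 7, 11]), "of", prod([2, 7, 11]))  # 154 of 154

# Second dataset: 2 and 6 share a factor, so combinations repeat.
print(distinct_combinations([2, 6, 11]), "of", prod([2, 6, 11]))  # 66 of 132
```

Under that assumption, the modulo assignment only cycles through lcm(counts) distinct tuples, which equals the full product exactly when the counts are pairwise coprime.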

Basically, we want to enumerate the combinations. If there were two values in each of two groups, it would look like this:

group 0:  1    1
group 1:  1    2
group 2:  2    1
group 3:  2    2

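You can see the relationship between the group number and the assigned values by looking at the binary representation of the group number. If this were 2x3, it would look like this: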

group 0:  1    1
group 1:  1    2
group 2:  1    3
group 3:  2    1
group 4:  2    2
group 5:  2    3

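Same idea, but there is no "binary" representation. Each position in the number has a different base. No problem.

So the challenge is to map a number (such as the group number) to each of its digits. This requires the appropriate division and modulo operations.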
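
The division and modulo steps can be sketched in a few lines of Python. This is an illustration only, not the answer's code; the function name `decode` and the convention that the last attribute varies fastest (as in the 2x3 listing above) are choices made for this example.

```python
def decode(group, radices):
    """Map a 0-based group number to one 0-based value index per attribute.

    radices[i] is the number of values of attribute i. The last attribute
    varies fastest, so it is peeled off first with divmod.
    """
    digits = []
    for radix in reversed(radices):
        group, digit = divmod(group, radix)
        digits.append(digit)
    return digits[::-1]

# Reproduce the 2x3 listing (values shown 1-based, as in the text).
for g in range(6):
    print(f"group {g}:", [d + 1 for d in decode(g, [2, 3])])
```

Because each group number decodes to a different digit tuple, this numbering yields every combination exactly once regardless of whether the counts share factors.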

The following implements this in Postgres:

with c as (
      select 1 as attrid, '1' as val union all
      select 1 as attrid, '2' as val union all
      select 2 as attrid, 'A' as val union all
      select 2 as attrid, 'B' as val union all
      select 3 as attrid, '10' as val union all
      select 3 as attrid, '20' as val 
     ),
     c1 as (
       select c.*, dense_rank() over (order by attrid) as attrnum,
              dense_rank() over (partition by attrid order by val) as valnum,
              count(*) over (partition by attrid) as cnt
       from c
     ),
     a1 as (
       select attrid, count(*) as cnt,
              cast(round(exp(sum(ln(count(*))) over (order by attrid rows between unbounded preceding and current row))) as int)/count(*) as cum
       from c
       group by attrid
     ),
     a2 as (
       select a.*,
              (select cast(round(exp(sum(ln(cnt)))) as int)
               from a1
               where a1.attrid <= a.attrid
              ) / cnt as cum
       from a1 a
     ),
     const as (
       select cast(round(exp(sum(ln(cnt)))) as int) as numrows
       from a1
     ),
     nums as (
       select 1 as n union all select 2 union all select 3 union all select 4 union all
       select 5 union all select 6 union all select 7 union all select 8
       from const
     ),
     ac as (
      select c1.*, a1.cum, const.numrows
      from c1 join
           a1 on c1.attrid = a1.attrid cross join
           const
     )
select *
from nums join
     ac
     on (nums.n/cum) % cnt = valnum - 1
order by 1, 2;

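(Note: generate_series() was not working correctly in certain joins for some reason, which is why the series of numbers is generated manually.)

When SQL Fiddle is working again, I should be able to translate this to SQL Server.

EDIT II:

Here is a version that works in SQL Server: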

with attributecount1 as (
      select c.AttributeId, count(*) as cnt
      from _Combinations c
      group by c.AttributeId
     ),
     const as (
      select cast(round(exp(sum(log(cnt))), 1) as int) as tot, count(*) as numattr
      from attributecount1
     ),
     attributecount as (
       select a.*,
              (tot / a.cnt) as numtimes,
              (select cast(round(exp(sum(log(ac1.cnt))), 1) as int)
               from attributecount1 ac1
               where ac1.AttributeId <= a.AttributeId
              ) / a.cnt as cum
       from attributecount1 a cross join const
     ),
     c as (
       select c.*, ac.numtimes, ac.cum, ac.cnt,
              dense_rank() over (order by c.AttributeId) as attrnum,
              dense_rank() over (partition by c.AttributeId order by Value) as valnum
       from _Combinations c join
            AttributeCount ac
            on ac.AttributeId = c.AttributeId
     ),
     nums as (
       select 1 as n union all
       select 1 + n
       from nums cross join const
       where 1 + n <= const.tot
     )
select *
from nums join
     c
     on (nums.n / c.cum)%c.cnt = c.valnum - 1
option (MAXRECURSION 1000)

The SQL Fiddle is here.

score 1 · Accepted Answer

Years ago I faced a similar problem with a fixed EAV schema just like yours. Peter Larsson came up with the solution below to handle my "dynamic combinations" query.

I have adjusted it to fit your schema. Hope this helps!

SqlFiddle here

;with cteSource (Iteration, AttributeID, recID, Items, Unq, Perm) as 
(   
    select  v.Number + 1,
            s.AttributeId,
            row_number() over (order by v.Number, s.AttributeID) - 1,
            s.Items,
            u.Unq,
            f.Perm
    from    (select AttributeID, count(*) from  _Combinations group by AttributeID) s(AttributeId, Items)
    cross 
    join    (select count(distinct AttributeID) from _Combinations) u (Unq)
    join    master..spt_values as v on v.Type = 'P'
    outer 
    apply   (
                select  top(1) cast(exp(sum(log(count(*))) over ()) as bigint)
                from    _Combinations as w
                where   w.AttributeID >= s.AttributeID
                group 
                by      w.AttributeID
                having  count(*) > 1
            ) as f(Perm)
    where   v.Number < (select top(1) exp(sum(log(count(*))) over()) from _Combinations as x group by x.AttributeID)
)
select  s.Iteration,
        s.AttributeID,
        w.Value     
from    cteSource as s
cross 
apply   (
            select  Value,
                    row_number() over (order by Value) - 1
            from    _Combinations
            where   AttributeID = s.AttributeID
        ) w(Value, recID)
where   coalesce(s.recID / (s.Perm * s.Unq / s.Items), 0) % s.Items = w.recID
order 
by      s.Iteration, s.AttributeId;

score 0 · Accepted Answer

As for the problem with exp(sum(log(count(*))) over ()), introducing the ROUND function into the mix seemed to be the answer for me. So the following snippet appears to produce reliable answers (at least so far):

ROUND(exp(sum(log(count(*))) over ()), 0)
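
The reason rounding helps can be reproduced with ordinary floating-point arithmetic. The Python sketch below is an illustration, not SQL Server's exact behavior; `counts` is hand-copied from the second dataset, and the precise value of `approx` depends on the platform's math library.

```python
import math

# Value counts per attribute in the second dataset (hand-copied).
counts = [2, 6, 11]

# Product via exp(sum(log)): a float that is only approximately 132.
approx = math.exp(sum(math.log(c) for c in counts))

# Exact integer product for comparison.
exact = math.prod(counts)

print(repr(approx))            # may be a hair above or below 132.0
print(round(approx) == exact)  # True
```

If the float lands slightly below the true product, truncating it to an integer loses a whole unit, whereas rounding to the nearest integer recovers the exact value.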

score 0 · Accepted Answer

Recursive solution

Below is a recursive solution. The SQLFiddle is here.

with a as ( -- unique AttributeIDs
  select AttributeID
        ,Row_Number() over(order by AttributeID) as rowNo
        ,count(*) as cnt
    from [dbo].[_Combinations]
  group by AttributeID
),
r as (
  -- start recursion: list all values of the first attribute
  select Dense_Rank() over(order by c.[Value]) - 1 as GroupKey
        ,c.AttributeID
        ,c.[Value]
        ,a.cnt as factor
        ,1 as level
    from a
         join [dbo].[_Combinations] as c on a.AttributeID = c.AttributeID
   where a.rowNo = 1

  union all

  -- recursion step: add the combinations with the values of the next attribute
  select GroupKey
        ,case when AttributeID = 'prev' then prevAttribID else currAttribID end as AttributeID
        ,[Value]
        ,factor
        ,level
    from (select r.Value as prev
                ,c.Value as curr
                ,(Dense_Rank() over(order by c.[Value]) - 1) * r.factor + r.GroupKey as GroupKey
                ,r.level + 1 as level
                ,r.factor * a.cnt as factor
                ,r.AttributeID as prevAttribID
                ,a.AttributeID as currAttribID
            from r
                 join a on r.level + 1 = a.rowNo
                 join [dbo].[_Combinations] as c on a.AttributeID = c.AttributeID
         ) as p
         unpivot ( Value for AttributeID in (prev, curr)) as up
)
-- get result: this is the data from the deepest level
select distinct
       GroupKey + 1 as GroupKey -- start with one instead of zero
      ,AttributeID
      ,[Value]
  from r
 where level = (select count(*) from a)
order by GroupKey, AttributeID, [Value]
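
The numbering used by the recursion (new key = value rank * factor + previous key, with the factor multiplied by the attribute's value count at each level) can be traced with a small Python sketch. This is a simplified model for illustration, not a translation of the query; `build_groups` and its input shape are hypothetical.

```python
def build_groups(attributes):
    """attributes: list of (attribute_id, sorted list of values).

    Returns {group_key: [(attribute_id, value), ...]} with 0-based keys.
    """
    groups = {0: []}
    factor = 1  # number of groups built so far
    for attr_id, values in attributes:
        next_groups = {}
        for rank, value in enumerate(values):
            for key, members in groups.items():
                # Same arithmetic as the recursion step of the CTE.
                next_groups[rank * factor + key] = members + [(attr_id, value)]
        groups = next_groups
        factor *= len(values)
    return groups

groups = build_groups([(8, ["A", "B"]), (16, ["1", "2"]), (28, ["Red"])])
for key in sorted(groups):
    print(key + 1, groups[key])  # keys shown 1-based, as in the final select
```

Every step uses integer arithmetic only, so the keys stay exact no matter how many attributes are combined.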

Dynamic solution

And here is a slightly shorter version that uses a dynamic statement:

declare @stmt varchar(max);
with a as ( -- unique attribute keys, cast here to avoid casting when building the dynamic statement
  select distinct cast(AttributeID as varchar(10)) as ID
    from [dbo].[_Combinations]
)
select @stmt = 'select GroupKey, Cast(SubString(AttributeIDStr, 2, 100) as int) as AttributeID, Value
  from
  (
  select '
  + (select ' C' + ID + '.Value as V' + ID + ', ' from a for xml path(''))
  + ' Row_Number() over(order by '
  + stuff((select ', C' + ID + '.Value' from a for xml path('')), 1, 2, '')
  + ') AS GroupKey from '
  + stuff((select ' cross join [dbo].[_Combinations] as C' + ID from a for xml path('')), 1, 11, '')
  + ' where ' 
  + stuff((select ' and C' + ID + '.AttributeID = ' + ID from a for xml path('')), 1, 4, '')
  + ')  as p unpivot (Value for AttributeIDStr in ('
  + stuff((select ', V' + ID from a for xml path('')), 1, 2, '')
  + ')) as up'
;
exec (@stmt)

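SQL Server lacks the nice list aggregate function that other databases have, so you have to use the ugly stuff((select ... for xml path(''))) expression.

The statement generated for the sample data is, apart from whitespace differences, as follows: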

select GroupKey, Cast(SubString(AttributeIDStr, 2, 100) as int) as AttributeID, Value
  from
  (
  select C16.Value as V16
        ,C28.Value as V28
        ,C8.Value  as V8
        ,Row_Number() over(order by C16.Value, C28.Value, C8.Value) AS GroupKey
    from [dbo].[_Combinations] as C16
         cross join
         [dbo].[_Combinations] as C28
         cross join
         [dbo].[_Combinations] as C8
   where C16.AttributeID = 16
     and C28.AttributeID = 28
     and C8.AttributeID = 8
  )  as p
  unpivot ( Value for AttributeIDStr in (V16, V28, V8)) as up

Both solutions avoid the exp(log()) workaround for a product aggregate used in the other answers, which is very sensitive to rounding errors.
