sql - Efficiently handling inheritance with overrides

Question

I have the following two data structures.

First, a list of properties applied to object triples:

Object1  Object2  Object3 Property  Value
     O1       O2       O3       P1  "abc"
     O1       O2       O3       P2  "xyz"
     O1       O3       O4       P1  "123"
     O2       O4       O5       P1  "098"

Second, an inheritance tree, shown here as a relation:

Object    Parent
    O2        O1
    O4        O2
    O3        O1
    O5        O3
    O1      null

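The semantics are that O2 inherits properties from O1; O4 inherits from O2 and O1; O3 inherits from O1; and O5 inherits from O3 and O1, in that order of precedence.
Note 1: I have an efficient way to select all children or all parents of a given object. This is currently implemented with left and right indexes, but hierarchyid could also work. This does not seem important right now.
Note 2: I have triggers in place that make sure the "Object" column always contains every possible object, even when it does not strictly have to be there (i.e. it has no parent or children defined). This makes it possible to use inner joins rather than the significantly less efficient outer joins.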
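As a rough illustration of the left/right index scheme mentioned in Note 1, here is a minimal Python sketch. The concrete index values are an assumption chosen for this example (they are not given in the question); the only property that matters is that a child's interval is nested inside its parent's.

```python
# Hypothetical left/right (nested-set) numbering for the tree above.
# The concrete numbers are an assumption made for this sketch.
INTERVALS = {
    "O1": (1, 10),
    "O2": (2, 5),
    "O4": (3, 4),
    "O3": (6, 9),
    "O5": (7, 8),
}


def descendants_or_self(obj):
    """Objects whose left index falls inside obj's [left, right] interval."""
    left, right = INTERVALS[obj]
    return sorted(o for o, (l, _) in INTERVALS.items() if left <= l <= right)


def ancestors_or_self(obj):
    """Objects whose interval contains obj's left index, root first."""
    left = INTERVALS[obj][0]
    hits = [o for o, (l, r) in INTERVALS.items() if l <= left <= r]
    return sorted(hits, key=lambda o: INTERVALS[o][0])


print(descendants_or_self("O2"))
print(ancestors_or_self("O5"))
```

Both lookups are plain range comparisons, which is why an ordinary index on the left index column can serve them.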

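The objective is: given a (Property, Value) pair, return all object triples that have that property with that value, either defined explicitly or inherited from a parent.

Note 1: An object triple (X,Y,Z) is considered a "parent" of triple (A,B,C) when either X = A or X is a parent of A, and the same is true for (Y,B) and (Z,C).
Note 2: A property defined on a closer parent "overrides" the same property defined on a more distant parent.
Note 3: When (A,B,C) has two parents, (X1,Y1,Z1) and (X2,Y2,Z2), then (X1,Y1,Z1) is considered the "closer" parent when: (a) X2 is a parent of X1, or (b) X2 = X1 and Y2 is a parent of Y1, or (c) X2 = X1 and Y2 = Y1 and Z2 is a parent of Z1.

In other words, "closeness" in the ancestry of triples is defined first by the first components of the triples, then by the second components, then by the third components. This rule establishes an unambiguous partial order on triples in terms of ancestry.

For example, given the pair (P1, "abc"), the result set of triples is: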

 O1, O2, O3     -- Defined explicitly
 O1, O2, O5     -- Because O5 inherits from O3
 O1, O4, O3     -- Because O4 inherits from O2
 O1, O4, O5     -- Because O4 inherits from O2 and O5 inherits from O3
 O2, O2, O3     -- Because O2 inherits from O1
 O2, O2, O5     -- Because O2 inherits from O1 and O5 inherits from O3
 O2, O4, O3     -- Because O2 inherits from O1 and O4 inherits from O2
 O3, O2, O3     -- Because O3 inherits from O1
 O3, O2, O5     -- Because O3 inherits from O1 and O5 inherits from O3
 O3, O4, O3     -- Because O3 inherits from O1 and O4 inherits from O2
 O3, O4, O5     -- Because O3 inherits from O1 and O4 inherits from O2 and O5 inherits from O3
 O4, O2, O3     -- Because O4 inherits from O1
 O4, O2, O5     -- Because O4 inherits from O1 and O5 inherits from O3
 O4, O4, O3     -- Because O4 inherits from O1 and O4 inherits from O2
 O5, O2, O3     -- Because O5 inherits from O1
 O5, O2, O5     -- Because O5 inherits from O1 and O5 inherits from O3
 O5, O4, O3     -- Because O5 inherits from O1 and O4 inherits from O2
 O5, O4, O5     -- Because O5 inherits from O1 and O4 inherits from O2 and O5 inherits from O3

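Note that triple (O2, O4, O5) is absent from this list. This is because property P1 is defined explicitly for (O2, O4, O5), which prevents that triple from inheriting the property from (O1, O2, O3). Also note that triple (O4, O4, O5) is absent. That triple inherits its value P1="098" from (O2, O4, O5), because (O2, O4, O5) is a closer parent than (O1, O2, O3).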
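The rules in Notes 1 to 3 can be checked against this example with a small brute-force sketch. It is only a reference model in Python, not the SQL under discussion; the names `chain`, `effective_value` and `triples_with` are made up for this illustration, and "parent" is read as "ancestor at any distance".

```python
from itertools import product

# Inheritance tree from the question: object -> parent.
PARENT = {"O1": None, "O2": "O1", "O3": "O1", "O4": "O2", "O5": "O3"}

# Explicit definitions: (triple, property) -> value.
TP = {
    (("O1", "O2", "O3"), "P1"): "abc",
    (("O1", "O2", "O3"), "P2"): "xyz",
    (("O1", "O3", "O4"), "P1"): "123",
    (("O2", "O4", "O5"), "P1"): "098",
}


def chain(obj):
    """obj followed by its ancestors, nearest first."""
    out = []
    while obj is not None:
        out.append(obj)
        obj = PARENT[obj]
    return out


def effective_value(triple, prop):
    """Value of prop on the closest parent triple (or the triple itself), else None."""
    chains = [chain(o) for o in triple]
    best_key, best_val = None, None
    for (defined, p), value in TP.items():
        if p != prop:
            continue
        # Note 1: each component must equal, or be an ancestor of, the target's.
        if not all(d in c for d, c in zip(defined, chains)):
            continue
        # Note 3: smaller distance wins, compared component by component.
        key = tuple(c.index(d) for d, c in zip(defined, chains))
        if best_key is None or key < best_key:
            best_key, best_val = key, value
    return best_val


def triples_with(prop, value):
    """All triples whose effective value for prop equals value."""
    objs = sorted(PARENT)
    return [t for t in product(objs, repeat=3) if effective_value(t, prop) == value]


result = triples_with("P1", "abc")
print(len(result))
```

On the sample data this yields the 18 triples listed above, with (O2, O4, O5) and (O4, O4, O5) excluded.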

The straightforward way to do it is as follows. First, for every triple that has a property defined on it, select all possible child triples:

select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value
from TriplesAndProperties tp

-- Select corresponding objects of the triple
inner join Objects as Objects1 on Objects1.Id = tp.O1
inner join Objects as Objects2 on Objects2.Id = tp.O2
inner join Objects as Objects3 on Objects3.Id = tp.O3

-- Then add all possible children of all those objects
inner join Objects as Children1 on Objects1.Id [isparentof] Children1.Id
inner join Objects as Children2 on Objects2.Id [isparentof] Children2.Id
inner join Objects as Children3 on Objects3.Id [isparentof] Children3.Id

But this is not the whole story. If some triple inherits the same property from several parents, this query yields conflicting results. So the second step is to select just one of those conflicting results:

select * from
(
    select 
        Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value,
        row_number() over( 
            partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
            order by Objects1.[depthInTheTree] desc, Objects2.[depthInTheTree] desc, Objects3.[depthInTheTree] desc
        )
        as InheritancePriority
    from
    ... (see above)
) as x
where InheritancePriority = 1

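The window function row_number() over( ... ) does the following: for every unique combination of object triple and property, it sorts all candidate values by the ancestral distance between the triple and the parent the value is inherited from, and I then select only the first entry of the resulting list. A similar effect can be achieved with GROUP BY and ORDER BY, but I find the window function semantically cleaner (the execution plans they produce are identical). The point is that I need to select the closest of the contributing ancestors, and for that I need to group and then sort within each group.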
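The two steps (fan out to child triples, then keep one row per group) can be mimicked in a few lines of Python. This is only a model of the query's logic under the sample data, not a replacement for it; the helper names are invented for the sketch. Ordering by depth works because every candidate parent of a given component lies on the same ancestor chain, so the deeper one is the closer one.

```python
from itertools import product

PARENT = {"O1": None, "O2": "O1", "O3": "O1", "O4": "O2", "O5": "O3"}

# Rows of the properties table: (Object1, Object2, Object3, Property, Value).
TP = [
    ("O1", "O2", "O3", "P1", "abc"),
    ("O1", "O2", "O3", "P2", "xyz"),
    ("O1", "O3", "O4", "P1", "123"),
    ("O2", "O4", "O5", "P1", "098"),
]


def ancestors_or_self(obj):
    out = []
    while obj is not None:
        out.append(obj)
        obj = PARENT[obj]
    return out


def children_or_self(obj):
    return [o for o in PARENT if obj in ancestors_or_self(o)]


def depth(obj):
    return len(ancestors_or_self(obj)) - 1


def resolve():
    # Step 1 (the joins): every defined triple fans out to all child triples.
    candidates = []
    for o1, o2, o3, prop, value in TP:
        key = (depth(o1), depth(o2), depth(o3))
        for child in product(children_or_self(o1), children_or_self(o2), children_or_self(o3)):
            candidates.append((child, prop, key, value))
    # Step 2 (row_number() = 1): per (child triple, property) keep the deepest parent.
    best = {}
    for child, prop, key, value in candidates:
        slot = (child, prop)
        if slot not in best or key > best[slot][0]:
            best[slot] = (key, value)
    return {slot: value for slot, (_, value) in best.items()}


resolved = resolve()
matches = sorted(t for (t, p), v in resolved.items() if p == "P1" and v == "abc")
print(len(matches))
```

Note that the sketch keeps a running maximum per group instead of sorting, which is the same result the window function produces with a full sort.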

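Finally, I can now simply filter the result set by Property and Value.

This scheme works. It is very reliable and predictable, and it has proven very powerful for the business task it implements.

The only trouble is that it is awfully slow.
One might point out that joining seven tables could be slowing things down, but that is actually not the bottleneck.

According to the actual execution plan I get from SQL Server Management Studio (as well as SQL Profiler), the bottleneck is the sorting. The problem is that, in order to satisfy my window function, the server has to sort by Children1.Id, Children2.Id, Children3.Id, tp.Property, Objects1.[depthInTheTree] desc, Objects2.[depthInTheTree] desc, Objects3.[depthInTheTree] desc, and there is no index it can use, because the values come from a cross join of several tables.

EDIT: Following Michael Buen's suggestion (thank you, Michael), I have posted the whole puzzle on sqlfiddle. The execution plan shows that the Sort operation accounts for 32% of the whole query, and that share will grow with the total number of rows, because all the other operations use indexes.

Usually in such cases I would use an indexed view, but not here, because indexed views cannot contain self-joins, of which there are six.

The only way I can think of so far is to create six copies of the Objects table and use them for the joins, thus enabling an indexed view.
Has the time come when I am reduced to that kind of hack? Despair sets in.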

score 2 · Accepted Answer

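I have three possible answers.

The SQL fiddle for your question is here: http://sqlfiddle.com/#!3/7c7a0/3/0

The SQL fiddle for my answer is here: http://sqlfiddle.com/#!3/5d257/1

Warnings:

1. Query Analyzer is not enough - I notice that several answers have been rejected because their query plans are more expensive than the original query's. The analyzer is only a guide. Depending on the actual data set, hardware, and use case, a more expensive query can return results faster than a cheaper one. You have to test in your own environment.
2. Query Analyzer is ineffective - even if you find a way to remove the "most expensive step" from a query, it often makes no difference to the query.
3. Query changes alone rarely mitigate schema/design problems - some answers have been rejected because they involve schema-level changes such as triggers and additional tables. Complex queries that resist optimization are a strong sign that the problem lies in the underlying design or in your expectations. You may not like it, but you may have to accept that the problem cannot be solved at the query level.
4. An indexed view cannot contain a row_number()/partition clause - working around the self-join problem by creating six copies of the Objects table is not enough to let you create the indexed view you suggested. I tried it in a sqlfiddle. If you uncomment the last "create index" statement, you get an error because the view "contains a ranking or aggregate window function".

Working answers:

1. Left join instead of row_number() - you can use a query with left joins to exclude results that have been overridden lower in the tree. Removing the final "order by" from this query actually removes the sort that has been plaguing you! The execution plan for this query is still more expensive than your original, but see warning #1 above.
2. Indexed view for part of the query - using some serious query magic (based on the technique credited in the schema comments), I have created an indexed view for part of the query. This view can be used to enhance the original question query or answer #1.
3. Actualize into a well-indexed table - someone else suggested this answer, but they may not have explained it well. Unless your result set is very large or you update the source tables very frequently, actualizing the results of the query and using a trigger to keep them up to date is a perfectly fine way to work around this kind of issue. Once you have created a view for your query, this option is easy to test. You can reuse answer #2 to speed up the trigger and then improve it further over time. (You are talking about creating six copies of your tables, so try this first. It guarantees that the performance of the select you care about will be as good as possible.)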
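To make the idea behind answer #1 concrete, here is a small Python model of the anti-join: a row survives only if no other row for the same child triple and property comes from a closer parent. The object ids and left/right indexes match the sample inserts in the schema; the function names are invented for this sketch, and the parent keys are compared lexicographically, following Note 3 of the question, which is a choice made for this sketch.

```python
from itertools import product

# Id -> (LeftIndex, RightIndex), matching the sample inserts in the schema.
OBJECTS = {1: (1, 10), 2: (2, 5), 3: (6, 9), 4: (3, 4), 5: (7, 8)}

TP = [
    (1, 2, 3, "P1", "abc"),
    (1, 2, 3, "P2", "xyz"),
    (1, 3, 4, "P1", "123"),
    (2, 4, 5, "P1", "098"),
]


def children(obj):
    """Ids whose LeftIndex lies between obj's LeftIndex and RightIndex."""
    left, right = OBJECTS[obj]
    return [o for o, (l, _) in OBJECTS.items() if left <= l <= right]


def intermediate():
    """Rows shaped like TPIntermediate: (child triple, property, value, parent left indexes)."""
    rows = []
    for o1, o2, o3, prop, value in TP:
        pl = (OBJECTS[o1][0], OBJECTS[o2][0], OBJECTS[o3][0])
        for child in product(children(o1), children(o2), children(o3)):
            rows.append((child, prop, value, pl))
    return rows


def anti_join(rows):
    """Keep row A unless some row B for the same child triple and property has a greater parent key."""
    kept = []
    for a in rows:
        overridden = any(
            b[0] == a[0] and b[1] == a[1] and b[3] > a[3]
            for b in rows
        )
        if not overridden:
            kept.append(a)
    return kept


result = anti_join(intermediate())
abc = [r[0] for r in result if r[1] == "P1" and r[2] == "abc"]
print(len(abc))
```

No sort is involved: each row is tested against its competitors, which is the same trade the left join makes in SQL (more comparisons, no ordering step).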

Here is the schema portion of my answer from sqlfiddle:

Create Table Objects
(
    Id int not null identity primary key,
    LeftIndex int not null default 0,
    RightIndex int not null default 0
)

alter table Objects add ParentId int null references Objects

CREATE TABLE TP
(
    Object1 int not null references Objects,
    Object2 int not null references Objects,
    Object3 int not null references Objects,
    Property varchar(20) not null,
    Value varchar(50) not null
)


insert into Objects(LeftIndex, RightIndex) values(1, 10)
insert into Objects(ParentId, LeftIndex, RightIndex) values(1, 2, 5)
insert into Objects(ParentId, LeftIndex, RightIndex) values(1, 6, 9)
insert into Objects(ParentId, LeftIndex, RightIndex) values(2, 3, 4)
insert into Objects(ParentId, LeftIndex, RightIndex) values(3, 7, 8)

insert into TP(Object1, Object2, Object3, Property, Value) values(1,2,3, 'P1', 'abc')
insert into TP(Object1, Object2, Object3, Property, Value) values(1,2,3, 'P2', 'xyz')
insert into TP(Object1, Object2, Object3, Property, Value) values(1,3,4, 'P1', '123')
insert into TP(Object1, Object2, Object3, Property, Value) values(2,4,5, 'P1', '098')

create index ix_LeftIndex on Objects(LeftIndex)
create index ix_RightIndex on Objects(RightIndex)
create index ix_Objects on TP(Property, Value, Object1, Object2, Object3)
create index ix_Prop on TP(Property)
GO

---------- QUESTION ADDITIONAL SCHEMA --------
CREATE VIEW TPResultView AS
Select O1, O2, O3, Property, Value
FROM
(
    select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value,

    row_number() over( 
        partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
        order by Objects1.LeftIndex desc, Objects2.LeftIndex desc, Objects3.LeftIndex desc
    )
    as Idx

    from tp

    -- Select corresponding objects of the triple
    inner join Objects as Objects1 on Objects1.Id = tp.Object1
    inner join Objects as Objects2 on Objects2.Id = tp.Object2
    inner join Objects as Objects3 on Objects3.Id = tp.Object3

    -- Then add all possible children of all those objects
    inner join Objects as Children1 on Children1.LeftIndex between Objects1.LeftIndex and Objects1.RightIndex
    inner join Objects as Children2 on Children2.LeftIndex between Objects2.LeftIndex and Objects2.RightIndex
    inner join Objects as Children3 on Children3.LeftIndex between Objects3.LeftIndex and Objects3.RightIndex
) as x
WHERE idx = 1 
GO

---------- ANSWER 1 SCHEMA --------

CREATE VIEW TPIntermediate AS
select tp.Property, tp.Value 
    , Children1.Id as O1, Children2.Id as O2, Children3.Id as O3
    , Objects1.LeftIndex as PL1, Objects2.LeftIndex as PL2, Objects3.LeftIndex as PL3    
    , Children1.LeftIndex as CL1, Children2.LeftIndex as CL2, Children3.LeftIndex as CL3    
    from tp

    -- Select corresponding objects of the triple
    inner join Objects as Objects1 on Objects1.Id = tp.Object1
    inner join Objects as Objects2 on Objects2.Id = tp.Object2
    inner join Objects as Objects3 on Objects3.Id = tp.Object3

    -- Then add all possible children of all those objects
    inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between Objects1.LeftIndex and Objects1.RightIndex
    inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between Objects2.LeftIndex and Objects2.RightIndex
    inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between Objects3.LeftIndex and Objects3.RightIndex
GO

---------- ANSWER 2 SCHEMA --------

-- Partial calculation using an indexed view
-- Circumvented the self-join limitation using a black magic technique, based on 
-- http://jmkehayias.blogspot.com/2008/12/creating-indexed-view-with-self-join.html
CREATE TABLE dbo.multiplier (i INT PRIMARY KEY)

INSERT INTO dbo.multiplier VALUES (1) 
INSERT INTO dbo.multiplier VALUES (2) 
INSERT INTO dbo.multiplier VALUES (3) 
GO

CREATE VIEW TPIndexed
WITH SCHEMABINDING
AS

SELECT tp.Object1, tp.object2, tp.object3, tp.property, tp.value,
    SUM(ISNULL(CASE M.i WHEN 1 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL1,
    SUM(ISNULL(CASE M.i WHEN 2 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL2,
    SUM(ISNULL(CASE M.i WHEN 3 THEN Objects.LeftIndex ELSE NULL END, 0)) as PL3,
    SUM(ISNULL(CASE M.i WHEN 1 THEN Objects.RightIndex ELSE NULL END, 0)) as PR1,
    SUM(ISNULL(CASE M.i WHEN 2 THEN Objects.RightIndex ELSE NULL END, 0)) as PR2,
    SUM(ISNULL(CASE M.i WHEN 3 THEN Objects.RightIndex ELSE NULL END, 0)) as PR3,
    COUNT_BIG(*) as ID
    FROM dbo.tp
    cross join dbo.multiplier M 
    inner join dbo.Objects 
    on (M.i = 1 AND Objects.Id = tp.Object1)
    or (M.i = 2 AND Objects.Id = tp.Object2)
    or (M.i = 3 AND Objects.Id = tp.Object3)
GROUP BY tp.Object1, tp.object2, tp.object3, tp.property, tp.value
GO

-- This index is mostly useless but required
create UNIQUE CLUSTERED index pk_TPIndexed on dbo.TPIndexed(property, value, object1, object2, object3)
-- Once we have the clustered index, we can create a nonclustered that actually addresses our needs
create NONCLUSTERED index ix_TPIndexed on dbo.TPIndexed(property, value, PL1, PL2, PL3, PR1, PR2, PR3)
GO

-- NOTE: this view is not indexed, but it uses the indexed view
CREATE VIEW TPIndexedResultView AS
Select O1, O2, O3, Property, Value
FROM
(
    select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value,

    row_number() over( 
        partition by tp.Property, Children1.Id, Children2.Id, Children3.Id
        order by tp.Property, Tp.PL1 desc, Tp.PL2 desc, Tp.PL3 desc
    )
    as Idx

    from TPIndexed as TP WITH (NOEXPAND)

    -- Then add all possible children of all those objects
    inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between TP.PL1 and TP.PR1
    inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between TP.PL2 and TP.PR2
    inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between TP.PL3 and TP.PR3
) as x
WHERE idx = 1 
GO


-- NOTE: this view is not indexed, but it uses the indexed view
CREATE VIEW TPIndexedIntermediate AS
select tp.Property, tp.Value 
    , Children1.Id as O1, Children2.Id as O2, Children3.Id as O3
    , PL1, PL2, PL3    
    , Children1.LeftIndex as CL1, Children2.LeftIndex as CL2, Children3.LeftIndex as CL3    
    from TPIndexed as TP WITH (NOEXPAND)

    -- Then add all possible children of all those objects
    inner join Objects as Children1 WITH (INDEX(ix_LeftIndex)) on Children1.LeftIndex between TP.PL1 and TP.PR1
    inner join Objects as Children2 WITH (INDEX(ix_LeftIndex)) on Children2.LeftIndex between TP.PL2 and TP.PR2
    inner join Objects as Children3 WITH (INDEX(ix_LeftIndex)) on Children3.LeftIndex between TP.PL3 and TP.PR3  
GO


---------- ANSWER 3 SCHEMA --------
-- You're talking about making six copies of the TP table
-- If you're going to go that far, you might as well, go the trigger route
-- The performance profile is much the same - slower on insert, faster on read
-- And instead of still recalculating on every read, you'll be recalculating
-- only when the data changes. 

CREATE TABLE TPResult
(
    Object1 int not null references Objects,
    Object2 int not null references Objects,
    Object3 int not null references Objects,
    Property varchar(20) not null,
    Value varchar(50) not null
)
GO

create UNIQUE index ix_Result on TPResult(Property, Value, Object1, Object2, Object3)


--You'll have to imagine this trigger, sql fiddle doesn't want to do it
--CREATE TRIGGER tr_TP
--ON TP
--  FOR INSERT, UPDATE, DELETE
--AS
--  DELETE FROM TPResult
-- -- For this example we'll just insert into the table once
INSERT INTO TPResult 
SELECT O1, O2, O3, Property, Value 
FROM TPResultView

Here is the query portion of my answer from sqlfiddle:

-------- QUESTION QUERY ----------
-- Original query, modified to use the view I added
SELECT O1, O2, O3, Property, Value 
FROM TPResultView
WHERE property = 'P1' AND value = 'abc'
-- Your assertion is that this order by is the most expensive part. 
-- Sometimes converting queries into views allows the server to
-- Optimize them better over time.
-- NOTE: removing this order by has no effect on this query.
-- ORDER BY O1, O2, O3
GO

-------- ANSWER 1  QUERY ----------
-- A different way to get the same result. 
-- Query optimizer says this is more expensive, but I've seen cases where
-- it says a query is more expensive but it returns results faster.
SELECT O1, O2, O3, Property, Value
FROM (
  SELECT A.O1, A.O2, A.O3, A.Property, A.Value
  FROM TPIntermediate A
  LEFT JOIN TPIntermediate B ON A.O1 = B.O1
    AND A.O2 = B.O2
    AND A.O3 = B.O3
    AND A.Property = B.Property
    AND 
    (
      -- Find any rows with Parent LeftIndex triplet that is greater than this one
      (A.PL1 < B.PL1
      AND A.PL2 < B.PL2
      AND A.PL3 < B.PL3) 
    OR
      -- Find any rows with LeftIndex triplet that is greater than this one
      (A.CL1 < B.CL1
      AND A.CL2 < B.CL2
      AND A.CL3 < B.CL3)
    )
  -- If this row has any rows that match the previous two cases, exclude it
  WHERE B.O1 IS NULL ) AS x
WHERE property = 'P1' AND value = 'abc'
-- NOTE: Removing this order _DOES_ reduce query cost removing the "sort" action
-- that has been the focus of your question.   
-- However, it wasn't clear from your question whether this order by was required.
--ORDER BY O1, O2, O3
GO

-------- ANSWER 2  QUERIES ----------
-- Same as above but using an indexed view to partially calculate results

SELECT O1, O2, O3, Property, Value 
FROM TPIndexedResultView
WHERE property = 'P1' AND value = 'abc'
-- Your assertion is that this order by is the most expensive part. 
-- Sometimes converting queries into views allows the server to
-- Optimize them better over time.
-- NOTE: removing this order by has no effect on this query.
--ORDER BY O1, O2, O3
GO

SELECT O1, O2, O3, Property, Value
FROM (
  SELECT A.O1, A.O2, A.O3, A.Property, A.Value
  FROM TPIndexedIntermediate A
  LEFT JOIN TPIndexedIntermediate B ON A.O1 = B.O1
    AND A.O2 = B.O2
    AND A.O3 = B.O3
    AND A.Property = B.Property
    AND 
    (
      -- Find any rows with Parent LeftIndex triplet that is greater than this one
      (A.PL1 < B.PL1
      AND A.PL2 < B.PL2
      AND A.PL3 < B.PL3) 
    OR
      -- Find any rows with LeftIndex triplet that is greater than this one
      (A.CL1 < B.CL1
      AND A.CL2 < B.CL2
      AND A.CL3 < B.CL3)
    )
  -- If this row has any rows that match the previous two cases, exclude it
  WHERE B.O1 IS NULL ) AS x
WHERE property = 'P1' AND value = 'abc'
-- NOTE: Removing this order _DOES_ reduce query cost removing the "sort" action
-- that has been the focus of your question.   
-- However, it wasn't clear from your question whether this order by was required.
--ORDER BY O1, O2, O3
GO



-------- ANSWER 3  QUERY ----------
-- Returning results from a pre-calculated table is fast and easy
-- Unless you are doing many more inserts than reads, or your result
-- set is very large, this is a fine way to compensate for a poor design
-- in one area of your database.
SELECT Object1 as O1, Object2 as O2, Object3 as O3, Property, Value 
FROM TPResult
WHERE property = 'P1' AND value = 'abc'
ORDER BY O1, O2, O3

score 0 · Accepted Answer

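Caching is the key to making the query faster. It reduces the amount of calculation you have to do. You create an index because you want to CACHE and save WORK. Below are two possible ways to do this.

Option 1

The SQL database sorts because of your window function, and you say the window function is too slow as it stands.

I don't know how well this will work, but it might.

Instead of sorting by a number of columns, you could try sorting by a single column: "closeness".

Let's define closeness as some abstract integer for now. Then, instead of your window function, you can use the following SQL: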

select * from
(
    select
        Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value,

        row_number() over( 
            partition by Children1.Id, Children2.Id, Children3.Id, tp.Property
            order by closeness DESC
        )
        as InheritancePriority
    from
    ... (see above)
) as x
where InheritancePriority = 1

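closeness can be a column defined in the TriplesAndProperties table. For each object, you can define its "closeness" as its distance from the root node (O1). Then we can define closeness(tuple) = closeness(Object1)*100+closeness(Object2)*10+closeness(Object3)

This way, you want the tuple that is farthest from the root.

To avoid sorting, you must make sure that closeness is indexed.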
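A quick Python sketch of that formula, using the tree from the question. The `base` parameter and the overflow check are additions made for this illustration: packing three depths into decimal digits only preserves the ordering while every depth stays below the base, so a deeper tree would need a larger multiplier.

```python
PARENT = {"O1": None, "O2": "O1", "O3": "O1", "O4": "O2", "O5": "O3"}


def depth(obj):
    """Distance from the root node (O1 has depth 0)."""
    d = 0
    while PARENT[obj] is not None:
        obj = PARENT[obj]
        d += 1
    return d


def closeness(triple, base=10):
    """Pack three depths into one integer; requires every depth < base."""
    d1, d2, d3 = (depth(o) for o in triple)
    assert max(d1, d2, d3) < base, "depth overflows its digit; raise base"
    return d1 * base * base + d2 * base + d3


print(closeness(("O1", "O2", "O3")))
print(closeness(("O2", "O4", "O5")))
```

With base 10 this is exactly the *100 / *10 / *1 weighting above, and the triple farther from the root gets the larger value.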

Option 2

I am quite sure this will work.

Define your TriplesAndProperties table with the following columns: Object1, Object2, Object3, Property, Value, Effective_Object1, Effective_Object2, Effective_Object3, Closeness.

Note that here, too, I define closeness as a column.

When you insert/update a tuple (X, Y, Z) in the table, insert the following instead:

(X,Y,Z,Property,Value,X,Y,Z,0)
(X,Y,Z,Property,Value,X,Y,Z.child,1)
(X,Y,Z,Property,Value,X,Y,Z.grandchild,2)
(X,Y,Z,Property,Value,X,Y.child,Z,10)
(X,Y,Z,Property,Value,X,Y.child,Z.child,11)
(X,Y,Z,Property,Value,X,Y.child,Z.grandchild,12)
(X,Y,Z,Property,Value,X,Y.grandchild,Z,20)
(X,Y,Z,Property,Value,X,Y.grandchild,Z.child,21)
(X,Y,Z,Property,Value,X,Y.grandchild,Z.grandchild,22)
...
...

This means that instead of inserting/updating/deleting one row in the table, you insert up to about 20 rows. That is not too bad.
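The expansion can be sketched in Python as follows. The function names are hypothetical, and the weight of 100 for the first component is an assumption extrapolated from the 10/1 pattern in the list above (the list itself only varies the second and third components). In this encoding Closeness is a distance from the defining triple, so 0 marks the explicitly defined row.

```python
from itertools import product

PARENT = {"O1": None, "O2": "O1", "O3": "O1", "O4": "O2", "O5": "O3"}


def descendants_with_distance(obj):
    """(descendant, generations below obj) pairs, including obj itself at distance 0."""
    out = []
    for o in PARENT:
        steps, cur = 0, o
        while cur is not None and cur != obj:
            cur = PARENT[cur]
            steps += 1
        if cur == obj:
            out.append((o, steps))
    return out


def expand(x, y, z, prop, value):
    """Rows (X, Y, Z, Property, Value, Effective1, Effective2, Effective3, Closeness)."""
    rows = []
    for (e1, d1), (e2, d2), (e3, d3) in product(
        descendants_with_distance(x),
        descendants_with_distance(y),
        descendants_with_distance(z),
    ):
        # Assumed weights: 100 for the first component, 10 for the second, 1 for the third.
        rows.append((x, y, z, prop, value, e1, e2, e3, d1 * 100 + d2 * 10 + d3))
    return rows


rows = expand("O1", "O2", "O3", "P1", "abc")
print(len(rows))
```

For the sample tree, the triple (O1, O2, O3) expands to 20 rows, which matches the "about 20 rows" estimate.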

After that, your query is very easy.

You just say:

SELECT * FROM
    (
    SELECT Effective_Object1, Effective_Object2, Effective_Object3, Property, Value,
        row_number() over( 
            partition by Effective_Object1, Effective_Object2, Effective_Object3, Property
            order by Closeness ASC  -- smallest Closeness = nearest defining triple (0 = defined explicitly)
        ) AS InheritancePriority FROM TriplesAndProperties
     ) AS x WHERE InheritancePriority = 1;

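With this option, you must make sure that closeness is indexed. You can create an index on the tuple (Effective_Object1, Effective_Object2, Effective_Object3, Property, Closeness).

In both cases you have some amount of caching: data that adds no new information as such, but that caches a certain amount of calculation or work.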

score 0 · Accepted Answer

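I am guessing your table is quite large, hence the slowness. In that case I am also guessing you have multiple properties (from two to many). If so, I would suggest moving the "where property = 'P1'" inside the CTE. That filters out a good portion of the data up front and can make your query up to number-of-properties times faster.

Something like: http://sqlfiddle.com/#!3/7c7a0/92/0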

score 0 · Accepted Answer

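Have you tried an index (or setting the primary key) with the "Value" column first, the "Property" column second, the "Object1" column third, the "Object2" column fourth, and the "Object3" column fifth? I am assuming that "Value" is more restrictive than "Property".

I am also assuming that you have the Id column set as the primary key, and a foreign key relationship between ParentId and Id.

How does this query perform?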

    with 
    -- First, get all combinations that match the property/value pair.
    validTrip as (
        select Object1, Object2, Object3
        from TriplesAndProperties 
        where value = @value
            and property = @property
    ),
    -- Recursively flatten the inheritance hierarchy of Object1, 2 and 3.
    o1 as (
        select Id, 0 as InherLevel from Objects where Id in (select Object1 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1 from Objects rec inner join o1 base on rec.ParentId = base.Id
    ),
    o2 as (
        select Id, 0 as InherLevel from Objects where Id in (select Object2 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1 from Objects rec inner join o2 base on rec.ParentId = base.Id
    ),
    o3 as (
        select Id, 0 as InherLevel from Objects where Id in (select Object3 from validTrip)
        union all
        select rec.Id, base.InherLevel + 1 from Objects rec inner join o3 base on rec.ParentId = base.Id
    )
    -- select the Id triple.
    select o1.Id as O1, o2.Id as O2, o3.Id as O3
    -- match every option in o1, with every option in o2, with every option in o3.
    from o1
        cross join o2
        cross join o3
    -- order by the inheritance level.
    order by o1.InherLevel, o2.InherLevel, o3.InherLevel;

score 0 · Accepted Answer

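Hierarchical queries, i.e. WITH RECURSIVE ... or proprietary equivalents such as CONNECT BY, are your friend in this case.

The recipe for your particular problem is: start at the leaf and ascend towards the root, aggregating and excluding what has already been found.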
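One possible reading of that recipe, sketched in Python for a single target triple: enumerate its ancestor triples from nearest to farthest and stop at the first one that defines the property. The `lookup` helper is hypothetical, and a recursive SQL query would express the same walk differently.

```python
from itertools import product

PARENT = {"O1": None, "O2": "O1", "O3": "O1", "O4": "O2", "O5": "O3"}

# Explicit definitions: (triple, property) -> value.
TP = {
    (("O1", "O2", "O3"), "P1"): "abc",
    (("O1", "O2", "O3"), "P2"): "xyz",
    (("O1", "O3", "O4"), "P1"): "123",
    (("O2", "O4", "O5"), "P1"): "098",
}


def chain(obj):
    """obj followed by its ancestors, nearest first."""
    out = []
    while obj is not None:
        out.append(obj)
        obj = PARENT[obj]
    return out


def lookup(triple, prop):
    """Walk from the triple towards the root; the first definition found wins."""
    c1, c2, c3 = (chain(o) for o in triple)
    # product() varies the last component fastest, so candidates come out in
    # closeness order: first component nearest first, then second, then third.
    for candidate in product(c1, c2, c3):
        if (candidate, prop) in TP:
            return TP[(candidate, prop)]
    return None


print(lookup(("O4", "O4", "O5"), "P1"))
print(lookup(("O5", "O4", "O5"), "P1"))
```

Because the walk stops at the first hit, anything defined farther up is excluded automatically, with no sort over all candidates.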

score 0 · Accepted Answer

You can speed this up by materializing the join in an indexed table, say joinedresult. The disadvantage is that it needs space and has to be written to disk. The advantage is that the slow part can then use an index.

insert into joinedresult
select Children1.Id as O1, Children2.Id as O2, Children3.Id as O3, tp.Property, tp.Value,
    Objects1.[depthInTheTree] as O1D, Objects2.[depthInTheTree] as O2D, Objects3.[depthInTheTree] as O3D
from ... (see above)

Make sure joinedresult has an index on [O1,O2,O3,Property,O1D,O2D,O3D], and clear it before each run. Then:

select * from
(
    select 
    O1, O2, O3, Property, Value,
    row_number() over( 
        partition by O1, O2, O3, Property
        order by O1D desc, O2D desc, O3D desc
    )
    as InheritancePriority
    from joinedresult
) as x
where InheritancePriority = 1
