mysql - MySQL 5.6 でのオプションの深さ制限を使用した再帰クエリ

Question

2 つのテーブルスキーマ (MySQL 5.6 なので CTE なし) があり、おおよそ次のようになります。

CREATE TABLE nodes (
  node_id INT PRIMARY KEY,
  name VARCHAR(10)
);

CREATE TABLE edges (
  edge_id INT PRIMARY KEY,
  source INT,
  target INT,
  FOREIGN KEY (source) REFERENCES nodes(node_id),
  FOREIGN KEY (target) REFERENCES nodes(node_id)
);

私たちの設計では、(論理的に) 2 つのノード間の論理エッジn1 -> n2は、実際n1 -> proxy node -> n2にはデータベース内で ( ) として表されます。論理エッジに 2 つのエッジとプロキシノードを使用する理由は、エッジにプロパティを格納できるようにするためです。したがって、クライアントがエッジで接続された 2 つのノードをクエリすると、クエリは代わりに接続された 3 つのノードをクエリするように変換されます。

固定長のパスを取得するクエリを作成しました。たとえば、「いくつかのプロパティを持つノードで始まり、いくつかのプロパティを持つノードで終わり、パス上に正確に 5 つのエッジがあるすべてのパスを教えてください。」これは、SQL 側で再帰を使用せずに行われます。指定された固定長で長いクエリをプログラムで生成するだけです。

課題は、可変長パスのクエリをサポートしたいということです。たとえば、「いくつかのプロパティを持つノードで始まり、いくつかのプロパティを持つノードで終わるすべてのパスを教えてください。パス上に 3 つ以上のエッジと 10 個以下のエッジがあります。」これは CTE なしで (または CTE を使用しても) 実行可能ですか?

編集：

いくつかのサンプルデータ:

-- Logical nodes are 1, 3, 5, 7, 9, 11. The rest are proxy nodes.
INSERT INTO nodes VALUES
  (1, 'foo'),
  (2, '_proxy_'),
  (3, 'foo'),
  (4, '_proxy_'),
  (5, 'bar'),
  (6, '_proxy_'),
  (7, 'bar'),
  (8, '_proxy_'),
  (9, 'bar'),
  (10, '_proxy_'),
  (11, 'bar');

-- Connects 1 -> 2 -> ... -> 11.
INSERT INTO edges VALUES
  (1, 1, 2),
  (2, 2, 3),
  (3, 3, 4),
  (4, 4, 5),
  (5, 5, 6),
  (6, 6, 7),
  (7, 7, 8),
  (8, 8, 9),
  (9, 9, 10),
  (10, 10, 11);

クエリは、「パスが 'foo' という名前のノードで始まり、'bar' という名前のノードで終わるように、パス上のすべてのノードの ID と名前を選択します。少なくとも 2 つのノードと最大 4 つのノードがあります。道に。」このようなパスには、、、、、1 -> 3 -> 5およびが含ま1 -> 3 -> 5 -> 7れます。したがって、結果セットにはノード 1、3、5、7、9 の ID と名前が含まれているはずです。3 -> 53 -> 5 -> 73 -> 5 -> 7 -> 9

score 2 · Accepted Answer

私はこの種の問題を推移閉包表で解決しました。これにより、ノードを通るすべての直接および間接パスが列挙されます。現在使用しているエッジは長さ 1 のパスです。ただし、長さ 0 のパスも必要です (つまり、ノードにはそれ自体へのパスがあります)。次に、1 つのソースノードから最終的なターゲットノードまでのすべてのパスで、長さが 1 より大きいパスが必要です。 1より。

create table closure (
  source int,
  target int,
  length int,
  is_direct bool,
  primary key (source, target)
);

insert into closure values
  (1, 1, 0, false), (1, 2, 1, true), (1, 3, 2, false), (1, 4, 3, false), (1, 5, 4, false), (1, 6, 5, false), (1, 7, 6, false), (1, 8, 7, false), (1, 9, 8, false), (1, 10, 9, false), (1, 11, 10, false),
  (2, 2, 0, false), (2, 3, 1, true), (2, 4, 2, false), (2, 5, 3, false), (2, 6, 4, false), (2, 7, 5, false), (2, 8, 6, false), (2, 9, 7, false), (2, 10, 8, false), (2, 11, 9, false),
  (3, 3, 0, false), (3, 4, 1, true), (3, 5, 2, false), (3, 6, 3, false), (3, 7, 4, false), (3, 8, 5, false), (3, 9, 6, false), (3, 10, 7, false), (3, 11, 8, false),
  (4, 4, 0, false), (4, 5, 1, true), (4, 6, 2, false), (4, 7, 3, false), (4, 8, 4, false), (4, 9, 5, false), (4, 10, 6, false), (4, 11, 7, false),
  (5, 5, 0, false), (5, 6, 1, true), (5, 7, 2, false), (5, 8, 3, false), (5, 9, 4, false), (5, 10, 5, false), (5, 11, 6, false),
  (6, 6, 0, false), (6, 7, 1, true), (6, 8, 2, false), (6, 9, 3, false), (6, 10, 4, false), (6, 11, 5, false),
  (7, 7, 0, false), (7, 8, 1, true), (7, 9, 2, false), (7, 10, 3, false), (7, 11, 4, false),
  (8, 8, 0, false), (8, 9, 1, true), (8, 10, 2, false), (8, 11, 3, false),
  (9, 9, 0, false), (9, 10, 1, true), (9, 11, 2, true),
  (10, 10, 0, false), (10, 11, 1, true),
  (11, 11, 0, false);

これで、クエリを記述できます。

パスが「foo」という名前のノードで始まり、「bar」という名前のノードで終わるように、パス上のすべてのノードの ID と名前を選択します。パスには少なくとも 2 つのノードと最大 4 つのノードがあります。

それぞれの間にプロキシノードがあるため、これを長さ 4、6、8 のパスに変換します。ノード間を移動するには、実際には 2 つのホップが必要です。

select source.node_id as source_node, target.node_id as target_node, c.length
from nodes as source
join closure as c on source.node_id = c.source
join nodes as target on c.target = target.node_id
where source.name='foo' and target.name = 'bar' and c.length in (4,6,8)

結果は次のとおりです。実際には、ノード 11 も含まれています。

+-------------+-------------+--------+
| source_node | target_node | length |
+-------------+-------------+--------+
|           1 |           5 |      4 |
|           1 |           7 |      6 |
|           1 |           9 |      8 |
|           3 |           7 |      4 |
|           3 |           9 |      6 |
|           3 |          11 |      8 |
+-------------+-------------+--------+

Paul Spiegel からの再コメント:

パスのエンドポイントを取得したら、ソースで開始し、ターゲットへのパスもあるノードで終了するすべてのパスのクロージャをクエリできます。

select source.node_id as source_node, target.node_id as target_node,
  group_concat(i1.target order by i1.target) as interim_nodes
from nodes as source
join closure as c on source.node_id = c.source
join nodes as target on c.target = target.node_id
join closure as i1 on source.node_id = i1.source
join closure as i2 on target.node_id = i2.target and i1.target = i2.source
where source.name='foo' and target.name = 'bar' and c.length in (4,6,8)
group by source.node_id, target.node_id

+-------------+-------------+---------------------+
| source_node | target_node | interim_nodes       |
+-------------+-------------+---------------------+
|           1 |           5 | 1,2,3,4,5           |
|           1 |           7 | 1,2,3,4,5,6,7       |
|           1 |           9 | 1,2,3,4,5,6,7,8,9   |
|           3 |           7 | 3,4,5,6,7           |
|           3 |           9 | 3,4,5,6,7,8,9       |
|           3 |          11 | 3,4,5,6,7,8,9,10,11 |
+-------------+-------------+---------------------+

score 2 · Accepted Answer

次のクエリは、関心のあるすべてのパスをカンマ区切りの文字列で返します。

with recursive rcte as (
  select e.source, e.target, 1 as depth, concat(e.source) as path
  from nodes n
  join edges e on e.source = n.node_id
  where n.name = 'foo' -- start node name
  union all
  select e.source, e.target, r.depth + 1 as depth, concat_ws(',', r.path, e.source)
  from rcte r
  join edges p on p.source = r.target -- p for proxy
  join edges e on e.source = p.target
  where r.depth < 4 -- max path nodes
) 
select r.path
from rcte r
join nodes n on n.node_id = r.source
where r.depth >= 2 -- min path nodes
  and n.name = 'bar' -- end node name

結果は次のようになります。

| path    |
| ------- |
| 3,5     |
| 1,3,5   |
| 3,5,7   |
| 1,3,5,7 |
| 3,5,7,9 |

DB Fiddleで見る

アプリケーションコード内の文字列を解析し、配列をマージ/ユニオンできるようになりました。含まれているノード ID のみが必要な場合は、外側のクエリを次のように変更することもできます。

select distinct r2.source
from rcte r
join nodes n on n.node_id = r.source
join rcte r2 on find_in_set(r2.source, r.path)
where r.depth >= 2 -- min path nodes
  and n.name = 'bar' -- end node name

結果：

| source |
| ------ |
| 1      |
| 3      |
| 5      |
| 7      |
| 9      |

DB Fiddleで見る

行数が多すぎると、JOINFIND_IN_SET()が遅くなる可能性があることに注意してください。rcteこの手順は、手続き型言語では非常に単純なアプリケーションコードで行うことをお勧めします。

MySQL 5.6 ソリューション

MySQL 8.0 および MariaDB 10.2 より前は、再帰の方法がありませんでした。さらに、他にも多くの制限があり、回避策を困難にしています。例えば：

ストアド関数に動的クエリはありません
1 つのステートメントで一時テーブルを 2 回使用する方法はありません
memmoryエンジンに TEXT タイプがありません

ただし、RCTE は、2 つの (一時) テーブル間で行を移動するストアドプロシージャでエミュレートできます。次の手順でそれを行います。

delimiter //
create procedure get_path(
  in source_name text,
  in target_name text,
  in min_depth int,
  in max_depth int
)
begin
  create temporary table tmp_sources (id int, depth int, path text) engine=innodb;
  create temporary table tmp_targets like tmp_sources;

  insert into tmp_sources (id, depth, path)
    select n.node_id, 1, n.node_id
    from nodes n
    where n.name = source_name;

  set @depth = 1;
  while @depth < max_depth do
    set @depth = @depth+1;
    insert into tmp_targets(id, depth, path)
      select e.target, @depth, concat_ws(',', t.path, e.target)
      from tmp_sources t
      join edges p on p.source = t.id
      join edges e on e.source = p.target
      where t.depth = @depth - 1;

    insert into tmp_sources (id, depth, path)
      select id, depth, path
      from tmp_targets;

    truncate tmp_targets;
  end while;

  select t.path
    from tmp_sources t
    join nodes n on n.node_id = t.id
    where n.name = target_name
      and t.depth >= min_depth;
end //
delimiter ;

次のように使用します。

call get_path('foo', 'bar', 2, 4)

結果：

| path    |
| ------- |
| 3,5     |
| 1,3,5   |
| 3,5,7   |
| 1,3,5,7 |
| 3,5,7,9 |

DB Fiddleで見る

これは最適とは言えません。結果に多数または長いパスが含まれる場合は、一時テーブルにいくつかのインデックスを定義する必要がある場合があります。また、ストロードプロシージャで (一時的な) テーブルを作成するという考えも好きではありません。「概念実証」として参照してください。自己責任で使用してください。

mysql - MySQL 5.6 でのオプションの深さ制限を使用した再帰クエリ

2 に答える 2

MySQL 5.6 ソリューション

Related

Reference