


article_id  |  tag_name
1              C++
1              java
1              python
2              ruby
2              js
3              ruby
4              java
4              python


tag1     |   tag2    |  degree


  • Write raw SQL that populates the table "tag_relations" based on the contents of the first table.


tag1     |   tag2    |  degree
java         C++        2
java         python     4
ruby         js         2



article_id  |  tag_name
1              C++
1              java

So degree = 2


article_id  |  tag_name
1              java
1              python
4              java
4              python

So degree = 4

There is no degree record for "python" and "C++" (that is, this question does not consider cases where degree < 2). (Editor's note: this comment contradicts the inclusion of C++ and java in the output.)




2 Answers




insert into tag_relations(tag1, tag2, degree)
    -- article_id is qualified with a1 because a bare column name is ambiguous in a self-join
    select a1.tag_name, a2.tag_name, max(a1.article_id) as maxid
    from articles a1 join
         articles a2
         on a1.article_id = a2.article_id and
            a1.tag_name < a2.tag_name
    group by a1.tag_name, a2.tag_name;


answered 2012-08-29T01:07:59.233

Since the data in the question is not coherent (or not coherently explained), this answer takes minor liberties with some assumptions that yield plausible-looking answers.

Assume that the goal is to list how many times each pair of tags is applied to the same article ID. Given the data:

article_id  |  tag_name
1              c++
1              java
1              python
2              ruby
2              js
3              ruby
4              java
4              python

The expected output might be:

tag1     | tag2    | degree
c++        java      1      -- from 1
java       python    2      -- from 1 and 4
c++        python    1      -- from 1 (missing from question's expected results)
js         ruby      1      -- from 2

The tags are ordered so that tag1 sorts before tag2; this avoids (java, java) appearing, and also prevents (c++, java) appearing along with (java, c++).
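
To see what the ordering condition buys, here is a minimal sketch that counts the joined rows under three different join conditions. It assumes Python's built-in sqlite3 module as a stand-in database (the question does not name one), and the helper name count_rows is made up for the illustration. String comparison follows SQLite's default binary collation; other databases or collations may order the tags differently.

```python
import sqlite3

# In-memory SQLite database holding the sample data from this answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (article_id INTEGER, tag_name TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "c++"), (1, "java"), (1, "python"), (2, "ruby"),
     (2, "js"), (3, "ruby"), (4, "java"), (4, "python")],
)


def count_rows(extra_condition):
    """Count self-join rows for a given extra join condition (hypothetical helper)."""
    sql = (
        "SELECT COUNT(*) FROM articles AS a "
        "JOIN articles AS b ON a.article_id = b.article_id" + extra_condition
    )
    return conn.execute(sql).fetchone()[0]


all_pairs = count_rows("")                                 # includes (java, java)
no_self_pairs = count_rows(" AND a.tag_name <> b.tag_name")  # still has mirrored pairs
ordered_pairs = count_rows(" AND a.tag_name < b.tag_name")   # each pair exactly once

print(all_pairs, no_self_pairs, ordered_pairs)  # 18 10 5
```

With only the equality condition, every tag also pairs with itself; with <>, each pair shows up twice (once in each direction); with <, each pair appears once per article.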

Given all this interpretation of the question, we need to develop a query to select the data, and then append that to INSERT INTO tag_relations.

SELECT a.article_id, a.tag_name AS tag1, b.tag_name AS tag2
  FROM articles AS a
  JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name;

The key concept here is the 'self-join'. The table articles is used twice in the query (and is given two different aliases to clarify which is which), and the table is joined with itself. The details of the join here are quite a common pattern: equality on some columns, but an inequality on one or more others. If there are several columns to order by, the ordering condition gets tricky. Sometimes, that will be a <= instead of <.

This gives the output:

1    c++      java
1    c++      python
1    java     python
2    js       ruby
4    java     python
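
The listing above can be reproduced with a small script. This is only a sketch, assuming Python's built-in sqlite3 module as the database (the question does not say which one is in use); the ORDER BY is added here just to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database holding the sample data from this answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (article_id INTEGER, tag_name TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "c++"), (1, "java"), (1, "python"), (2, "ruby"),
     (2, "js"), (3, "ruby"), (4, "java"), (4, "python")],
)

# The self-join from the answer, with an ORDER BY added for stable output.
pairs = conn.execute("""
    SELECT a.article_id, a.tag_name AS tag1, b.tag_name AS tag2
      FROM articles AS a
      JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name
     ORDER BY a.article_id, tag1, tag2
""").fetchall()

for row in pairs:
    print(row)
```

Article 3 contributes no rows because it has only one tag, so there is nothing to pair it with.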

Now we just need to summarize that with a COUNT for the degree:

SELECT a.tag_name AS tag1, b.tag_name AS tag2, COUNT(*) AS degree
  FROM articles AS a
  JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name
 GROUP BY tag1, tag2;

If you want to count the reverse relations as well as the forward relations, then you can multiply the count by 2; that would get closer to your expected data.
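
As a rough check of the doubling idea, the sketch below runs the same query with 2 * COUNT(*) against an in-memory SQLite database (an assumption made for the example, via Python's built-in sqlite3 module). It groups by the underlying columns rather than the aliases, which should be portable to databases that do not accept aliases in GROUP BY.

```python
import sqlite3

# In-memory SQLite database holding the sample data from this answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (article_id INTEGER, tag_name TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "c++"), (1, "java"), (1, "python"), (2, "ruby"),
     (2, "js"), (3, "ruby"), (4, "java"), (4, "python")],
)

# Same self-join as before, but each co-occurrence is counted twice.
doubled = conn.execute("""
    SELECT a.tag_name AS tag1, b.tag_name AS tag2, 2 * COUNT(*) AS degree
      FROM articles AS a
      JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name
     GROUP BY a.tag_name, b.tag_name
     ORDER BY tag1, tag2
""").fetchall()

for row in doubled:
    print(row)
```

The doubled values agree with the three rows in the question's expected output (2, 4 and 2), although the pair (c++, python) still appears with degree 2, which the question leaves out.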

Finally inserting the data, we get:

INSERT INTO tag_relations(tag1, tag2, degree)
    SELECT a.tag_name AS tag1, b.tag_name AS tag2, COUNT(*) AS degree
      FROM articles AS a
      JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name
     GROUP BY tag1, tag2;
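
For an end-to-end check, the statement can be run against a scratch database. The sketch below assumes Python's built-in sqlite3 module and guesses at the tag_relations schema (TEXT, TEXT, INTEGER), since the question shows only the column names.

```python
import sqlite3

# In-memory SQLite database holding the sample data from this answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (article_id INTEGER, tag_name TEXT)")
# Column types for tag_relations are an assumption; the question gives names only.
conn.execute("CREATE TABLE tag_relations (tag1 TEXT, tag2 TEXT, degree INTEGER)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [(1, "c++"), (1, "java"), (1, "python"), (2, "ruby"),
     (2, "js"), (3, "ruby"), (4, "java"), (4, "python")],
)

# The INSERT ... SELECT from the answer, unchanged.
conn.execute("""
    INSERT INTO tag_relations(tag1, tag2, degree)
        SELECT a.tag_name AS tag1, b.tag_name AS tag2, COUNT(*) AS degree
          FROM articles AS a
          JOIN articles AS b ON a.article_id = b.article_id AND a.tag_name < b.tag_name
         GROUP BY tag1, tag2
""")
conn.commit()

relations = conn.execute(
    "SELECT tag1, tag2, degree FROM tag_relations ORDER BY tag1, tag2"
).fetchall()

for row in relations:
    print(row)
```

SQLite accepts the column aliases in GROUP BY; on a database that rejects them, grouping by a.tag_name, b.tag_name should give the same result.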
answered 2012-08-29T01:16:00.983