First, here is the sample data, loaded into MySQL 5.5.12 on a Windows 7 machine:
mysql> DROP DATABASE IF EXISTS lspuk;
Query OK, 1 row affected (0.00 sec)
mysql> CREATE DATABASE lspuk;
Query OK, 1 row affected (0.00 sec)
mysql> USE lspuk
Database changed
mysql> CREATE TABLE items
-> (
-> id int not null auto_increment,
-> description VARCHAR(30),
-> tags VARCHAR(30),
-> primary key (id),
-> FULLTEXT tags_ftndx (tags)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.04 sec)
mysql> INSERT INTO items (description,tags) VALUES
-> ('the first' ,'tag1 tag3 tag4'),
-> ('the second','tag5 tag1 tag2'),
-> ('the third' ,'tag5 tag1 tag9'),
-> ('the fourth','tag5 tag6 tag2'),
-> ('the fifth' ,'tag4 tag3 tag6'),
-> ('the sixth' ,'tag2 tag3 tag6');
Query OK, 6 rows affected (0.00 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql>
Take a look at how the tags are distributed across the rows:
mysql> SELECT 'tag1',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag1%' UNION
-> SELECT 'tag2',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag2%' UNION
-> SELECT 'tag3',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag3%' UNION
-> SELECT 'tag4',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag4%' UNION
-> SELECT 'tag5',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag5%' UNION
-> SELECT 'tag6',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag6%' UNION
-> SELECT 'tag9',COUNT(1) tag_count FROM items WHERE tags LIKE '%tag9%';
+------+-----------+
| tag1 | tag_count |
+------+-----------+
| tag1 | 3 |
| tag2 | 3 |
| tag3 | 3 |
| tag4 | 2 |
| tag5 | 3 |
| tag6 | 3 |
| tag9 | 1 |
+------+-----------+
7 rows in set (0.00 sec)
mysql>
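The UNION chain above matches with LIKE, so it counts substrings: '%tag1%' would also match a tag such as tag10 if one existed. The following is a minimal sketch of a single-statement alternative that matches whole tags and also reports each tag's share of the rows. It assumes tags are separated by single spaces, as in the sample data; the aliases t and i and the column name row_share are made up for illustration. The share is worth checking because MyISAM natural-language searches treat a word that appears in 50% or more of the rows as too common to match, whereas BOOLEAN MODE does not apply that threshold.

```sql
-- Count the rows containing each tag and the fraction of the table they represent.
-- The tag list is hard-coded here, just like in the UNION query above.
SELECT t.tag,
       COUNT(*) AS tag_count,
       COUNT(*) / (SELECT COUNT(*) FROM items) AS row_share
FROM (
    SELECT 'tag1' AS tag UNION ALL
    SELECT 'tag2' UNION ALL
    SELECT 'tag3' UNION ALL
    SELECT 'tag4' UNION ALL
    SELECT 'tag5' UNION ALL
    SELECT 'tag6' UNION ALL
    SELECT 'tag9'
) AS t
JOIN items AS i
  -- FIND_IN_SET expects a comma-separated list, so turn the spaces into commas first
  ON FIND_IN_SET(t.tag, REPLACE(i.tags, ' ', ',')) > 0
GROUP BY t.tag
ORDER BY t.tag;
```

According to the counts shown above, tag3 and tag6 each appear in 3 of the 6 rows, while tag4 appears in only 2, which is worth keeping in mind when reading the natural-language scores in the searches that follow.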
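Look carefully and note the following facts:
- Each row has exactly 3 tags
- The order in which the tags are requested, along with how many rows contain each tag, seems to govern the score
If you remove tag4 from the search string and run the query, you get no score at all: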
mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6') as score FROM items ORDER BY score DESC;
+----+-------------+----------------+-------+
| id | description | tags | score |
+----+-------------+----------------+-------+
| 1 | the first | tag1 tag3 tag4 | 0 |
| 2 | the second | tag5 tag1 tag2 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 |
| 4 | the fourth | tag5 tag6 tag2 | 0 |
| 5 | the fifth | tag4 tag3 tag6 | 0 |
| 6 | the sixth | tag2 tag3 tag6 | 0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec)
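The evaluation method appears to be based on the average number of token fields, and the presence or absence of specific values in a specific order affects the scoring. Note how the scores differ as you apply different styles of scoring and tag specification: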
mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6 tag4') as score FROM items ORDER BY score DESC;
+----+-------------+----------------+--------------------+
| id | description | tags | score |
+----+-------------+----------------+--------------------+
| 1 | the first | tag1 tag3 tag4 | 0.6700310707092285 |
| 5 | the fifth | tag4 tag3 tag6 | 0.6700310707092285 |
| 2 | the second | tag5 tag1 tag2 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 |
| 4 | the fourth | tag5 tag6 tag2 | 0 |
| 6 | the sixth | tag2 tag3 tag6 | 0 |
+----+-------------+----------------+--------------------+
6 rows in set (0.00 sec)
mysql> SELECT *,MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) as score FROM items ORDER BY score DESC;
+----+-------------+----------------+-------+
| id | description | tags | score |
+----+-------------+----------------+-------+
| 5 | the fifth | tag4 tag3 tag6 | 3 |
| 1 | the first | tag1 tag3 tag4 | 2 |
| 6 | the sixth | tag2 tag3 tag6 | 2 |
| 4 | the fourth | tag5 tag6 tag2 | 1 |
| 2 | the second | tag5 tag1 tag2 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec)
mysql> SELECT *,MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score FROM items ORDER BY score DESC;
+----+-------------+----------------+-------+
| id | description | tags | score |
+----+-------------+----------------+-------+
| 5 | the fifth | tag4 tag3 tag6 | 1 |
| 1 | the first | tag1 tag3 tag4 | 0 |
| 2 | the second | tag5 tag1 tag2 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 |
| 4 | the fourth | tag5 tag6 tag2 | 0 |
| 6 | the sixth | tag2 tag3 tag6 | 0 |
+----+-------------+----------------+-------+
6 rows in set (0.00 sec)
mysql>
The solution appears to be to evaluate the BOOLEAN MODE score first and then the non-BOOLEAN MODE score, as follows:
SELECT *,
MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1,
MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score2
FROM items ORDER BY score2 DESC, score1 DESC;
Here is the result against the sample data:
mysql> SELECT *,
-> MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1,
-> MATCH(tags) AGAINST ('+tag3 +tag6 +tag4' IN BOOLEAN MODE) as score2
-> FROM items ORDER BY score2 DESC, score1 DESC;
+----+-------------+----------------+--------------------+--------+
| id | description | tags | score1 | score2 |
+----+-------------+----------------+--------------------+--------+
| 5 | the fifth | tag4 tag3 tag6 | 0.6700310707092285 | 1 |
| 1 | the first | tag1 tag3 tag4 | 0.6700310707092285 | 0 |
| 2 | the second | tag5 tag1 tag2 | 0 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 | 0 |
| 4 | the fourth | tag5 tag6 tag2 | 0 | 0 |
| 6 | the sixth | tag2 tag3 tag6 | 0 | 0 |
+----+-------------+----------------+--------------------+--------+
6 rows in set (0.00 sec)
mysql>
Or you could try leaving out the plus signs:
mysql> SELECT *,
-> MATCH(tags) AGAINST ('tag3 tag6 tag4') as score1,
-> MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) as score2
-> FROM items ORDER BY score2 DESC, score1 DESC;
+----+-------------+----------------+--------------------+--------+
| id | description | tags | score1 | score2 |
+----+-------------+----------------+--------------------+--------+
| 5 | the fifth | tag4 tag3 tag6 | 0.6700310707092285 | 3 |
| 1 | the first | tag1 tag3 tag4 | 0.6700310707092285 | 2 |
| 6 | the sixth | tag2 tag3 tag6 | 0 | 2 |
| 4 | the fourth | tag5 tag6 tag2 | 0 | 1 |
| 2 | the second | tag5 tag1 tag2 | 0 | 0 |
| 3 | the third | tag5 tag1 tag9 | 0 | 0 |
+----+-------------+----------------+--------------------+--------+
6 rows in set (0.00 sec)
mysql>
Either way, you must incorporate both BOOLEAN MODE and non-BOOLEAN MODE at the same time.
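If the rows that match none of the tags are not wanted in the result at all, the same BOOLEAN MODE expression can also be used as a filter. This is only a sketch of one possible variation on the queries above, not something tested as part of this answer; the score1 and score2 aliases are the same ones used earlier.

```sql
-- Keep only rows that match at least one of the requested tags,
-- then rank by the BOOLEAN MODE score first and the natural-language score second.
SELECT *,
    MATCH(tags) AGAINST ('tag3 tag6 tag4') AS score1,
    MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE) AS score2
FROM items
WHERE MATCH(tags) AGAINST ('tag3 tag6 tag4' IN BOOLEAN MODE)
ORDER BY score2 DESC, score1 DESC;
```

Judging by the score2 values shown earlier, this should drop the rows whose BOOLEAN MODE score is 0 and leave the ordering of the remaining rows unchanged.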