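I have a football pool website. Each week, my friends pick the winner of each game. I would like to compare each player's picks with the other players' picks and list how similar they are. I found this page, which helped me calculate the similarity for a specific week: Compare group of tags to find similarity/score with PHP/MySQL. Kudos to Ivar Bonsaksen, his solution worked well!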

What I want to do now is display each player's cumulative similarity over the past weeks.

There are three tables to query: Profiles (spprofiles), Games (sp6games), and Picks (sp6picks). I also use another table, Teams (sp6teams), to get the team names, but it isn't relevant here.

Profiles (spprofiles)
+-----------+-------------+
| profileID | profilename |
+-----------+-------------+
| 52        | My Team A   |
| 53        | Some Team B |
+-----------+-------------+

Games (sp6games)
+--------+--------+---------+------+
| gameID | weekID | visitor | home |
+--------+--------+---------+------+
| 1      | 2      | 9       | 21   |
| 2      | 2      | 14      | 6    |
| 17     | 3      | 6       | 9    |
| 18     | 3      | 30      | 21   |
+--------+--------+---------+------+

Picks (sp6picks)
+-----------+--------+------+
| profileID | gameID | pick |
+-----------+--------+------+
| 52        | 1      | 21   |
| 52        | 2      | 6    |
| 52        | 17     | 12   |
| 52        | 18     | 21   |
| 53        | 1      | 9    |
| 53        | 2      | 6    |
| 53        | 17     | 9    |
| 53        | 18     | 21   |
+-----------+--------+------+

My query for the current week looks like this:

$weekID = 3; //the current weekID
$profile = 52; //the current ProfileID

SELECT
  targetProfiles.profileID AS targetID,
  sourceProfiles.profileID AS sourceID,
    COUNT(targetProfiles.profileID)
    /
    (((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID = $weekID)
      +
    (SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID = $weekID))/2)
  AS similarity
FROM
  spProfiles AS sourceProfiles
  LEFT JOIN
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS sourcePicks
    ON (sourcePicks.profileID = sourceProfiles.profileID)
  INNER JOIN
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID = $weekID) AS targetPicks
    ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID)
  LEFT JOIN
    spProfiles AS targetProfiles
    ON (targetPicks.profileID = targetProfiles.profileID)
WHERE sourceProfiles.profileID = $profile
GROUP BY targetID

Running this query for each week gives the following results:

$weekID = 2;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.5000     |
+----------+----------+------------+

$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.5000     |
+----------+----------+------------+

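The cumulative query I have come up with so far looks like this (though I have tried several other variations). Essentially, I changed the WHERE clauses to include earlier weeks (weekID <= $weekID) and added the Games table to the main FROM clause: LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID)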

$weekID = 3; //the current weekID
$profile = 52; //the current ProfileID

SELECT
  targetProfiles.profileID AS targetID,
  sourceProfiles.profileID AS sourceID,
    COUNT(targetProfiles.profileID)
    /
    (((SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = sourceProfiles.profileID AND weekID <= $weekID)
      +
    (SELECT COUNT(*) FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE profileID = targetProfiles.profileID AND weekID <= $weekID))/2)
  AS similarity
FROM
  spProfiles AS sourceProfiles
  LEFT JOIN
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS sourcePicks
    ON (sourcePicks.profileID = sourceProfiles.profileID)
  INNER JOIN
    (SELECT sp6Picks.* FROM sp6Picks LEFT JOIN sp6Games USING (gameID) WHERE weekID <= $weekID) AS targetPicks
    ON (sourcePicks.pick = targetPicks.pick AND sourcePicks.profileID != targetPicks.profileID)
  LEFT JOIN
    spProfiles AS targetProfiles
    ON (targetPicks.profileID = targetProfiles.profileID)
  LEFT JOIN sp6games ON (targetPicks.gameID = sp6games.gameID)
WHERE sourceProfiles.profileID = $profile
GROUP BY targetID, weekID

The combined result should be 0.5000, but instead I get:

$weekID = 3;
+----------+----------+------------+
| targetID | sourceID | similarity |
+----------+----------+------------+
| 53       | 52       | 0.7500     |
+----------+----------+------------+

The problem is that COUNT(targetProfiles.profileID) does not total correctly across weeks, which throws off the similarity value. It also doesn't seem very efficient for a large data set.
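
One plausible source of the extra match, sketched under assumptions: the join between sourcePicks and targetPicks compares only pick, not gameID, so the same team picked in two different games can also count as a match once more than one week is in scope. The snippet below loads the sample Picks rows into an in-memory SQLite database through Python's sqlite3 module (SQLite stands in for MySQL here, and the aliases s and t are illustrative) and compares the two join conditions. The Games table is left out because every sample game already falls within weekID <= 3.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sp6picks (profileID INTEGER, gameID INTEGER, pick INTEGER)")
con.executemany(
    "INSERT INTO sp6picks VALUES (?, ?, ?)",
    [(52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
     (53, 1, 9), (53, 2, 6), (53, 17, 9), (53, 18, 21)],
)

# Join condition as in the cumulative query: same pick, any game.
pick_only = con.execute(
    """SELECT s.gameID, t.gameID, s.pick
       FROM sp6picks s
       JOIN sp6picks t
         ON s.pick = t.pick AND s.profileID != t.profileID
       WHERE s.profileID = 52
       ORDER BY s.gameID, t.gameID"""
).fetchall()

# Same join, but the two picks must also belong to the same game.
same_game = con.execute(
    """SELECT s.gameID, t.gameID, s.pick
       FROM sp6picks s
       JOIN sp6picks t
         ON s.pick = t.pick
        AND s.gameID = t.gameID
        AND s.profileID != t.profileID
       WHERE s.profileID = 52
       ORDER BY s.gameID"""
).fetchall()

print(len(pick_only), pick_only)  # includes a cross-game pair (game 1 vs game 18)
print(len(same_game), same_game)
```

With the sample data, the pick-only join yields three pairs, one of which matches profile 52's pick in game 1 against profile 53's pick in game 18. Three matches over four games would give 0.75, while requiring the same gameID leaves two matches, or 0.5.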

Thanks for taking the time to read this.

1 Answer

SELECT   t.profileID                 AS target,
         SUM(s.pick=t.pick)/COUNT(*) AS similarity
FROM     sp6picks s
    JOIN sp6picks t USING (gameID)
    JOIN sp6games g USING (gameID)
WHERE    g.weekID    <= 3
     AND s.profileID != t.profileID
     AND s.profileID  = 52
GROUP BY t.profileID

See it on sqlfiddle.
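
For anyone who wants to check the query locally, here is a minimal sketch that runs it against the sample rows from the question, using Python's sqlite3 module with an in-memory database. Several things here are assumptions rather than part of the answer: SQLite stands in for MySQL, the helper name cumulative_similarity is hypothetical, the week and profile are passed as bound parameters instead of literals, and the ratio is multiplied by 1.0 because SQLite would otherwise perform integer division (MySQL's / operator already returns a decimal).

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the question.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sp6games (gameID INTEGER, weekID INTEGER, visitor INTEGER, home INTEGER);
    CREATE TABLE sp6picks (profileID INTEGER, gameID INTEGER, pick INTEGER);
    INSERT INTO sp6games VALUES (1, 2, 9, 21), (2, 2, 14, 6), (17, 3, 6, 9), (18, 3, 30, 21);
    INSERT INTO sp6picks VALUES
        (52, 1, 21), (52, 2, 6), (52, 17, 12), (52, 18, 21),
        (53, 1, 9),  (53, 2, 6), (53, 17, 9),  (53, 18, 21);
""")


def cumulative_similarity(con, profile_id, week_id):
    """Hypothetical helper: share of games up to week_id where picks agree."""
    return con.execute(
        """SELECT   t.profileID AS target,
                    1.0 * SUM(s.pick = t.pick) / COUNT(*) AS similarity
           FROM     sp6picks s
               JOIN sp6picks t USING (gameID)
               JOIN sp6games g USING (gameID)
           WHERE    g.weekID    <= ?
                AND s.profileID != t.profileID
                AND s.profileID  = ?
           GROUP BY t.profileID""",
        (week_id, profile_id),
    ).fetchall()


print(cumulative_similarity(con, 52, 2))  # weeks up to 2
print(cumulative_similarity(con, 52, 3))  # weeks up to 3
```

Because the two copies of sp6picks are joined on gameID, each game contributes exactly one row per opponent, so COUNT(*) is the number of games both players picked and SUM(s.pick=t.pick) is the number of those games where they agree. On the sample data that is 2 of 4 games through week 3, which is the 0.5 the question expects.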

answered 2012-10-21T05:47:56.743