I have a website that is mirrored across two subdomains, so each mirror has its own analytics data set. I have the following tables:

|------------------------------|
| table_a                      |
|------------------------------|
| url             | mod_date   |
|------------------------------|
| /foo/index.html | 2009-10-24 |
| /bar/index.php  | 2010-01-04 |
| /foo/bar.html   | 2009-01-04 |
|------------------------------|

|-----------------------------------------|
| table_b                                 |
|-----------------------------------------|
| url             | views | access_date   |
|-----------------------------------------|
| /foo/index.html | 35000 | 2009-12-01    |
| /foo/index.html | 20000 | 2010-02-01    |
| /bar/index.php  | 35000 | 2010-01-01    |
| /bar/index.php  | 15000 | 2011-01-01    |
|-----------------------------------------|

|-----------------------------------------|
| table_c                                 |
|-----------------------------------------|
| url             | views | access_date   |
|-----------------------------------------|
| /foo/index.html | 35000 | 2009-10-01    |
| /foo/bar.html   | 10000 | 2011-05-01    |
| /bar/index.php  | 35000 | 2011-08-01    |
| /bar/index.php  | 15000 | 2012-04-01    |
|-----------------------------------------|

I have the following query:

SELECT 
    a.url
    ,DATE_FORMAT(a.mod_date, '%d/%m/%Y') AS 'mod_date'
    ,DATE_FORMAT(MIN(b.access_date), '%d/%m/%Y') AS 'first_date'
    ,DATE_FORMAT(MAX(b.access_date), '%d/%m/%Y') AS 'last_date'
    ,SUM(ifnull(b.views,0)) + SUM(ifnull(c.views,0)) AS 'page_views'
    ,DATEDIFF(MAX(b.access_date),MIN(b.access_date)) AS 'days'
    ,ROUND(SUM(b.views) / (DATEDIFF(MAX(b.access_date),MIN(b.access_date)) / 30.44)) AS 'b_mean_monthly_hits'
    ,ROUND(SUM(c.views) / (DATEDIFF(MAX(c.access_date),MIN(c.access_date)) / 30.44)) AS 'c_mean_monthly_hits'
FROM
    table_a a
        LEFT JOIN
    table_b b ON b.url = a.url
        LEFT JOIN
    table_c c ON c.url = a.url
GROUP BY a.url
HAVING ROUND(SUM(b.views) / (DATEDIFF(MAX(b.access_date),MIN(b.access_date)) / 30.44)) < 5
AND ROUND(SUM(c.views) / (DATEDIFF(MAX(c.access_date),MIN(c.access_date)) / 30.44)) < 5
;

The result I am looking for is:

|------------------------------------------------------------------------------------------|
| results                                                                                  |
|------------------------------------------------------------------------------------------|
| url             | mod_date   | first_date | last_date  | page_views   | avg_monthly_hits |
|------------------------------------------------------------------------------------------|
| /foo/index.html | 2009-10-24 | 2009-10-01 | 2010-02-01 | 90000        | 22273            |
| /bar/index.php  | 2010-01-04 | 2010-01-01 | 2012-04-01 | 85000        | 3275             |
| /foo/bar.html   | 2009-01-04 | 2011-05-01 | 2011-06-01 | 10000        | 9819             |
|------------------------------------------------------------------------------------------|

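Here, 'avg_monthly_hits' is the sum of b.views and c.views (shown as 'page_views') divided by the number of months between the earliest and the latest access_date found in either table_b or table_c. I don't know how to get the number of months directly, so I take the number of days between those two dates and divide it by 30.44 (the average number of days in a month).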
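The arithmetic described above can be checked by hand for one row. The following is a minimal Python sketch, not part of the original query; the function name `avg_monthly_hits` and the constant `DAYS_PER_MONTH` are illustrative, and returning `None` for a zero-day span is an assumption about how that case should be handled. It uses the /foo/index.html rows from the sample tables.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # average month length used in the question


def avg_monthly_hits(page_views, first_date, last_date):
    """Mean views per month over the span between two dates."""
    # Same quantity that DATEDIFF(last_date, first_date) returns in MySQL.
    days = (last_date - first_date).days
    if days == 0:
        # Assumption: a single-day span has no meaningful monthly rate.
        return None
    return round(page_views / (days / DAYS_PER_MONTH))


# /foo/index.html: 35000 + 20000 (table_b) + 35000 (table_c) = 90000 views,
# earliest access 2009-10-01 (table_c), latest access 2010-02-01 (table_b).
hits = avg_monthly_hits(90000, date(2009, 10, 1), date(2010, 2, 1))
print(hits)  # 22273
```

The span is 123 days, or about 4.04 average months, which gives the 22273 shown for /foo/index.html in the desired result.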

I hope I have explained myself fully. :)

3 Answers

Try this query. It would be good to have some data to test it with.

select
  a.*,
  b.MinDate as `FirstDate`,
  b.MaxDate as `LastDate`,
  (ifnull(b.PSum,0) + ifnull(c.QSum,0)) as `TotalViews`,
  datediff(b.MaxDate,b.MinDate) as `Diff`,
  -- views / (days / 30.44) = mean views per month, computed per log table
  (b.PSum / (datediff(b.MaxDate,b.MinDate) / 30.44)) as `BMonthlyHits`,
  (c.QSum / (datediff(c.MaxDate,c.MinDate) / 30.44)) as `CMonthlyHits`
from table_a as a
-- each log table is aggregated to one row per url before joining
left join (select url, min(access_date) as MinDate, max(access_date) as MaxDate, sum(views) as PSum from table_b group by url) as b on a.url = b.url
left join (select url, min(access_date) as MinDate, max(access_date) as MaxDate, sum(views) as QSum from table_c group by url) as c on a.url = c.url
-- the derived tables already return one row per url, so no outer GROUP BY is needed;
-- MySQL allows HAVING to filter on the select aliases
HAVING BMonthlyHits < 5 and CMonthlyHits < 5
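Aggregating each log table in a derived table before joining matters because joining two one-to-many tables directly repeats rows. The following Python sketch illustrates this with the /foo/index.html sample rows. It runs against an in-memory SQLite database rather than MySQL, so it is an illustration of the join behaviour only; the column alias `total` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (url TEXT, mod_date TEXT);
CREATE TABLE table_b (url TEXT, views INTEGER, access_date TEXT);
CREATE TABLE table_c (url TEXT, views INTEGER, access_date TEXT);
INSERT INTO table_a VALUES ('/foo/index.html', '2009-10-24');
INSERT INTO table_b VALUES ('/foo/index.html', 35000, '2009-12-01');
INSERT INTO table_b VALUES ('/foo/index.html', 20000, '2010-02-01');
INSERT INTO table_c VALUES ('/foo/index.html', 35000, '2009-10-01');
""")

# Join first, aggregate second: the single table_c row is repeated once per
# matching table_b row, so its views are summed twice.
joined_first = conn.execute("""
    SELECT SUM(IFNULL(b.views, 0)) + SUM(IFNULL(c.views, 0))
    FROM table_a a
    LEFT JOIN table_b b ON b.url = a.url
    LEFT JOIN table_c c ON c.url = a.url
    GROUP BY a.url
""").fetchone()[0]

# Aggregate first, join second: each derived table has one row per url,
# so the joins cannot multiply rows.
aggregated_first = conn.execute("""
    SELECT IFNULL(b.total, 0) + IFNULL(c.total, 0)
    FROM table_a a
    LEFT JOIN (SELECT url, SUM(views) AS total FROM table_b GROUP BY url) b
           ON b.url = a.url
    LEFT JOIN (SELECT url, SUM(views) AS total FROM table_c GROUP BY url) c
           ON c.url = a.url
""").fetchone()[0]

print(joined_first)      # 125000
print(aggregated_first)  # 90000
```

Only the second figure matches the 90000 page views expected for /foo/index.html.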
Answered 2012-07-04T13:39:38.683

If table_b and table_c have the same structure, just union them:

SELECT
 a.url,
 DATE_FORMAT(a.mod_date, '%d/%m/%Y') AS 'mod_date',
 DATE_FORMAT(MIN(u.access_date), '%d/%m/%Y') AS 'first_date',
 DATE_FORMAT(MAX(u.access_date), '%d/%m/%Y') AS 'last_date',
 SUM(u.views) AS 'page_views',
 DATEDIFF(MAX(u.access_date), MIN(u.access_date)) AS 'days',
 ROUND(SUM(u.views) / (DATEDIFF(MAX(u.access_date),MIN(u.access_date)) / 30.44)) AS 'avg_monthly_hits'
FROM table_a AS a 
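LEFT JOIN (
   -- UNION ALL keeps every row; plain UNION would drop rows that are identical in both tables
   SELECT url, views, access_date FROM table_b
   UNION ALL
   SELECT url, views, access_date FROM table_c
) AS u USING (url)
-- mod_date is grouped as well so the query also runs under ONLY_FULL_GROUP_BY
GROUP BY a.url, a.mod_date
HAVING avg_monthly_hits < 5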
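When stacking the two log tables, the choice between UNION and UNION ALL affects the totals: UNION removes duplicate rows, while UNION ALL keeps all of them. The sketch below shows the difference in Python against an in-memory SQLite database (not MySQL). The two identical rows are hypothetical and do not appear in the question's sample data; they stand for both mirrors logging the same url, view count and date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_b (url TEXT, views INTEGER, access_date TEXT);
CREATE TABLE table_c (url TEXT, views INTEGER, access_date TEXT);
-- Hypothetical data: both mirrors report exactly the same row.
INSERT INTO table_b VALUES ('/foo/index.html', 35000, '2009-12-01');
INSERT INTO table_c VALUES ('/foo/index.html', 35000, '2009-12-01');
""")

# UNION treats the two identical rows as duplicates and keeps only one.
union_total = conn.execute("""
    SELECT SUM(views) FROM (
        SELECT url, views, access_date FROM table_b
        UNION
        SELECT url, views, access_date FROM table_c
    ) AS u
""").fetchone()[0]

# UNION ALL keeps both rows, so the views of both mirrors are counted.
union_all_total = conn.execute("""
    SELECT SUM(views) FROM (
        SELECT url, views, access_date FROM table_b
        UNION ALL
        SELECT url, views, access_date FROM table_c
    ) AS u
""").fetchone()[0]

print(union_total)      # 35000
print(union_all_total)  # 70000
```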
Answered 2012-07-04T13:58:12.247

In the end, a nested query solved the problem.

SELECT DISTINCT a.url
, q.mod_date
-- note: these IF() comparisons yield NULL for a url that has no rows in table_c
, IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date) AS 'min_date'
, IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date) AS 'max_date'
, (PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m')) + 1) AS 'months'
, q.page_views
, ROUND(q.page_views / ((PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m'))) + 1)) AS 'avg_monthly_hits'
FROM table_a a
INNER JOIN
    (SELECT 
            a.url,
                a.mod_date,
                b.min_date AS 'b_min_date',
                b.max_date AS 'b_max_date',
                c.min_date AS 'c_min_date',
                c.max_date AS 'c_max_date',
                IFNULL(b.views, 0) + IFNULL(c.views, 0) AS 'page_views'
        FROM
            table_a a
                LEFT JOIN
            -- one row per url from each log table, so the joins cannot multiply rows
            (SELECT url, MIN(access_date) AS min_date, MAX(access_date) AS max_date, SUM(views) AS views FROM table_b GROUP BY url) b ON a.url = b.url
                LEFT JOIN
            (SELECT url, MIN(access_date) AS min_date, MAX(access_date) AS max_date, SUM(views) AS views FROM table_c GROUP BY url) c ON a.url = c.url
) q
ON a.url = q.url
WHERE ROUND(q.page_views / ((PERIOD_DIFF(DATE_FORMAT(IF(q.b_max_date > q.c_max_date, q.b_max_date, q.c_max_date), '%Y%m'),DATE_FORMAT(IF(q.b_min_date < q.c_min_date, q.b_min_date, q.c_min_date), '%Y%m'))) + 1)) < 5
;
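The month count in this query comes from PERIOD_DIFF on 'YYYYMM' periods plus one, which counts every calendar month touched by the date range. A small Python sketch of that calculation follows; the helper name `months_spanned` is made up for illustration, and the inputs are the /foo/index.html figures from the question.

```python
from datetime import date


def months_spanned(first_date, last_date):
    """Calendar months touched by the range, inclusive of both ends.

    Mirrors PERIOD_DIFF(last 'YYYYMM', first 'YYYYMM') + 1.
    """
    return ((last_date.year - first_date.year) * 12
            + (last_date.month - first_date.month)
            + 1)


# 2009-10 through 2010-02 touches Oct, Nov, Dec, Jan and Feb.
months = months_spanned(date(2009, 10, 1), date(2010, 2, 1))
avg = round(90000 / months)

print(months)  # 5
print(avg)     # 18000
```

Counting whole calendar months gives 18000 for this row, whereas the day-based formula from the question (days / 30.44) gives 22273, so the two approaches are not interchangeable.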
Answered 2012-07-05T02:08:28.620