In an SQLite database table with two columns, 'mID' and 'stars', I have to return the 'mID's with the highest average value of 'stars'.
Given the following data:
Rating
mID   stars
101   2
101   4
106   4
103   2
108   4
108   2
101   3
103   3
104   2
108   4
107   3
106   5
107   5
104   3
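To make this reproducible, the table can be rebuilt with a short script. This is only a minimal sketch: the integer column types are my assumption (they are not part of the data shown here), and the multi-row insert needs SQLite 3.7.11 or later.

```sql
-- Assumed schema: the column types are a guess, only the names come from the data.
create table Rating (mID integer, stars integer);

-- The 14 rows of the Rating table, in the order listed.
insert into Rating (mID, stars) values
  (101, 2), (101, 4), (106, 4), (103, 2), (108, 4), (108, 2), (101, 3),
  (103, 3), (104, 2), (108, 4), (107, 3), (106, 5), (107, 5), (104, 3);
```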
I would first take the average of 'stars' for each 'mID' by grouping on 'mID', like this:
select mID, avg(stars) theAvg
from Rating
group by mID;
As a result, I would get the table of average 'stars' values for each 'mID'.
mID   avg(stars)
101   3.0
103   2.5
104   2.5
106   4.5
107   4.0
108   3.33333333333
If I only needed the highest average value of 'stars',
I could simply apply select max(theAvg) to the result I just calculated.
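As a sketch of what I mean, the grouped query can be wrapped in a subquery and the maximum taken over it. The alias maxAvg is just an illustrative name.

```sql
-- Outer query: the largest of the per-mID averages.
select max(theAvg) as maxAvg
from (select mID, avg(stars) as theAvg   -- inner query: one average per mID
      from Rating
      group by mID);
```

With the data above this gives 4.5, but the value comes back without the 'mID' it belongs to.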
But to get the highest average 'stars' together with its 'mID', I needed something else.
So I used the 'not exists' keyword followed by a subquery that computes the same table of 'mID' and average 'stars' a second time. The subquery is compared against the outer table to check that, for a given average 'stars' value in the outer table R1, there is no row in the second table R2 whose average 'stars' value is greater than R1's:
select mID, theAvg
from (select mID, avg(stars) theAvg
      from Rating
      group by mID) as R1
where not exists (select *
                  from (select mID, avg(stars) theAvg
                        from Rating
                        group by mID) as R2
                  where R2.theAvg > R1.theAvg);
I thought this query would return the highest average 'stars' and its 'mID', but instead I get two tuples, ('mID':106, 'theAvg':4.5) and ('mID':107, 'theAvg':4.0). The desired answer is only one tuple, ('mID':106, 'theAvg':4.5), since we are looking for the highest of all the 'stars' averages.
The result of my query (wrong):
mID   theAvg
106   4.5
107   4.0
The desired result:
mID   theAvg
106   4.5
Which step do you think I got wrong? Any suggestions on how you would do it?