ugeb
asked on
How to structure query with count aggregate
I'm working on a challenge on HackersRank, and brushing up on my SQL skills. I'm given the following table 'Challenges':
So the challenge_id is unique in the table, and hacker_id is the id of the hacker who created the challenge. I'm trying to meet the challenge step by step, so right now I'm working on what will be an inner query.
What I'm trying to figure out is how to get a list of hacker_id's where there is no other hacker that created that exact number of challenges.
Next, I'm trying to get a list of EITHER all hackers who created a unique number of challenges or who DIDN'T create a unique number of challenges. Either way works. Playing around, I tried this:
Why can't I group on cnt1?
I want the query to either include or exclude hackers without a unique number of challenges.
Without changing the approach I'm taking, how can I accomplish this query?
So the challenge_id is unique in the table, and hacker_id is the id of the hacker who created the challenge. I'm trying to meet the challenge step by step, so right now I'm working on what will be an inner query.
What I'm trying to figure out is how to get a list of hacker_id's where there is no other hacker that created that exact number of challenges.
Select c2.hacker_id, Count(c2.challenge_id) cnt2
From Challenges as c2
Group By c2.hacker_id
Order By cnt2 Desc;
This query works in getting the number of challenges created by each hacker. I don't really care about the order in this case, I just have the Order By for viewing intermediate results.Next, I'm trying to get a list of EITHER all hackers who created a unique number of challenges or who DIDN'T create a unique number of challenges. Either way works. Playing around, I tried this:
Select *, count(ch.cnt2) cnt1 From
(Select c2.hacker_id, Count(c2.challenge_id) cnt2
From Challenges as c2
Group By c2.hacker_id) ch,
Challenges c1
Group By cnt1;
I received the error:
ERROR 1056 (42000) at line 1: Can't group on 'cnt1'.
Why can't I group on cnt1?
I want the query to either include or exclude hackers without a unique number of challenges.
Without changing the approach I'm taking, how can I accomplish this query?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thank you, but maybe I wasn't clear (or maybe I don't understand your query). If I'm reading your query correctly, it's returning the set of hackers who created exactly 1 challenge. But that's not what I meant by unique. I want the set of hackers who created a unique number of challenges, not just one challenge.
Let's say the challenges table has entries like:
Challenge_id, hacker_id
7 99
8 99
11 99
12 77
14 77
15 77
17 44
18 44
19 51
In this table, hacker 99 created 3 challenges (7, 8 and 11), hacker 77 also created 3 challenges (12,14,15), hacker 44 created 2 challenges and hacker 51 created 1 challenge. So, because hackers 99 and 77 both created 3 challenges, 3 is not unique. However, hacker 44 created 2 challenges and nobody else created exactly two challenges. So, 2 is unique. Likewise, hacker 51 was the only hacker who created exactly 1 challenge.
Is that clearer?
Let's say the challenges table has entries like:
Challenge_id, hacker_id
7 99
8 99
11 99
12 77
14 77
15 77
17 44
18 44
19 51
In this table, hacker 99 created 3 challenges (7, 8 and 11), hacker 77 also created 3 challenges (12,14,15), hacker 44 created 2 challenges and hacker 51 created 1 challenge. So, because hackers 99 and 77 both created 3 challenges, 3 is not unique. However, hacker 44 created 2 challenges and nobody else created exactly two challenges. So, 2 is unique. Likewise, hacker 51 was the only hacker who created exactly 1 challenge.
Is that clearer?
ASKER
I think we were writing at the same. Yes, this works. I just had to get rid of the hacker_id in the outer Group By clause in your query.
Thanks!
Thanks!
So to find hackers who created unique challenges could be e.g. following:
Open in new window
Here it does not matter whether you use COUNT(*) or COUNT(challenge_id).