databarracks
asked on
SQL Query Using Group By, Distinct but Return All values
Hi there,
I am a little stuck with an SQL query that I use to populate a pivot grid control in my application. I have attached an Excel spreadsheet to illustrate what the source data looks like and what my desired query result should look like.
The issue I am facing is that when I group by User, Date and Stage, the result only contains combinations for which records actually exist within that specific week.
SELECT dbo.MyTable.User, dbo.MyTable.Stage, dbo.MyTable.Date, SUM(dbo.MyTable.Amount) AS Amount, COUNT(DISTINCT dbo.MyTable.Id) AS Total
FROM dbo.MyTable
GROUP BY dbo.MyTable.User, dbo.MyTable.Stage, dbo.MyTable.Date
Could someone please help me with this query? Your help is most appreciated.
SQL-Query-Plot.xls
Add a where count(distinct dbo.mytable.id) >0
ASKER
Hi Arnold,
Apologies, but your response doesn't make sense to me. Could you please explain how this is going to work?
ASKER
I get the message "An aggregate may not appear in the WHERE clause unless it is in a subquery...etc".
You are grouping by dbo.MyTable.User, dbo.MyTable.Stage and dbo.MyTable.Date, meaning by the combination of these three, and then you are using the count of ID, which is unique to each row.
You have the desired result; what is the actual result of your query based on the data set?
Rereading what you are looking for: you want data that does not exist reflected in the report, e.g. Step 2 for Mark on 03/17/2014, which does not exist because Mark did not have Step 2 on that date.
That is not how GROUP BY works, since this is a single table and you
Using WITH (a CTE), get the range of stages:
https://technet.microsoft.com/en-us/library/ms190766%28v=sql.105%29.aspx
You want to get the distinct stages available, then get the data to include them. Let me think on that.....
You must reference the aggregate in a HAVING clause, not a WHERE clause.
After the GROUP BY clause, add:
HAVING count(distinct dbo.mytable.id) >0
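For reference, a sketch of the original query with the HAVING clause in place, using the table and column names from the question (the square brackets are added here because USER is a reserved word in T-SQL). It should run, but HAVING only filters groups that GROUP BY has already produced; it cannot create a group for a User/Date/Stage combination that has no rows.

```sql
-- HAVING is evaluated after GROUP BY, so it can only remove groups.
SELECT dbo.MyTable.[User],
       dbo.MyTable.Stage,
       dbo.MyTable.[Date],
       SUM(dbo.MyTable.Amount)        AS Amount,
       COUNT(DISTINCT dbo.MyTable.Id) AS Total
FROM dbo.MyTable
GROUP BY dbo.MyTable.[User], dbo.MyTable.Stage, dbo.MyTable.[Date]
-- Every group contains at least one row, so this condition is always true
-- (assuming Id is never NULL) and the result matches the unfiltered query.
HAVING COUNT(DISTINCT dbo.MyTable.Id) > 0;
```

A condition such as `> 1` would narrow the result, but no HAVING condition can add the missing combinations.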
ASKER
Hi Paul and Arnold,
I had already added the HAVING after the GROUP BY, and it still doesn't show rows that didn't exist for a grouped date. It basically shows the same rows as my original query.
I think Arnold is right that a CTE might be the correct approach, but I am not sure how this will affect performance and have no idea how to build the query.
I am guessing I need to write a CTE ("WITH") that obtains all the Stage names from the table first and then do some kind of join. I am completely out of my depth here, apologies.
Again I really appreciate your help so far on this matter.
You would need an outer join.
https://social.technet.microsoft.com/Forums/sqlserver/en-US/27266322-8b7e-4f4c-9f85-95418fcee5ef/cte-full-outer-join-question?forum=transactsql
This might be what you need ( might be repeating unneeded.....
;with cte as (select distinct stage from mytable group by stage)
select a.user,cte.stage,sum(a.amount) as Amount,count(distinct a.id) as Count, a.date from mytable a outer join cte on cte.stage=a.stage
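In an outer join only the preserved side is guaranteed to keep all of its rows, so the list of stages has to be on that side, and any filter on the data table has to sit in the ON clause rather than in WHERE. A minimal sketch for a single user and date, assuming the table and columns from the question; the CTE name `stages`, the aliases `s` and `m`, and the literal values 'MARK' and '17/03/2015' are illustrative only.

```sql
-- Hypothetical names: stages, s, m. Table and columns are from the question.
WITH stages AS (
    SELECT DISTINCT Stage FROM dbo.MyTable
)
SELECT s.Stage,
       COALESCE(SUM(m.Amount), 0) AS Amount,  -- 0 when no row matched
       COUNT(DISTINCT m.Id)       AS Total    -- COUNT skips NULLs, so 0 as well
FROM stages AS s                              -- preserved side: one row per stage
LEFT JOIN dbo.MyTable AS m
       ON  m.Stage  = s.Stage
       AND m.[User] = 'MARK'                  -- filters belong in ON; in WHERE
       AND m.[Date] = '17/03/2015'            -- they would drop the unmatched stages
GROUP BY s.Stage;
```

Putting `mytable` on the preserved side instead keeps only the rows that already exist, which is why the direction of the join matters here.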
this result:
| user | date | amount | stage |
|------|------------|--------|--------|
| MARK | 17/03/2015 | 60 | Step 1 |
| MARK | 17/03/2015 | 0 | Step 2 |
| MARK | 17/03/2015 | 40 | Step 3 |
| MARK | 17/03/2015 | 0 | Step 4 |
| MARK | 24/03/2015 | 0 | Step 4 |
| MARK | 24/03/2015 | 5 | Step 3 |
| MARK | 24/03/2015 | 10 | Step 2 |
| MARK | 24/03/2015 | 25 | Step 1 |
| MARK | 31/03/2015 | 10 | Step 1 |
| MARK | 31/03/2015 | 5 | Step 2 |
| MARK | 31/03/2015 | 0 | Step 3 |
| MARK | 31/03/2015 | 2 | Step 4 |
| JOHN | 24/03/2015 | 10 | Step 1 |
| JOHN | 24/03/2015 | 0 | Step 4 |
| JOHN | 24/03/2015 | 0 | Step 3 |
| JOHN | 24/03/2015 | 0 | Step 2 |
| JOHN | 31/03/2015 | 7 | Step 2 |
| JOHN | 31/03/2015 | 3 | Step 2 |
| JOHN | 31/03/2015 | 1 | Step 3 |
| JOHN | 31/03/2015 | 0 | Step 4 |
| JOHN | 31/03/2015 | 9 | Step 1 |
Produced by this query:
select t.[user], t.date, coalesce(m.amount, 0) as amount, t.stage
from (
    -- every (user, date) pair that exists, crossed with every stage
    select *
    from (select distinct [user], date from mytable) ud
    cross join (select distinct stage from mytable) s
) t
left join mytable m
    on t.[user] = m.[user] and t.date = m.date and t.stage = m.stage
order by [user] DESC, date
;
details:
CREATE TABLE MyTable
([ID] int, [USER] varchar(4), [STAGE] varchar(6), [AMOUNT] int, [DATE] varchar(10))
;
INSERT INTO MyTable
([ID], [USER], [STAGE], [AMOUNT], [DATE])
VALUES
(123, 'MARK', 'Step 1', 60, '17/03/2015'),
(124, 'MARK', 'Step 3', 40, '17/03/2015'),
(125, 'MARK', 'Step 1', 25, '24/03/2015'),
(126, 'MARK', 'Step 2', 10, '24/03/2015'),
(127, 'MARK', 'Step 3', 5, '24/03/2015'),
(128, 'MARK', 'Step 2', 5, '31/03/2015'),
(129, 'MARK', 'Step 1', 10, '31/03/2015'),
(130, 'MARK', 'Step 4', 2, '31/03/2015'),
(131, 'JOHN', 'Step 1', 10, '24/03/2015'),
(132, 'JOHN', 'Step 1', 9, '31/03/2015'),
(133, 'JOHN', 'Step 2', 7, '31/03/2015'),
(134, 'JOHN', 'Step 2', 3, '31/03/2015'),
(135, 'JOHN', 'Step 3', 1, '31/03/2015')
;
[1]: http://sqlfiddle.com/#!6/33474/7
[2]: http://sqlfiddle.com/#!6/33474/7/0
ASKER
Hi Arnold,
I am still getting the same results:
Arnold - I had to add a few extra things to your query, such as the GROUP BY, and had to change the join type because a bare OUTER JOIN doesn't exist, so I used LEFT OUTER JOIN instead. The query produced results, but exactly the same as before.
Paul - If you look at your result set above, it doesn't match the desired result in my spreadsheet, as every user should have 4 entries for each date, with all steps displayed.
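A sketch of one way to get exactly one row per combination, building on the cross join idea in Paul's query above and adding the aggregation from the original question. It is not the accepted solution from this thread, only an illustration. It assumes the MyTable definition posted above and assumes that every user should appear for every date and every stage; the aliases `u`, `d`, `s` and `m` are illustrative.

```sql
-- Build the full grid of user x date x stage, then attach and aggregate the data.
SELECT u.[User],
       d.[Date],
       s.Stage,
       COALESCE(SUM(m.Amount), 0) AS Amount,  -- 0 when the combination has no rows
       COUNT(DISTINCT m.Id)       AS Total    -- COUNT skips NULLs, so 0 as well
FROM       (SELECT DISTINCT [User] FROM dbo.MyTable) AS u
CROSS JOIN (SELECT DISTINCT [Date] FROM dbo.MyTable) AS d
CROSS JOIN (SELECT DISTINCT Stage  FROM dbo.MyTable) AS s
LEFT JOIN  dbo.MyTable AS m
       ON  m.[User] = u.[User]
       AND m.[Date] = d.[Date]
       AND m.Stage  = s.Stage
GROUP BY u.[User], d.[Date], s.Stage
ORDER BY u.[User], d.[Date], s.Stage;
```

With the sample rows above this gives 2 users x 3 dates x 4 stages = 24 rows, and John's two Step 2 entries on 31/03/2015 collapse into a single row with Amount 10 and Total 2. If a user should only be listed for dates on which they have at least one record, the first two derived tables can be replaced by `SELECT DISTINCT [User], [Date] FROM dbo.MyTable`, as in Paul's query.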
You have two joins, one on the stage and one on the date.
Co-mingling between different DBs.
...
Is this an assignment?
ASKER
Hi Paul,
You solved it :) Hooray, well done on that one; I really appreciate your help on this. Thank you too, Arnold, for your help; it is much appreciated. P.S. It wasn't an assignment: we are trying to run a few queries from Salesforce for our business, so I was testing to see if what our team required was achievable.
Thanks again guys
ASKER
Excellent stuff from Paul, and a mention to Arnold, who also didn't give up on resolving this matter. Brilliant :)