[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

group by all the columns in where clause?

Posted on 2005-04-09
10
Medium Priority
?
949 Views
Last Modified: 2007-12-19
Hi,

Is this a valid syntax? using group by.. do we need to group by on all the columns we use after "WHERE" ?

If this is not a valid, please help me remake.

>>Where operatorid, dtstart and etc<<

table called inout <--------------------------time records of login/out of employee
columns:
operatorid
timein
timeout

table called #tempshift <---------------------------daily shift of employee
columns
scheduleid
operatorid
dtstart
DTfirstbreakout


SELECT scheduleid, sum(round(datediff(n,r.timein,r.timeout)/60.00,2)) AS exceptiontimeout
INTO #tempexception  
 FROM inout AS r, #tempshift AS s  
 WHERE r.operatorid in (select distinct operatorid from  #tempshift)
And r.timein Between dateadd(n,-10,s.dtstart) And dateadd(n,30,s.dtend)
And r.timeout Between dateadd(n,-30,s.dtstart) And dateadd(n,15,s.dtend)
And s.dtstart Between @shiftdate and (@shiftdate + cast('23:59:59' as datetime)) and excuse <> ' '
 GROUP BY scheduleid
0
Comment
Question by:lynnton
  • 6
  • 4
10 Comments
 
LVL 9

Expert Comment

by:solution46
ID: 13742825
lynnton,

just from preference and mainly to make things easier to read, I would suggest breaking it apart a bit. Assuming you are doing this in a sproc...

DECLARE @minbreakinstart int, @minbreakinstop int

SELECT @minbreakstart = parametervalue from systemparameter where parametername='minbreakinstart'
SELECT @minbreakstop = parametervalue from systemparameter where parametername='minbreakinstop'

SELECT  
    scheduleid,
    Min(r.timein) AS firstbreakin
INTO #tempfirstbin  
FROM inout AS r, #tempshift AS s  
WHERE r.operatorid=s.operatorid
And r.timein Between
dateadd(n, @minbreakinstart, s.DTfirstbreakout)
And dateadd(n, @minbreakinstop), DTfirstbreakout)
And s.DTstart Between @shiftdate and (@shiftdate + cast('23:59:59' as datetime))  

GROUP BY scheduleid


This should work; can you try it and post the result please...

s46.
0
 
LVL 1

Author Comment

by:lynnton
ID: 13742835
solution46,

I've change it abit before you posted,

WHERE r.operatorid in (select distinct operatorid from  #tempshift)

will this speed up the process?

Thanks.
0
 
LVL 9

Accepted Solution

by:
solution46 earned 2000 total points
ID: 13742847
With that part of the script, you'd be better off trying a join...

SELECT  
    scheduleid,
    Min(r.timein) AS firstbreakin
INTO #tempfirstbin  
FROM inout r,
    INNER JOIN #tempshift s
        ON r.operatorid = s.operatorid
WHERE r.timein Between
dateadd(n, @minbreakinstart, s.DTfirstbreakout)
And dateadd(n, @minbreakinstop), DTfirstbreakout)
And s.DTstart Between @shiftdate and (@shiftdate + cast('23:59:59' as datetime))  

GROUP BY scheduleid

s46
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
LVL 1

Author Comment

by:lynnton
ID: 13743117
solution46,

Is this a valid group by? i really just wanted to group by  s.scheduleid. unfortunately inorder to query the other columns sql server tells me to add them to group by...is this the proper way?



Thanks.

GROUP BY s.scheduleid,s.operatorid , s.teamleader , s.operatorname , s.designation,s.dtstart,s.dtend  



SELECT s.operatorid AS operatorid,
s.teamleader as teamleader,
s.operatorname as name,
s.designation as designation,
s.scheduleid, min(r.timein) AS DTstart,
max(r.timeout) AS DTstop,
round(datediff(n,s.dtstart,min(r.timein))/60.00,2) as TotalLate ,
round(datediff(n,max(r.timeout),s.dtend)/60.00,2) as TotalLeaveEarly,
s.dtstart as ShiftStart,s.dtend as ShiftEnd  

INTO #tempinout  
FROM inout AS r
INNER JOIN #tempshift s ON r.operatorid = s.operatorid
WHERE r.timein Between dateadd(n,-9,s.dtstart) And s.dtend
And r.timeout Between dateadd(n,-9,s.dtstart) And dateadd(n,15,s.dtend)
And s.dtstart Between @shiftdate and @shiftdatestop  
GROUP BY s.scheduleid,s.operatorid , s.teamleader , s.operatorname , s.designation,s.dtstart,s.dtend  
0
 
LVL 9

Expert Comment

by:solution46
ID: 13743134
lynnton,

yes, this is correct - I didn't realise about the other columns in the SELECT clause.

The rules of GROUPing are fairly straightforward. Any column that is referenced in the SELECT or ORDER BY clauses must either be aggregated or GROUPed, like this...

SELECT
    Field1,
    Max(Field2)
From MyTable
WHERE Field3 = Value
GROUP BY Field1, Field4
ORDER BY Field4

You can use another clause, HAVING, to filter on the aggregate expressions (it can filter on any expression but it is more efficient to use WHERE to filter non aggregat exressions). e.g. you could add the following line...

HAVING Max(Field2) > 10

What this does is...

SELECT
    Field1,
this bit selects all the values of Field1
   
    Max(Field2)
this selects the maximum value of Field2 for each group (to be defined). At this point it is obvious that we will need to GROUP BY something because otherwise this Max() function has no meaning.

From MyTable
specifies the source data. You can include as many joins as you like, referencing as many fields as you like. None of these need to be included in the GROUP BY clause because this is building up the data that GROUP BY will aggregate.

WHERE Field3 = Value
applies a filter to the source data, BEFORE it is aggregated. Consequently, fields ONLY appearing in the WHERE clause do not need to be GROUPed.

GROUP BY Field1, Field4
at this point, we need to see what fields we want to GROUP BY. First off, look at the SELECT and ORDER BY clauses. Any fields listed there that have not been aggregated need to be included in the GROUP BY clause. The GROUP BY clause will take the base data defined in the SELECT ... FROM ... WHERE part and group by the listed fields.

ORDER BY Field4
The ORDER BY is a modifier applied to the aggregated data in order to sort it. Consequently, and field added into the ORDER BY clause is implicitly SELECTed; SQL Server sees that field as though it is in the SELECT clause.


Hope this helps,

s46.
0
 
LVL 9

Expert Comment

by:solution46
ID: 13743138
er, forgot the HAVING bit...

this is a modifer appliad AFTER the aggregation. In this case, it looks at the maximum values of Field2 and filters out any rows having a vlaue <= 10. This filter could easily be applied at the WHERE clause level; HAVING really becomes useful for doing things like this...

HAVING Max(FieldA) + Max(FieldB) > 10. The WHERE clause could not filter this out as the two maxima may well be on separate rows.


s46.
0
 
LVL 1

Author Comment

by:lynnton
ID: 13743167
solution46,

Would the below query affect results?

Cause sql server is telling me to add those other columns to group by, if it will affect, maybe we can do something? can we jion table again then add those columns that we need?
 
Thanks.

SELECT s.operatorid AS operatorid,  <-----------------------------ugly group by, query1
s.teamleader as teamleader,
s.operatorname as name,
s.designation as designation,
s.scheduleid, min(r.timein) AS DTstart,
max(r.timeout) AS DTstop,
round(datediff(n,s.dtstart,min(r.timein))/60.00,2) as TotalLate ,
round(datediff(n,max(r.timeout),s.dtend)/60.00,2) as TotalLeaveEarly,
s.dtstart as ShiftStart,s.dtend as ShiftEnd  

INTO #tempinout  
FROM inout AS r
INNER JOIN #tempshift s ON r.operatorid = s.operatorid
WHERE r.timein Between dateadd(n,-9,s.dtstart) And s.dtend
And r.timeout Between dateadd(n,-9,s.dtstart) And dateadd(n,15,s.dtend)
And s.dtstart Between @shiftdate and @shiftdatestop  
GROUP BY s.scheduleid,s.operatorid , s.teamleader , s.operatorname , s.designation,s.dtstart,s.dtend  


VS


SELECT  s.scheduleid, <-----------------------------this is the only group by we wanted, query 2
min(r.timein) AS DTstart,
max(r.timeout) AS DTstop,
round(datediff(n,s.dtstart,min(r.timein))/60.00,2) as TotalLate ,
round(datediff(n,max(r.timeout),s.dtend)/60.00,2) as TotalLeaveEarly,
s.dtstart as ShiftStart,s.dtend as ShiftEnd  

INTO #tempinout  
FROM inout AS r
INNER JOIN #tempshift s ON r.operatorid = s.operatorid
WHERE r.timein Between dateadd(n,-9,s.dtstart) And s.dtend
And r.timeout Between dateadd(n,-9,s.dtstart) And dateadd(n,15,s.dtend)
And s.dtstart Between @shiftdate and @shiftdatestop  
GROUP BY s.scheduleid  
0
 
LVL 9

Expert Comment

by:solution46
ID: 13743202
lynnton,

the first query is pefectly acceptable. All the aggregation is doing is showing every unique combination of s.scheduleid,s.operatorid , s.teamleader , s.operatorname , s.designation,s.dtstart,s.dtend and showing the maximum value of r.timeout for these rows. This is a pretty normal way of doing things.

the second query is probably going to fail as you'll still need to GROUP BY dtstart and dtend as well as s.scheduleid.

Sometimes long-winded GROUP BY clauses are unavoidable, particularly if you are including lots of fields from the same table.

s46
0
 
LVL 1

Author Comment

by:lynnton
ID: 13743375
solution46,

Thanks so muc hfor your time and patience
0
 
LVL 9

Expert Comment

by:solution46
ID: 13743407
no probs, good luck with the DB.
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recently we ran in to an issue while running some SQL jobs where we were trying to process the cubes.  We got an error saying failure stating 'NT SERVICE\SQLSERVERAGENT does not have access to Analysis Services. So this is a way to automate that wit…
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
Via a live example, show how to shrink a transaction log file down to a reasonable size.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.
Suggested Courses

873 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question