t-sql aggregate problem

Howard Bash
Howard Bash used Ask the Experts™
on
I have a table with a date and a unique user id.  I would like to generate a resultset that gives me the count of user table inserts per hour.  Here is desired output:
DayOfWeek  |  Date  |  HourInterval  |  UserCount
Monday | 03/05/2009 |  1  |  12
Monday | 03/05/2009 |  2  |  32
Monday | 03/05/2009 |  3  |  19
Monday | 03/05/2009 |  4  |  72
Monday | 03/05/2009 |  5  |  4
Monday | 03/05/2009 |  6  |  11
...



select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  T.MY_DATETIME,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval
  count(T.USER_IDENTIFIER) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')
group by
  T.MY_DATETIME,
  datename(dw,T.MY_DATETIME), 
  datepart(hh,T.ACTLOG_DATETIME)

Open in new window

Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
If you're in SQL 2008 try

 
select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  cast(T.MY_DATETIME as date) as myDate,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval
  count(T.USER_IDENTIFIER) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')
group by
  cast(T.MY_DATETIME as date),
  datename(dw,T.MY_DATETIME), 
  datepart(hh,T.ACTLOG_DATETIME)

Open in new window


If in SQL 2005

 
select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0) as myDate,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval
  count(T.USER_IDENTIFIER) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')
group by
  dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0),
  datename(dw,T.MY_DATETIME), 
  datepart(hh,T.ACTLOG_DATETIME)

Open in new window

Howard BashSenior Software Engineer

Author

Commented:
Both queries seem "close" but the myDate value has lost the valid time and is showing 12:00.
because you need to get rid of the time portion so that the grouping can be done by date only

if you don't want the 12:00 to appear do

...convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103) as myDate, ....

.....
group by convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103) ....
Acronis in Gartner 2019 MQ for datacenter backup

It is an honor to be featured in Gartner 2019 Magic Quadrant for Datacenter Backup and Recovery Solutions. Gartner’s MQ sets a high standard and earning a place on their grid is a great affirmation that Acronis is delivering on our mission to protect all data, apps, and systems.

Howard BashSenior Software Engineer

Author

Commented:
Honestly i would like to show the date and time as is in the database field.
Maybe this is what you're looking for then

select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  T.MY_DATETIME,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval,
  count(T.USER_IDENTIFIER) over (partition by   cast(T.MY_DATETIME as date)) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')

Open in new window

sorry like this
select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  T.MY_DATETIME,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval,
  count(T.USER_IDENTIFIER) over (partition by   dateadd(hh, datediff(hh, 0, T.MY_DATETIME as date), 0)) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')

Open in new window

and btw, you don't need the convert there in your where clause
select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  T.MY_DATETIME,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval,
  count(T.USER_IDENTIFIER) over (partition by   dateadd(hh, datediff(hh, 0, T.MY_DATETIME as date), 0)) as UserCount

from log_hist T

Where
  T.MY_DATETIME> '10/22/2009' AND T.MY_DATETIME < '11/30/2009'

Open in new window

Howard BashSenior Software Engineer

Author

Commented:
Get's error:
Incorrect syntax near the keyword 'as'.
pointing to count(...) line.

also,  the user_identifier counts need to be unique within the "hour" interval being reported on.
can you possible modify the three flavors of queries with this uniqueness and let me know about the error listed here?

Thanks much.  It really is close (I have a few questions though about the dateadd stuff but would rather look at the final solution first).

Again, THANKS!
oops, yeah line 5 above should be

count(T.USER_IDENTIFIER) over (partition by   dateadd(hh, datediff(hh, 0, T.MY_DATETIME), 0)) as UserCount

try it and let me know

I'm just using dateadd ... and datediff to group all the dates to the hour.
Howard BashSenior Software Engineer

Author

Commented:
It now runs, but the resultset is off.  There is no grouping though so ...
>>There is no grouping though so ...<<
Yes, because you cannot have both. If you want the grouping you need to decide which date to show, which in my opinion it makes sense my comment http:#a35779704  with the adjustment in http:#a35780266. Remember that the hour is shown with the HoursInterval column. So, I'm not sure why you're saying that this is not what you expect.

So basically if your data looks like this:

DayOfWeek  |  Date  |  
Monday | 03/05/2009  1:01
Monday | 03/05/2009  1:21
Monday | 03/05/2009  1:45
Monday | 03/05/2009  2:03
Monday | 03/05/2009  3:15
Monday | 03/05/2009  4:02


and you want

DayOfWeek  |  Date  |  HourInterval  |  UserCount
Monday | 03/05/2009 |  1  |  3
Monday | 03/05/2009 |  2  |  1
Monday | 03/05/2009 |  3  |  1
Monday | 03/05/2009 |  4  |  1

you definitively need to drop the time portion in the "date" column. otherwise there's nothing to group and you will end up with the same output as your input.
Please note that when I say drop the time portion, I mean it for display purpose only, not from your table. The original data remains the same, it's just the select query adjusting it so you get the correct count. Here's the query again. Try it and let me know.

select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103) as myDate,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval
  count(T.USER_IDENTIFIER) as UserCount

from log_hist T

Where
  T.MY_DATETIME>convert(datetime,'10/22/2009') AND   T.MY_DATETIME<convert(datetime, '11/30/2009')
group by
  convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103),
  datename(dw,T.MY_DATETIME), 
  datepart(hh,T.ACTLOG_DATETIME)

Open in new window

one more thing, I guess in your example t.My_Datetime and t.ACTLOG_DATETIME are the same column, right?
Howard BashSenior Software Engineer

Author

Commented:
yes
ok, try my last comment http:#a35781088 and let me know.
Howard BashSenior Software Engineer

Author

Commented:
It runs now.  I am not sure if the count of users is unique per hour though.

The MyDate field :   convert(varchar, dateadd(d, datediff(d, 0, T.ACTLOG_DATETIME), 0), 103) as myDate, gives me the date in the format dd/mm/yyyy.  Can you tell me the value to get the format mm/dd/yyyy?
Howard BashSenior Software Engineer

Author

Commented:
Ok. sorry, the format I needed for the dates is "101".    But,  I do think that the user id values per hour are more than once.  The last piece of this is to only have one occurrence per hour per user in the usercount hour buckets.
Howard BashSenior Software Engineer

Author

Commented:
The users must repeat in this query as there are some rows with counts higher than the total number of users.
I'm not following you, Could you post some sample data (input) and the expected result?
Howard BashSenior Software Engineer

Author

Commented:
The table is an activity table for a web application.  Every time a user goes to a page the access is logged with user id,  datetime of access (multiple inserts per user per hour).  I am trying to generate a resultset that represents how many users accessed the web site per hour.  So,  since all the user's accesses are written to this activity table,  the user id will appear several times in a one hour window in this logging table.

The resultset should answer the question: "How many users access this web site per hour hour (not how many times per user which is what we have)"  for each hour of the day for a date range.


It's kind of like we want the hourly bucket to contain the results of a select "distinct" user_identifier count.
Howard BashSenior Software Engineer

Author

Commented:
So,  if user1 access the web site 5 times at 9:00pm today and user34 access the web site 12 times within the same 9:00PM "bucket" the output would be (assuming just these two accessed the site):

DayOfWeek  |  Date  |  HourInterval  |  UserCount
Tuesday | 05/17/2011 |  21  |  2


Howard BashSenior Software Engineer

Author

Commented:
DayOfWeek  |  Date  |  HourInterval  |  UserCount
Tuesday | 05/17/2011 |  21  |  2

and then another 3 users access the site at 10:15PM

DayOfWeek  |  Date  |  HourInterval  |  UserCount
Tuesday | 05/17/2011 |  21  |  2
Tuesday | 05/17/2011 |  22  |  3



Howard BashSenior Software Engineer

Author

Commented:
What you showed for input and output could be correct in post: ID: 35781040
So, if each access was by a different users for entries within the "same hour", the output is what I am trying to obtain.

Alternately, if every access was by one user your data would look like this:

DayOfWeek  |  Date  |  
Monday | 03/05/2009  1:01
Monday | 03/05/2009  2:03
Monday | 03/05/2009  3:15
Monday | 03/05/2009  4:02

(note that the 1:00 basket only has 1 user and so that is it's only row above)

yielding:

DayOfWeek  |  Date  |  HourInterval  |  UserCount
Monday | 03/05/2009 |  1  |  1
Monday | 03/05/2009 |  2  |  1
Monday | 03/05/2009 |  3  |  1
Monday | 03/05/2009 |  4  |  1

Then I think you just need to add DISTINCT in the count there. check the below:


select 
  datename(dw,T.MY_DATETIME) as DayOfTheWeek,
  convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103) as myDate,
  datepart(hh,T.ACTLOG_DATETIME) as HourInterval,
  count(distinct T.USER_IDENTIFIER) as UserCount

from log_hist T

Where
  T.MY_DATETIME>'10/22/2009' AND T.MY_DATETIME< '11/30/2009'
group by
  convert(varchar, dateadd(d, dateadiff(d, 0, T.MY_DATETIME), 0), 103),
  datename(dw,T.MY_DATETIME), 
  datepart(hh,T.ACTLOG_DATETIME)

Open in new window

Howard BashSenior Software Engineer

Author

Commented:
Just right.  I bow to the master.  If you're ever in NYC...
A minor note,  you have a typo:  the function datediff you typed as dateadiff (an extra a).
Howard BashSenior Software Engineer

Author

Commented:
An extremely knowledgeable and patient SQL master.
Glad to help!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial