Frosty555
asked on
Simple SQL query question (group by clause)
I have a table in SQLite that looks like this:
ID COMPUTERNAME USERNAME ACTION ACTIONDATE
========================== ========== ========== ===
1 COMPANYPC1 john login Jan 1 2014 8:06am
2 COMPANYPC1 john logout Jan 1 2014 8:24am
3 COMPANYPC2 mary login Jan 4 2014 9:15am
4 COMPANYPC3 mary login Jan 6 2014 9:24am
5 COMPANYPC2 john login Jan 7 2014 12:33pm
Basically it's a log of what user has logged in/out of what PC, at what time. It's automatically populated by various scripts and group policies on the network.
I need to determine a list of Most recently used computers
So that's easy, I do something like this:
SELECT computername, max(actiondate) as last_login_date
FROM history
WHERE action='login'
GROUP BY computername
ORDER BY last_login_date desc
That nicely gives me a list of computers, and by using the "max()" aggregate clause I can get the most recent "date" that it was logged in.
But... how can I obtain the USERNAME and ACTION?
I want something like this:
COMPUTERNAME LAST_ACTION_DATE ACTION BY_WHOM
COMPANYPC1 Jan 7 2014 12:06pm login mary
COMPANYPC2 Jan 3 2014 08:20am logout john
... etc
ID COMPUTERNAME USERNAME ACTION ACTIONDATE
==========================
1 COMPANYPC1 john login Jan 1 2014 8:06am
2 COMPANYPC1 john logout Jan 1 2014 8:24am
3 COMPANYPC2 mary login Jan 4 2014 9:15am
4 COMPANYPC3 mary login Jan 6 2014 9:24am
5 COMPANYPC2 john login Jan 7 2014 12:33pm
Basically it's a log of what user has logged in/out of what PC, at what time. It's automatically populated by various scripts and group policies on the network.
I need to determine a list of Most recently used computers
So that's easy, I do something like this:
SELECT computername, max(actiondate) as last_login_date
FROM history
WHERE action='login'
GROUP BY computername
ORDER BY last_login_date desc
That nicely gives me a list of computers, and by using the "max()" aggregate clause I can get the most recent "date" that it was logged in.
But... how can I obtain the USERNAME and ACTION?
I want something like this:
COMPUTERNAME LAST_ACTION_DATE ACTION BY_WHOM
COMPANYPC1 Jan 7 2014 12:06pm login mary
COMPANYPC2 Jan 3 2014 08:20am logout john
... etc
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
The above answer is correct, and if you want to read more, here's an article on SQL Server GROUP BY Solutions that would be an excellent read for you.
No, that answer's not correct.
SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
FROM (
SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
FROM history
WHERE action='login'
GROUP BY computername
) AS derived
WHERE
row_num = 1
--ORDER BY computername
SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
FROM (
SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
FROM history
WHERE action='login'
GROUP BY computername
) AS derived
WHERE
row_num = 1
--ORDER BY computername
ASKER
When I use Randy's query above, the resulting recordset lists the same computer in the "computername" field many times in many different records, rather than just showing a single record for each computer.
It seems like I'm getting every permutation of computername/username/acti on in the recordset, but waht I'm looking for is just a single record for each computer, with the latest action date, and the user / action for that latest record.
It seems like I'm getting every permutation of computername/username/acti
CORRECTION: I forgot to remove the "GROUP BY" from my query:
SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
FROM (
SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
FROM history
WHERE action='login'
--GROUP BY computername --remove this line
) AS derived
WHERE
row_num = 1
ORDER BY computername
SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
FROM (
SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
FROM history
WHERE action='login'
--GROUP BY computername --remove this line
) AS derived
WHERE
row_num = 1
ORDER BY computername
ASKER
This seems to work, if I remove the other GROUP BY clauses. Is this valid?
SELECT computername, max(actiondate) as last_login_date, ACTION, BY_WHOM
FROM history
WHERE action='login'
GROUP BY computername
ORDER BY last_login_date desc
I don't see how that could run in SQL Server, as "BY_WHOM" is not valid in the SELECT with that GROUP BY (column "ACTION" is not valid either, but you can get around that in this case by hard-coding 'login' as the "ACTION" column in the SELECT list).
Think I know what you are looking for, now this does assume that ID is an identity field and that ActionDate is inserted in order ie (ID 5 will have a later date/time then 4)
SELECT computername COMPUTERNAME, ActionDate LAST_ACTION_DATE,Action ACTION,Username BY_WHOM from history where id in (Select max(ID) from history group by computername) order by 2
>>I want something like this:
COMPUTERNAME LAST_ACTION_DATE ACTION BY_WHOM
COMPANYPC1 Jan 7 2014 12:06pm login mary
COMPANYPC2 Jan 3 2014 08:20am logout john<<
Those dates and times don't match anything in the original sample data, so it's difficult to see how a username gets connected to them. Also, this shows an action of logout when the max date and time is based only on login actions. I believe ScottPletcher is on the right track, but it might help to provide some realistic sample data and what you expect to see as a result from that data.
COMPUTERNAME LAST_ACTION_DATE ACTION BY_WHOM
COMPANYPC1 Jan 7 2014 12:06pm login mary
COMPANYPC2 Jan 3 2014 08:20am logout john<<
Those dates and times don't match anything in the original sample data, so it's difficult to see how a username gets connected to them. Also, this shows an action of logout when the max date and time is based only on login actions. I believe ScottPletcher is on the right track, but it might help to provide some realistic sample data and what you expect to see as a result from that data.
ASKER
Yeah I can see how my fabricated data and the altered column names are making this very confusing. Here is some real data:
Here's my current real query:
And basically what I'm looking for is to get a list of all the computers, and each record showing the most recent activity that happened for that computer, e.g.
From the looks of it I THINK that the query I posted above is actually working. But I don't know if that's just a fluke and I should really be using a nested query like what Randy and Scott are suggesting
id username computername action actiondate
------------------------------------------------------------------------------------
333 mary COMPANYPC21 login 2014-05-18 13:26:35
332 johndoe COMPANYPC11 login 2014-05-18 13:25:15
331 neil COMPANYPC03 login 2014-05-18 12:58:19
330 peter COMPANYPC16 logout 2014-05-18 12:55:13
329 neil COMPANYPC03 logout 2014-05-18 12:45:52
328 neil COMPANYPC03 login 2014-05-18 11:18:53
327 george COMPANYPC08 login 2014-05-18 11:13:26
326 george COMPANYPC08 logout 2014-05-18 11:09:14
325 george COMPANYPC03 logout 2014-05-18 11:04:26
324 derek COMPANYPC18 login 2014-05-18 11:04:17
323 derek COMPANYPC18 logout 2014-05-18 11:01:29
322 julie COMPANYPC24 logout 2014-05-18 10:55:54
321 julie COMPANYPC24 login 2014-05-18 10:26:58
320 julie COMPANYPC24 logout 2014-05-18 10:21:33
319 julie COMPANYPC24 login 2014-05-18 10:10:05
318 julie COMPANYPC24 logout 2014-05-18 09:50:14
317 jim COMPANYPC17 login 2014-05-18 09:39:46
316 peter COMPANYPC16 login 2014-05-18 09:32:18
315 jerry COMPANYPC13 login 2014-05-18 09:15:49
314 mary COMPANYPC21 login 2014-05-18 09:12:14
313 johndoe COMPANYPC30 login 2014-05-18 09:11:21
Here's my current real query:
SELECT
computername,
datetime(max(actiondate), 'localtime') as last_action_date,
action as action,
username as by_whom
FROM history
WHERE actiondate > julianday(datetime('now'))-365
GROUP BY computername
ORDER BY computername
LIMIT 100
And basically what I'm looking for is to get a list of all the computers, and each record showing the most recent activity that happened for that computer, e.g.
(this is fabricated)
computername last_action_date action by_whom
--------------------------------------------------------------------
COMPANYPC03 2014-05-18 12:58:19 login neil
COMPANYPC08 2014-05-18 11:13:26 login george
COMPANYPC11 2014-05-18 13:25:15 login johndoe
COMPANYPC16 2014-05-18 12:55:13 logout peter
COMPANYPC18 2014-05-18 11:04:17 login derek
COMPANYPC21 2014-05-18 13:26:35 login mary
...etc
From the looks of it I THINK that the query I posted above is actually working. But I don't know if that's just a fluke and I should really be using a nested query like what Randy and Scott are suggesting
Not sure why your query works given that the use of aggregate functions (e.g. max) requires a group by clause with all non-aggregate columns in the select statement. In this case your query should have grouped by computername, action, and username and not just computername. I'm not sure why the list of what you're looking for doesn't include jerry, jim, or julie although I suspect they are part of the ... etc. since you indicated the list as being fabricated. At any rate, I think the following might get you what you want:
select computername, action_date as last_action_date, action, by_whom from
(select computername, action_date, action, username as by_whom,
row_number() over (partition by computername order by action_date desc) rn
from history
where action_date > julianday(datetime('now')) -365
) as x
where x.rn = 1
order by computername;
select computername, action_date as last_action_date, action, by_whom from
(select computername, action_date, action, username as by_whom,
row_number() over (partition by computername order by action_date desc) rn
from history
where action_date > julianday(datetime('now'))
) as x
where x.rn = 1
order by computername;
ASKER
Not sure why your query works given that the use of aggregate functions (e.g. max) requires a group by clause with all non-aggregate columns in the select statement.That's what I'm wondering too. The inclusion of the "action" and "username" fields seems ambiguous. If it is grouping by the "computername" field only, how does it know which record in the group it should use for the values of the "action" and "username" fields? Does it somehow intelligently figure it out because the max() aggregate function being used for action_date? Or is it just a fluke and it's returning the first record available, whether or not it makes logical sense to do so?
I'm having trouble with the query you posted, though. I'm getting a "Near "(": syntax error(1)"
It looks like it doesn't like the nested query. Are you sure sub-queries in the FROM clause are supported in SQlite? All the examples I've found involve using a subquery in the WHERE clause, not the FROM clause
Here's my full query:
select
computername, datetime(max(actiondate), 'localtime') as last_action_date, action, username as by_whom
from
( select
computername, action_date, action, username,
row_number() over (partition by computername order by action_date desc) rn
from history
where action_date > julianday(datetime('now'))-365
) as x
where x.rn = 1
order by computername;
I'm not familiar with sqlite and perhaps the problem is that it doesn't recognize the row_number() function. What does the following query produce?
select computername, action_date, row_number() over (partition by computername order by action_date) as rn from history
select computername, action_date, row_number() over (partition by computername order by action_date) as rn from history