Link to home
Start Free TrialLog in
Avatar of Frosty555
Frosty555Flag for Canada

asked on

Simple SQL query question (group by clause)

I have a table in SQLite that looks like this:

ID      COMPUTERNAME      USERNAME     ACTION    ACTIONDATE
=================================================
1         COMPANYPC1            john               login         Jan 1 2014 8:06am
2         COMPANYPC1            john               logout       Jan 1 2014 8:24am
3         COMPANYPC2            mary              login         Jan 4 2014 9:15am
4         COMPANYPC3            mary              login         Jan 6 2014 9:24am
5         COMPANYPC2            john               login         Jan 7 2014 12:33pm


Basically it's a log of what user has logged in/out of what PC, at what time. It's automatically populated by various scripts and group policies on the network.

I need to determine a list of Most recently used computers

So that's easy, I do something like this:

SELECT computername, max(actiondate) as last_login_date
FROM history
WHERE action='login'
GROUP BY computername
ORDER BY last_login_date desc

That nicely gives me a list of computers, and by using the "max()" aggregate clause I can get the most recent "date" that it was logged in.

But... how can I obtain the USERNAME and ACTION?  

I want something like this:

COMPUTERNAME      LAST_ACTION_DATE            ACTION         BY_WHOM
COMPANYPC1            Jan 7 2014 12:06pm           login              mary
COMPANYPC2            Jan 3 2014 08:20am            logout           john

... etc
ASKER CERTIFIED SOLUTION
Avatar of Randy Poole
Randy Poole
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Jim Horn
The above answer is correct, and if you want to read more, here's an article on SQL Server GROUP BY Solutions that would be an excellent read for you.
No, that answer's not correct.

SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
FROM (
    SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
        ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
    FROM history
    WHERE action='login'
    GROUP BY computername
) AS derived
WHERE
    row_num = 1
--ORDER BY computername
Avatar of Frosty555

ASKER

When I use Randy's query above, the resulting recordset lists the same computer in the "computername" field many times in many different records, rather than just showing a single record for each computer.

It seems like I'm getting every permutation of computername/username/action in the recordset, but waht I'm looking for is just a single record for each computer, with the latest action date, and the user / action for that latest record.
CORRECTION: I forgot to remove the "GROUP BY" from my query:

SELECT computername, ACTIONDATE AS last_login_date, ACTION, BY_WHOM
 FROM (
     SELECT computername, ACTIONDATE, ACTION, BY_WHOM,
         ROW_NUMBER() OVER(PARTITION BY computername ORDER BY actiondate DESC) AS row_num
     FROM history
     WHERE action='login'
     --GROUP BY computername  --remove this line
) AS derived
 WHERE
     row_num = 1
ORDER BY computername
This seems to work, if I remove the other GROUP BY clauses. Is this valid?

SELECT computername, max(actiondate) as last_login_date, ACTION, BY_WHOM 
FROM history 
WHERE action='login' 
GROUP BY computername
ORDER BY last_login_date desc

Open in new window

I don't see how that could run in SQL Server, as "BY_WHOM" is not valid in the SELECT with that GROUP BY (column "ACTION" is not valid either, but you can get around that in this case by hard-coding 'login' as the "ACTION" column in the SELECT list).
Think I know what you are looking for, now this does assume that ID is an identity field and that ActionDate is inserted in order ie (ID 5 will have a later date/time then 4)
SELECT computername COMPUTERNAME, ActionDate LAST_ACTION_DATE,Action ACTION,Username BY_WHOM from history where id in (Select max(ID) from history group by computername) order by 2

Open in new window

>>I want something like this:
COMPUTERNAME      LAST_ACTION_DATE            ACTION         BY_WHOM
COMPANYPC1            Jan 7 2014 12:06pm           login              mary
COMPANYPC2            Jan 3 2014 08:20am            logout           john<<
Those dates and times don't match anything in the original sample data, so it's difficult to see how a username gets connected to them. Also, this shows an action of logout when the max date and time is based only on login actions. I believe ScottPletcher is on the right track, but it might help to provide some realistic sample data and what you expect to see as a result from that data.
Yeah I can see how my fabricated data and the altered column names are making this very confusing. Here is some real data:

id    username     computername               action         actiondate
------------------------------------------------------------------------------------
333	mary		COMPANYPC21		login	2014-05-18 13:26:35
332	johndoe		COMPANYPC11		login	2014-05-18 13:25:15
331	neil		COMPANYPC03		login	2014-05-18 12:58:19
330	peter		COMPANYPC16		logout	2014-05-18 12:55:13
329	neil		COMPANYPC03		logout	2014-05-18 12:45:52
328	neil		COMPANYPC03		login	2014-05-18 11:18:53
327	george		COMPANYPC08		login	2014-05-18 11:13:26
326	george		COMPANYPC08		logout	2014-05-18 11:09:14
325	george		COMPANYPC03		logout	2014-05-18 11:04:26
324	derek		COMPANYPC18		login	2014-05-18 11:04:17
323	derek		COMPANYPC18		logout	2014-05-18 11:01:29
322	julie		COMPANYPC24		logout	2014-05-18 10:55:54
321	julie		COMPANYPC24		login	2014-05-18 10:26:58
320	julie		COMPANYPC24		logout	2014-05-18 10:21:33
319	julie		COMPANYPC24		login	2014-05-18 10:10:05
318	julie		COMPANYPC24		logout	2014-05-18 09:50:14
317	jim		COMPANYPC17		login	2014-05-18 09:39:46
316	peter		COMPANYPC16		login	2014-05-18 09:32:18
315	jerry		COMPANYPC13		login	2014-05-18 09:15:49
314	mary		COMPANYPC21		login	2014-05-18 09:12:14
313	johndoe		COMPANYPC30		login	2014-05-18 09:11:21

Open in new window


Here's my current real query:

SELECT 
     computername,
     datetime(max(actiondate), 'localtime') as last_action_date, 
     action as action, 
     username as by_whom
FROM history
WHERE actiondate > julianday(datetime('now'))-365
GROUP BY computername
ORDER BY computername 
LIMIT 100

Open in new window


And basically what I'm looking for is to get a list of all the computers, and each record showing the most recent activity that happened for that computer, e.g.

(this is fabricated)

computername        last_action_date            action     by_whom
--------------------------------------------------------------------
COMPANYPC03        2014-05-18 12:58:19      login      neil
COMPANYPC08        2014-05-18 11:13:26      login      george
COMPANYPC11        2014-05-18 13:25:15      login      johndoe
COMPANYPC16        2014-05-18 12:55:13      logout    peter
COMPANYPC18        2014-05-18 11:04:17      login      derek
COMPANYPC21        2014-05-18 13:26:35      login      mary       
...etc

Open in new window


From the looks of it I THINK that the query I posted above is actually working. But I don't know if that's just a fluke and I should really be using a nested query like what Randy and Scott are suggesting
Not sure why your query works given that the use of aggregate functions (e.g. max) requires a group by clause with all non-aggregate columns in the select statement. In this case your query should have grouped by computername, action, and username and not just computername. I'm not sure why the list of what you're looking for doesn't include jerry, jim, or julie although I suspect they are part of the ... etc. since you indicated the list as being fabricated. At any rate, I think the following might get you what you want:
select computername, action_date as last_action_date, action, by_whom from
(select computername, action_date, action, username as by_whom,
 row_number() over (partition by computername order by action_date desc) rn
 from history
 where action_date > julianday(datetime('now'))-365
) as x
where x.rn = 1
order by computername;
Not sure why your query works given that the use of aggregate functions (e.g. max) requires a group by clause with all non-aggregate columns in the select statement.
That's what I'm wondering too. The inclusion of the "action" and "username" fields seems ambiguous. If it is grouping by the "computername" field only, how does it know which record in the group it should use for the values of the "action" and "username" fields? Does it somehow intelligently figure it out because the max() aggregate function being used for action_date? Or is it just a fluke and it's returning the first record available, whether or not it makes logical sense to do so?

I'm having trouble with the query you posted, though. I'm getting a "Near "(": syntax error(1)"

It looks like it doesn't like the nested query.  Are you sure sub-queries in the FROM clause are supported in SQlite? All the examples I've found involve using a subquery in the WHERE clause, not the FROM clause

Here's my full query:

select 
     computername, datetime(max(actiondate), 'localtime') as last_action_date, action, username as by_whom 
from 
    ( select 
             computername, action_date, action, username, 
             row_number() over (partition by computername order by action_date desc) rn 
       from history 
       where action_date > julianday(datetime('now'))-365
    ) as x 
where x.rn = 1 
order by computername;

Open in new window

I'm not familiar with sqlite and perhaps the problem is that it doesn't recognize the row_number() function. What does the following query produce?
select computername, action_date, row_number() over (partition by computername order by action_date) as rn from history