• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 341
  • Last Modified:

Grabbing latest 4 records of same number with SQL

Hi I have a table of million records and the sample data would be like bellow:

TIMESTAMP               |CUSTOMER NAME
21/01/2013 22:22:23| Johan
21/01/2013 22:22:21| Johan
21/01/2013 22:22:25| Johan
21/01/2013 22:22:26| Johan
21/01/2013 22:22:27| Johan
21/01/2013 22:22:28| Johan
21/01/2013 22:22:29| Johan
21/01/2013 22:22:23| Mike
21/01/2013 22:22:21| Mike
21/01/2013 22:22:25| Mike
21/01/2013 22:22:26| Mike
21/01/2013 22:22:27| Mike
21/01/2013 22:22:28| Mike
21/01/2013 22:22:29| Mike

so what I need is a query in postgres sql where it should grab the latest 3 records of JOHN/MIKE from table, by latest what I mean is that it should select only 3 records of them that recorded recently

thank
0
hi4ppl
Asked:
hi4ppl
  • 10
  • 10
1 Solution
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
Checking the PostGreSQL documentation, as it has ROW_NUMBER() window function, you can use the syntax specified in this article to do this:
http://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html

select *
  from ( select CUSTOMER_NAME,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY CUSTOMER_NAME ORDER BY  TIMESTAMP          DESC ) rn
            FROM yourtable
 ) sq
where sq.rn <= 3 

Open in new window


a index on customer_name + timestamp should make sure you get the best performance.
0
 
hi4pplAuthor Commented:
Hi,

I have one table so there is no join involved here all data is in one flat table, anyway this is possible by query or procedure language.

thanks for help
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
my query example should do this, there is no join involved.
the article is indeed about "joins", but that's only the usual case when you have "duplicates".
0
Take Control of Web Hosting For Your Clients

As a web developer or IT admin, successfully managing multiple client accounts can be challenging. In this webinar we will look at the tools provided by Media Temple and Plesk to make managing your clients’ hosting easier.

 
hi4pplAuthor Commented:
Well when I run this I get zero result but yeah I have added data as well but it's not giving any result.

select *
  from ( select address,name,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY address.name ORDER BY  TIMESTAMP DESC ) rn
            FROM tablename
 ) sq
where sq.rn <= 3 and to_char(timestamp,'DD-MM-YYYY')='01-12-2012'

Open in new window


thanks for help
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
this query will only return results if users have such timestamp values as top 3.
not sure if that will be really the case?!
0
 
hi4pplAuthor Commented:
Well partially yes but I would like to add to this that what I want is that go grab the latest three record of that username he might have done 1000 transaction but I want to grab the latest three from that day and im not sure if it does that... because there is no way that I don't have latest 3 record out million record which this query returns zero .

thanks
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
then you have to move that condition to the subquery
select *
  from ( select address,name,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY address.name ORDER BY  TIMESTAMP DESC ) rn
            FROM tablename
       WHERE to_char(timestamp,'DD-MM-YYYY')='01-12-2012'
 ) sq
where sq.rn <= 3   

Open in new window

0
 
hi4pplAuthor Commented:
Well it might be a dummy question but this gives me out of like following: I didn't understand what it means? sorry im not that advance

address      | Name|TIME      | rn
somwhere |          | sometime | 2
somwhere |          | sometime | 1
somwhere |          | sometime | 3

so what does it means the main thing is which is name it does not shows and plus taking example above I given in my question I need something like bellow output:


Somaddress|21/01/2013 22:22:23| Johan
Somaddress|21/01/2013 22:22:21| Johan
Somaddress|21/01/2013 22:22:25| Johan
Somaddress|21/01/2013 22:22:23| Mike
Somaddress|21/01/2013 22:22:21| Mike
Somaddress|21/01/2013 22:22:25| Mike

above output is my goal
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
you can do that "finetuning" like this.
select address, timestamp, name
  from ( select address,name,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY address.name ORDER BY  TIMESTAMP DESC ) rn
            FROM tablename
       WHERE to_char(timestamp,'DD-MM-YYYY')='01-12-2012'
 ) sq
where sq.rn <= 3   
order by sq.name, rn 

Open in new window

0
 
hi4pplAuthor Commented:
Hi thanks I get this error it says the column sq.name does not exist what exactly that column is? thanks
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
I presume you don't use the sql as I wrote it ...

the "sq." refers to the subquery alias ...
0
 
hi4pplAuthor Commented:
Hi no I run it as it's no changes the only things I do is changing the fields and database name to it's actual
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
let's see... you put FROM tablename, but PARTITION BY address.name ...
so, there MUST be something you change in the query ... or you need to change  ...

select *
  from ( select address, name,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY address.name ORDER BY  TIMESTAMP DESC ) rn
            FROM address
    WHERE to_char(timestamp,'DD-MM-YYYY')='01-12-2012'
 ) sq
where rn <= 3
order by name, rn  

Open in new window


or, you changed the "." between address.name there into a comma (which is wrong in regards to the required results)


select *
  from ( select address, name,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY name ORDER BY  TIMESTAMP DESC ) rn
            FROM yourtablename
    WHERE to_char(timestamp,'DD-MM-YYYY')='01-12-2012'
 ) sq
where rn <= 3
order by name, rn  

Open in new window

0
 
hi4pplAuthor Commented:
still it does not give me the result that I was asking I m running the same query but changing the files, I cannot post the exact fields if there is way I can send you PM? I'm not sure if there is any?
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
Check my profile
0
 
hi4pplAuthor Commented:
Hi I have manage to make the changes on this but when I run I get the following I don't even know what RN means here I thought it shows number of records that this account have but when I check on the same date it has only one which shows the output.

amount	accountnumber	timestamp	rn
200	          4872036	1/1/2013 10:38	1
200	          4903993	1/1/2013 10:31	2
200	          2442270	1/1/2013 10:30	3
3000	7728031	        1/1/2013 23:59	1
3000	2161910	        1/1/2013 23:59	2
5000	7145383	        1/1/2013 23:58	3
9900	2145034	       1/1/2013 12:04	1

Open in new window


with the bellow changes on the query

select *
  from ( select amount, accountnumber,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY amount ORDER BY  TIMESTAMP DESC ) rn
            FROM accounts
    WHERE to_char(timestamp,'DD-MM-YYYY')='01-01-2013'
 ) sq
where rn <= 3
and amount>0 
order by amount, rn  

Open in new window


I'm not sure if I didn't make myself clear as there is misunderstand here but the result that i'm looking for from this would be bellow

amount	accountnumber	timestamp	
200	          4872036	1/1/2013 10:38	
500	          4872036	1/2/2013 10:38	
600	          4872036	1/3/2013 10:38	
200	          4903993	1/1/2013 10:31	
700	          4903993	1/3/2013 10:31	
600	          4903993	1/4/2013 10:31	

Open in new window


this should continue grabbing  3 records for each account the latest one  or the old one no matter how many records they have I need the three latest one

thanks for the help.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
so you want to PARTITION BY account, and not by amount ...
also, you want to move the AMOUNT> 0 condition to the subquery, I think

RN is the alias for the ROW_NUMBER() expression.

select *
  from ( select amount, accountnumber,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY accountnumber ORDER BY  TIMESTAMP DESC ) rn
            FROM accounts
    WHERE to_char(timestamp,'DD-MM-YYYY')='01-01-2013'
    and amount>0 
 ) sq
where rn <= 3
order by accountnumber , rn  

Open in new window

0
 
hi4pplAuthor Commented:
Okay thanks I got the result so the result that is coming in the rn what does that means like I have numbers their from 1-3 ?

thanks for help
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
when you remove this line:
where rn <= 3
you should see what it means...

in short, the ROW_NUMBER() function establies a "row number" per "partition" (in this case, the account num), based on the ORDER BY criteria.

the subquery is written to make it possible to filter on that expression, so take only the "first 3" .

if you don't want the rn field to appear, replace the SELECT * by the SELECT amount, accountnumber,  TIMESTAMP  in the outer query

select amount, accountnumber,  TIMESTAMP 
  from ( select amount, accountnumber,  TIMESTAMP          
                 , ROW_NUMBER() OVER ( PARTITION BY accountnumber ORDER BY  TIMESTAMP DESC ) rn
            FROM accounts
    WHERE to_char(timestamp,'DD-MM-YYYY')='01-01-2013'
    and amount>0 
 ) sq
where rn <= 3
order by accountnumber , rn  

Open in new window

0
 
hi4pplAuthor Commented:
thanks
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 10
  • 10
Tackle projects and never again get stuck behind a technical roadblock.
Join Now