Solved

How to limit distinct or grouped results

Posted on 2011-02-11
Last Modified: 2012-05-11
Using SQLite, I'm trying to limit a result set to an arbitrary number of rows. That is, if I have something like:

Theatre   Show               Time
1A            ExpertZilla    10:00 am
1A            ExpertZilla    11:00 am
1A            ExpertZilla    12:00 pm
1A            SQLHorrors     01:00 pm
1A            ExpertZilla    02:00 pm
1A            ExpertZilla    03:00 pm
1A            ExchangeThra   04:00 pm
1A            ExchangeThra   05:00 pm
1A            ExchangeThra   06:00 pm
1A            ExchangeThra   07:00 pm
1A            ExchangeThra   08:00 pm
1A            ExchangeThra   09:00 pm
1A            SQLHorrors     10:00 pm
1A            SQLHorrors     12:00 pm

What I'm looking for is: for any given theatre, for any given show, the top N times.
(I'll add 'start time' to my query later, for simplicity just assume the first N).

So, if N=3  I want to see:
1A            ExpertZilla    10:00 am
1A            ExpertZilla    11:00 am
1A            ExpertZilla    12:00 pm
1A            SQLHorrors     10:00 pm
1A            SQLHorrors     12:00 pm
1A            SQLHorrors     01:00 pm
1A            ExchangeThra   04:00 pm
1A            ExchangeThra   05:00 pm
1A            ExchangeThra   06:00 pm

I was thinking of a limit on a sub-select, but can't quite work out how to get it to do what I need.

TIA.

EdB
Question by:edbored
10 Comments
 
LVL 40

Expert Comment

by:Sharath
If you are working in SQL Server 2005 or later, you can use ROW_NUMBER.
select Theatre,Show,Time
  from (
select Theatre,Show,Time,
       row_number() over (partition by Theatre,Show order by Time) rn
  from your_table) t1
 where rn<=3
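
A note for readers on a newer SQLite: window functions, including ROW_NUMBER, were added in SQLite 3.25.0, so the query above should run there as written. The following is a minimal sketch through Python's sqlite3 module, not something from the original posts; the table name `showings`, the 24-hour time strings and the helper `top_n_per_show` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; times are stored as 24-hour text so they sort correctly.
conn.execute("CREATE TABLE showings (theatre TEXT, show TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?, ?, ?)",
    [
        ("1A", "ExpertZilla", "10:00"), ("1A", "ExpertZilla", "11:00"),
        ("1A", "ExpertZilla", "12:00"), ("1A", "ExpertZilla", "14:00"),
        ("1A", "SQLHorrors", "13:00"), ("1A", "SQLHorrors", "22:00"),
    ],
)

def top_n_per_show(conn, n):
    """Return the first n times per (theatre, show) using ROW_NUMBER()."""
    sql = """
        SELECT theatre, show, time
          FROM (SELECT theatre, show, time,
                       ROW_NUMBER() OVER (PARTITION BY theatre, show
                                          ORDER BY time) AS rn
                  FROM showings)
         WHERE rn <= ?
         ORDER BY theatre, show, time
    """
    return conn.execute(sql, (n,)).fetchall()

# Window functions need SQLite 3.25.0 or later; skip quietly on older builds.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    for row in top_n_per_show(conn, 3):
        print(row)
```

On an older SQLite build the guard skips the query, and the correlated-subquery form is the fallback.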

 
LVL 40

Accepted Solution

by:Sharath
Sharath earned 250 total points
If you are working in SQL Server 2000 or MySQL, you need to emulate ROW_NUMBER with a correlated COUNT(*) subquery, like this.
select Theatre,Show,Time
  from (
select Theatre,Show,Time,
       (select count(*) from your_table as t1 
         where t1.Theatre = t2.Theatre and t1.Show = t2.Show and t1.Time <= t2.Time) as rn
  from your_table as t2) as t3
 where rn <= 3
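
The same correlated-count query can be tried against SQLite directly. Below is a small sketch using Python's sqlite3 module; the table name `showings` and the 24-hour time strings are assumptions for the example, not the poster's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; 24-hour text times so that <= compares in clock order.
conn.execute("CREATE TABLE showings (theatre TEXT, show TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?, ?, ?)",
    [
        ("1A", "ExpertZilla", "10:00"), ("1A", "ExpertZilla", "11:00"),
        ("1A", "ExpertZilla", "12:00"), ("1A", "ExpertZilla", "14:00"),
        ("1A", "ExpertZilla", "15:00"),
        ("1A", "SQLHorrors", "13:00"), ("1A", "SQLHorrors", "22:00"),
    ],
)

# rn = number of rows in the same (theatre, show) group at or before this time.
TOP_N_SQL = """
    SELECT theatre, show, time
      FROM (SELECT theatre, show, time,
                   (SELECT COUNT(*)
                      FROM showings AS t1
                     WHERE t1.theatre = t2.theatre
                       AND t1.show = t2.show
                       AND t1.time <= t2.time) AS rn
              FROM showings AS t2) AS t3
     WHERE rn <= ?
     ORDER BY theatre, show, time
"""

top3 = conn.execute(TOP_N_SQL, (3,)).fetchall()
for row in top3:
    print(row)
```

One caveat with this emulation: rows in the same group that share an identical time all receive the same count, so ties can return fewer than N rows for that group.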

 
LVL 1

Author Comment

by:edbored
I oversimplified...  I think the second form may work, but the performance is dismal.

There are actually a number of tables involved.  Here's the actual code (a train schedule, not theatres, but it was easier to describe as theatres earlier!)

I created a view around a single stop_id (station).

Performance isn't great - see anything I could do differently?

BTW - thanks for very quick response!

create view if not exists rlist  as 
SELECT          stoptimes.stop_id, 
                routes.route_id,
                Routes.route_long_name,
                Routes.route_short_name,
                Trips.trip_headsign,
                Stoptimes.departure_time
           FROM Trips
                JOIN StopTimes
                  ON Trips.trip_id = StopTimes.trip_id
                JOIN Routes
                  ON Routes.route_id = Trips.route_id
          WHERE StopTimes.stop_id = '15930'
               -- AND
               -- StopTimes.departure_time >= time( 'now', 'localtime', '-550 minutes' )
               -- AND
               -- StopTimes.departure_time <= time( 'now', 'localtime', '-460 minutes' )
          ORDER BY StopTimes.departure_time ;

--select * from rlist;

select stop_id, route_id, route_long_name, route_short_name, trip_headsign, departure_time
  from (
select stop_id, route_id, route_long_name, route_short_name, trip_headsign, departure_time,
       (select count(*) from rlist as t1 
         where t1.route_id=t2.route_id 
         and   t1.route_long_name=t2.route_long_name 
         and   t1.route_short_name=t2.route_short_name
         and   t1.trip_headsign=t2.trip_headsign
         and   t1.departure_time<=t2.departure_time) as rn
  from rlist as t2) as t3
 where rn <= 3
 order by departure_time
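
One thing that may be worth trying, offered only as a hedged sketch: because the correlated subquery reads the view once per row, copying the view into a temporary table and indexing the compared columns might reduce the work. Everything named here is an assumption: `departures` is a stand-in for the real three-table join, `rlist_tmp` and `rlist_tmp_idx` are made-up names, and the route name columns are left out on the assumption that route_id determines them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the Trips/StopTimes/Routes join.
    CREATE TABLE departures (stop_id TEXT, route_id TEXT,
                             trip_headsign TEXT, departure_time TEXT);
    INSERT INTO departures VALUES
        ('15930', 'R1', 'Downtown', '08:00:00'),
        ('15930', 'R1', 'Downtown', '08:30:00'),
        ('15930', 'R1', 'Downtown', '09:00:00'),
        ('15930', 'R1', 'Downtown', '09:30:00'),
        ('15930', 'R2', 'Airport',  '08:15:00'),
        ('15930', 'R2', 'Airport',  '10:15:00');
    CREATE VIEW rlist AS SELECT * FROM departures WHERE stop_id = '15930';

    -- Materialise the view once, then index the columns the subquery compares.
    CREATE TEMP TABLE rlist_tmp AS SELECT * FROM rlist;
    CREATE INDEX rlist_tmp_idx
        ON rlist_tmp (route_id, trip_headsign, departure_time);
""")

# Same counting technique as before, but against the indexed temp table.
sql = """
    SELECT route_id, trip_headsign, departure_time
      FROM (SELECT t2.*,
                   (SELECT COUNT(*)
                      FROM rlist_tmp AS t1
                     WHERE t1.route_id = t2.route_id
                       AND t1.trip_headsign = t2.trip_headsign
                       AND t1.departure_time <= t2.departure_time) AS rn
              FROM rlist_tmp AS t2) AS t3
     WHERE rn <= 3
     ORDER BY departure_time
"""
next_departures = conn.execute(sql).fetchall()
for row in next_departures:
    print(row)
```

Whether this actually helps depends on the data; running EXPLAIN QUERY PLAN on the query is one way to check that the index is being used.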

 
LVL 8

Expert Comment

by:raulggonzalez
Hi,

In SQLite you have the LIMIT clause available, the same as in MySQL.

Have you tried it? You wouldn't need the workaround with the count, and performance should improve.

http://www.sqlite.org/lang_select.html

cheers
 
LVL 1

Author Comment

by:edbored
Well, working with the same 'create view' as previous sample, I tried this:

select stop_id, 
       route_id, 
       route_long_name, 
       route_short_name, 
       trip_headsign, 
       departure_time
from (  select stop_id, 
               route_id, 
               route_long_name, 
               route_short_name, 
               trip_headsign, 
               departure_time
        from rlist as t1 
        where t1.route_id=t2.route_id 
        and   t1.route_long_name=t2.route_long_name 
        and   t1.route_short_name=t2.route_short_name
        and   t1.trip_headsign=t2.trip_headsign
        and   t1.departure_time<=t2.departure_time
        limit 3 
      ) as t2
 order by departure_time

I can't seem to figure out how to properly alias the main select in order to have T2.xxx recognized.


Thanks again...

EdB
 
LVL 8

Expert Comment

by:raulggonzalez
I don't get why you want to do it in 2 steps when, I guess, you can combine ORDER BY and LIMIT in the same query and get the same result.

Have you tried this?


select stop_id,
               route_id,
               route_long_name,
               route_short_name,
               trip_headsign,
               departure_time
        from rlist as t1
        where t1.route_id=t2.route_id
        and   t1.route_long_name=t2.route_long_name
        and   t1.route_short_name=t2.route_short_name
        and   t1.trip_headsign=t2.trip_headsign
        and   t1.departure_time<=t2.departure_time
order by departure_time limit 3
         


http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/

Cheers
 
LVL 1

Author Comment

by:edbored
That was one of the first things I tried (without the VIEW).  

It only returns 3 records.

I'm trying to return a max of n (in this case 3) records for each stop_id.

In the simpler example of the theatre, I want to see a max of 3 records per show for a particular theatre.

That is, for each theatre - the next three showings of each film.

For train stations, the next three trains to arrive at a station (regardless of final destination).

Thx.

EdB
 
LVL 40

Expert Comment

by:Sharath
>> In the simpler example of the theatre, I want to see a max of 3 records per show for a particular theatre.

There is no other way unless you generate a row number as I mentioned.
 
LVL 3

Assisted Solution

by:paulwquinn
paulwquinn earned 250 total points
The first issue is the format you are using to store your showtimes: SQLite doesn't have a specific storage class for dates and/or times. They're normally stored as TEXT, REAL or INTEGER values. I assume you are using TEXT. SQLite date and time functions don't use/recognize 12-hour time strings with AM/PM, so if by "TOP" times you mean the first/next three times that will occur, it's a problem. "02:00 pm" will be listed before "10:00 am" if you order on the time column in a query. Similarly '12:00 pm' will be listed after every other time, including '02:00 pm', '03:00 pm', etc. You'll either have to extend SQLite with your own custom function or order the results elsewhere in your application.
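
To illustrate the ordering problem and the "custom function" route, here is a minimal sketch using Python's sqlite3 module, which lets an application register its own SQL function with `create_function`. The function name `to_24h` and the table `showings` are hypothetical, and the parser only handles strings shaped like "02:00 pm".

```python
import sqlite3

def to_24h(text):
    """Convert a 12-hour string such as '02:00 pm' to 24-hour '14:00'."""
    clock, meridiem = text.strip().lower().split()
    hour, minute = (int(part) for part in clock.split(":"))
    # 12 am becomes 0 and 12 pm stays 12; other pm hours gain 12.
    hour = hour % 12 + (12 if meridiem == "pm" else 0)
    return f"{hour:02d}:{minute:02d}"

conn = sqlite3.connect(":memory:")
conn.create_function("to_24h", 1, to_24h)  # expose the helper to SQL
conn.execute("CREATE TABLE showings (time TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?)",
    [("10:00 am",), ("02:00 pm",), ("12:00 pm",)],
)

# Plain text ordering: compares the strings character by character.
text_order = [r[0] for r in conn.execute(
    "SELECT time FROM showings ORDER BY time")]
# Ordering through the custom function: compares the 24-hour form.
clock_order = [r[0] for r in conn.execute(
    "SELECT time FROM showings ORDER BY to_24h(time)")]

print(text_order)
print(clock_order)
```

Storing the times in 24-hour form in the first place would avoid the need for the function altogether.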

Assuming you can ignore the above problem (a BIG and probably erroneous assumption... :^) ), we turn to the problem at hand. Unfortunately, a LIMIT clause applies to an entire SELECT statement; in a compound SELECT it can appear only at the very end and limits the combined result. This means it can't be combined with a GROUP BY clause to limit the number of rows returned for each theatre/show pair. Nor can it be attached directly to the individual SELECTs of a compound SELECT built with the UNION operator; each one would first have to be wrapped in its own subquery.

One possible alternative to achieve the result you're looking for is to use SQL to create SQL and build a series of simple SELECTs that you can run and then concatenate the result-sets together yourself elsewhere, e.g. in your application.
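
A rough sketch of that idea, written here with Python's sqlite3 module rather than taken from the attached script: one query builds a SELECT ... LIMIT 3 statement per theatre/show pair (using SQLite's `quote()` function to escape the values), and the application runs each statement and concatenates the rows. The table name `showings` and the 24-hour times are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with 24-hour text times.
conn.execute("CREATE TABLE showings (theatre TEXT, show TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?, ?, ?)",
    [
        ("1A", "ExpertZilla", "10:00"), ("1A", "ExpertZilla", "11:00"),
        ("1A", "ExpertZilla", "12:00"), ("1A", "ExpertZilla", "14:00"),
        ("1A", "SQLHorrors", "13:00"), ("1A", "SQLHorrors", "22:00"),
    ],
)

# SQL that generates SQL: one simple SELECT per distinct theatre/show pair.
generator_sql = """
    SELECT 'SELECT theatre, show, time FROM showings'
           || ' WHERE theatre = ' || quote(theatre)
           || ' AND show = ' || quote(show)
           || ' ORDER BY time LIMIT 3'
      FROM (SELECT DISTINCT theatre, show FROM showings)
     ORDER BY theatre, show
"""

listings = []
for (statement,) in conn.execute(generator_sql).fetchall():
    # Run each generated statement and append its rows to the combined result.
    listings.extend(conn.execute(statement).fetchall())

for row in listings:
    print(row)
```

This issues one query per pair, so it is likely to suit a modest number of theatre/show combinations better than a very large one.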

For example, I've attached a SQLite SQL script (get3listings.sql) that you can run to get the type of output you want in a file called '3listings.txt'. You'll have to customize the table and column names for your schema. You can then post-process (e.g. read) the file in your application. Alternatively, you can use something like the SQLite C/C++ Interface to do the same thing entirely inside your application code, i.e. no files required. If you're handling everything inside your own code, you could, of course, simply read in the entire table in the desired order (ORDER BY theatre,show), then skip the records you don't want in your code.
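
The last option, reading everything in order and skipping the unwanted records in code, might look like the sketch below. It uses Python's sqlite3 module and itertools; the table `showings`, the 24-hour times and the helper `first_n_per_group` are assumptions for the example. Note that the sort here also includes the time column, so that the rows kept are the earliest ones in each group.

```python
import sqlite3
from itertools import groupby, islice

conn = sqlite3.connect(":memory:")
# Hypothetical table with 24-hour text times.
conn.execute("CREATE TABLE showings (theatre TEXT, show TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO showings VALUES (?, ?, ?)",
    [
        ("1A", "ExpertZilla", "14:00"), ("1A", "ExpertZilla", "10:00"),
        ("1A", "ExpertZilla", "12:00"), ("1A", "ExpertZilla", "11:00"),
        ("1A", "SQLHorrors", "22:00"), ("1A", "SQLHorrors", "13:00"),
    ],
)

def first_n_per_group(rows, n):
    """Yield at most n rows for each consecutive (theatre, show) group."""
    for _key, group in groupby(rows, key=lambda row: (row[0], row[1])):
        # islice takes the first n rows; groupby skips the rest of the group.
        yield from islice(group, n)

cursor = conn.execute(
    "SELECT theatre, show, time FROM showings ORDER BY theatre, show, time"
)
kept = list(first_n_per_group(cursor, 3))
for row in kept:
    print(row)
```

The query itself stays a single ordered scan; the trade-off is that every row is transferred to the application, including those that are discarded.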

 get3listings.sql
 
LVL 1

Author Closing Comment

by:edbored
Split points - first (Sharath) worked, but dismal performance (not the poster's fault though).

Second (paulwquinn) would probably work quite nicely, but not practical in this particular implementation.

Thanks to both.