edbored

How to limit distinct or grouped results

Using SQLite, I'm trying to limit a result set to an arbitrary number of rows per group. That is, if I have something like:

Theatre   Show               Time
1A            ExpertZilla    10:00 am
1A            ExpertZilla    11:00 am
1A            ExpertZilla    12:00 pm
1A            SQLHorrors     01:00 pm
1A            ExpertZilla    02:00 pm
1A            ExpertZilla    03:00 pm
1A            ExchangeThra   04:00 pm
1A            ExchangeThra   05:00 pm
1A            ExchangeThra   06:00 pm
1A            ExchangeThra   07:00 pm
1A            ExchangeThra   08:00 pm
1A            ExchangeThra   09:00 pm
1A            SQLHorrors     10:00 pm
1A            SQLHorrors     12:00 pm

What I'm looking for is: for any given theatre, for any given show, the top N times.
(I'll add 'start time' to my query later, for simplicity just assume the first N).

So, if N=3, I want to see:
1A            ExpertZilla    10:00 am
1A            ExpertZilla    11:00 am
1A            ExpertZilla    12:00 pm
1A            SQLHorrors     10:00 pm
1A            SQLHorrors     12:00 pm
1A            SQLHorrors     01:00 pm
1A            ExchangeThra   04:00 pm
1A            ExchangeThra   05:00 pm
1A            ExchangeThra   06:00 pm

I was thinking of a limit on a sub-select, but can't quite work out how to get it to do what I need.

TIA.

EdB
Sharath S

If you are working in SQL Server 2005 or later, you can use ROW_NUMBER.
select Theatre, Show, Time
  from (select Theatre, Show, Time,
               row_number() over (partition by Theatre, Show order by Time) as rn
          from your_table) t1
 where rn <= 3
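
For readers on SQLite: window functions, including row_number(), are available from SQLite 3.25.0 onward, so the same pattern may work there too. Below is a minimal sketch using Python's built-in sqlite3 module. The table name showtimes, the sample rows, and the helper top_n_per_show are hypothetical, and times are stored as 24-hour 'HH:MM' text on the assumption that they need to sort correctly as strings.

```python
import sqlite3

# Hypothetical sample data; times are 24-hour 'HH:MM' text so that
# plain string ordering matches chronological ordering.
ROWS = [
    ("1A", "ExpertZilla", "10:00"),
    ("1A", "ExpertZilla", "11:00"),
    ("1A", "ExpertZilla", "12:00"),
    ("1A", "SQLHorrors", "13:00"),
    ("1A", "ExpertZilla", "14:00"),
    ("1A", "ExchangeThra", "16:00"),
    ("1A", "ExchangeThra", "17:00"),
    ("1A", "ExchangeThra", "18:00"),
    ("1A", "ExchangeThra", "19:00"),
    ("1A", "SQLHorrors", "22:00"),
]


def top_n_per_show(n):
    """Return at most n earliest times per (theatre, show)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE showtimes (theatre TEXT, show TEXT, time TEXT)")
    conn.executemany("INSERT INTO showtimes VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        """
        SELECT theatre, show, time
          FROM (SELECT theatre, show, time,
                       row_number() OVER (PARTITION BY theatre, show
                                          ORDER BY time) AS rn
                  FROM showtimes) AS t1
         WHERE rn <= ?
         ORDER BY show, time
        """,
        (n,),
    ).fetchall()
    conn.close()
    return rows


result = top_n_per_show(3)
for row in result:
    print(row)
```

With an older SQLite library the OVER clause is a syntax error, in which case a correlated subquery is the usual fallback.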


ASKER CERTIFIED SOLUTION
Sharath S
edbored

ASKER

I oversimplified...  I think the second form may work, but the performance is dismal.

There are actually a number of tables involved.  Here's the actual code (it's a train schedule, not theatres, but it was easier to describe as theatres earlier!)

I created a view around a single stop_id (station).

Performance isn't great - do you see anything I could do differently?

BTW - thanks for very quick response!

create view if not exists rlist as
SELECT StopTimes.stop_id,
       Routes.route_id,
       Routes.route_long_name,
       Routes.route_short_name,
       Trips.trip_headsign,
       StopTimes.departure_time
  FROM Trips
       JOIN StopTimes
         ON Trips.trip_id = StopTimes.trip_id
       JOIN Routes
         ON Routes.route_id = Trips.route_id
 WHERE StopTimes.stop_id = '15930'
       -- AND
       -- StopTimes.departure_time >= time( 'now', 'localtime', '-550 minutes' )
       -- AND
       -- StopTimes.departure_time <= time( 'now', 'localtime', '-460 minutes' )
 ORDER BY StopTimes.departure_time;

--select * from rlist;

select stop_id, route_id, route_long_name, route_short_name, trip_headsign, departure_time
  from (
select stop_id, route_id, route_long_name, route_short_name, trip_headsign, departure_time,
       (select count(*) from rlist as t1 
         where t1.route_id=t2.route_id 
         and   t1.route_long_name=t2.route_long_name 
         and   t1.route_short_name=t2.route_short_name
         and   t1.trip_headsign=t2.trip_headsign
         and   t1.departure_time<=t2.departure_time) as rn
  from rlist as t2) as t3
 where rn <= 3
 order by departure_time
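
One thing that might be worth trying, offered as an untested idea rather than a measured fix: the correlated count re-reads the view (and therefore the three-table join) for every row, so copying the view's rows into a plain table and indexing the grouping columns could make each count a cheap index lookup. The sketch below uses Python's built-in sqlite3 module; stop_times_demo, rlist_demo, rlist_snapshot and the sample rows are all hypothetical stand-ins for the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the joined schedule tables.
conn.execute(
    "CREATE TABLE stop_times_demo "
    "(stop_id TEXT, route_id TEXT, trip_headsign TEXT, departure_time TEXT)"
)
conn.executemany(
    "INSERT INTO stop_times_demo VALUES (?, ?, ?, ?)",
    [
        ("15930", "R1", "Downtown", "08:00:00"),
        ("15930", "R1", "Downtown", "08:15:00"),
        ("15930", "R1", "Downtown", "08:30:00"),
        ("15930", "R1", "Downtown", "08:45:00"),
        ("15930", "R2", "Airport", "08:05:00"),
        ("15930", "R2", "Airport", "09:05:00"),
        ("99999", "R1", "Downtown", "07:00:00"),  # other stop, filtered out
    ],
)
conn.execute(
    "CREATE VIEW rlist_demo AS "
    "SELECT route_id, trip_headsign, departure_time "
    "FROM stop_times_demo WHERE stop_id = '15930'"
)

# Materialize the view once, then index the columns the subquery compares.
conn.execute("CREATE TABLE rlist_snapshot AS SELECT * FROM rlist_demo")
conn.execute(
    "CREATE INDEX rlist_snapshot_grp "
    "ON rlist_snapshot (route_id, trip_headsign, departure_time)"
)

# Same ranking idea as the query above: count rows in the same group
# that depart no later than the current row, and keep ranks 1 to 3.
next_three = conn.execute(
    """
    SELECT route_id, trip_headsign, departure_time
      FROM rlist_snapshot AS t2
     WHERE (SELECT count(*)
              FROM rlist_snapshot AS t1
             WHERE t1.route_id = t2.route_id
               AND t1.trip_headsign = t2.trip_headsign
               AND t1.departure_time <= t2.departure_time) <= 3
     ORDER BY departure_time
    """
).fetchall()
conn.close()

for row in next_three:
    print(row)
```

Note that counting with <= treats rows with identical departure times as tied, so a group with duplicates can return fewer than three rows.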


Hi,

In SQLite you have the LIMIT clause available, the same as in MySQL.

Have you tried it? You wouldn't need the workaround with the count and all that, and the performance will improve.

http://www.sqlite.org/lang_select.html

cheers
edbored

ASKER

Well, working with the same 'create view' as previous sample, I tried this:

select stop_id, 
       route_id, 
       route_long_name, 
       route_short_name, 
       trip_headsign, 
       departure_time
from (  select stop_id, 
               route_id, 
               route_long_name, 
               route_short_name, 
               trip_headsign, 
               departure_time
        from rlist as t1 
        where t1.route_id=t2.route_id 
        and   t1.route_long_name=t2.route_long_name 
        and   t1.route_short_name=t2.route_short_name
        and   t1.trip_headsign=t2.trip_headsign
        and   t1.departure_time<=t2.departure_time
        limit 3 
      ) as t2
 order by departure_time



I can't seem to figure out how to properly alias the main select in order to have T2.xxx recognized.


Thanks again...

EdB
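
On the alias problem in the query above: t2 names the derived table itself, so it cannot be referenced from inside that derived table. One possible way around it is to move the limited sub-select into the WHERE clause, where it can refer to the outer row. The following is a minimal sketch with Python's built-in sqlite3 module, assuming a hypothetical showtimes table with 24-hour 'HH:MM' text times; it has not been tried against the real rlist view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE showtimes (theatre TEXT, show TEXT, time TEXT)")
conn.executemany(
    "INSERT INTO showtimes VALUES (?, ?, ?)",
    [
        # Hypothetical rows for illustration only.
        ("1A", "ExpertZilla", "10:00"),
        ("1A", "ExpertZilla", "11:00"),
        ("1A", "ExpertZilla", "12:00"),
        ("1A", "SQLHorrors", "13:00"),
        ("1A", "ExpertZilla", "14:00"),
        ("1A", "ExpertZilla", "15:00"),
        ("1A", "SQLHorrors", "22:00"),
    ],
)

# The sub-select sits in WHERE, so it may reference the outer alias t2.
# For each outer row it lists the three earliest times of that row's
# group; the row is kept only if its own time is in that list.
limited = conn.execute(
    """
    SELECT theatre, show, time
      FROM showtimes AS t2
     WHERE t2.time IN (SELECT t1.time
                         FROM showtimes AS t1
                        WHERE t1.theatre = t2.theatre
                          AND t1.show = t2.show
                        ORDER BY t1.time
                        LIMIT 3)
     ORDER BY show, time
    """
).fetchall()
conn.close()

for row in limited:
    print(row)
```

Because the match is on the time value, rows in the same group that share a time are kept or dropped together. The sub-select still runs once per outer row, so this form is not necessarily faster than the count-based one.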
I don't get why you want to do it in two steps when I guess you can combine ORDER BY and LIMIT in the same query and get the same result.

Have you tried this?


select stop_id,
       route_id,
       route_long_name,
       route_short_name,
       trip_headsign,
       departure_time
  from rlist
 order by departure_time
 limit 3
         


http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/

Cheers
edbored

ASKER

That was one of the first things I tried (without the VIEW).  

It only returns 3 records.

I'm trying to return a max of n (in this case 3) records for each stop_id.

In the simpler example of the theatre, I want to see a max of 3 records per show for a particular theatre.

That is, for each theatre - the next three showings of each film.

For train stations, the next three trains to arrive at a station (regardless of final destination).

Thx.

EdB
>> In the simpler example of the theatre, I want to see a max of 3 records per show for a particular theatre.

There is no other way unless you generate a row number, as I mentioned.
SOLUTION
edbored

ASKER

Split points - first (Sharath) worked, but dismal performance (not the poster's fault though).

Second (paulwquinn) would probably work quite nicely, but not practical in this particular implementation.

Thanks to both.