Link to home
Start Free TrialLog in
Avatar of plq
plqFlag for United Kingdom of Great Britain and Northern Ireland

asked on

"Recordset" Paging in Sql Server

In a three tiered app we use the middle tier to implement recordset paging using standard ADO recordset positioning.

Each user can see 200 rows at a time, and they have First, Next, Prev, Last buttons to move to the next set of rows.

The problem is that if we select 1,000,000 rows the whole lot comes down to the application server.

Therefore I want to implement paging on the sql server instead. I don't mind running the query again, its the bandwidth on the resultset thats the problem.

I know we can do

      select top 200 ......

but how can we do

      select mid(startrow, numberofrows) .....

SOLUTION
Avatar of solution46
solution46

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of miron
slightly more robust would be to use a common primary key in conjunction with temptable, identity, and clustered index. Once the rows with the desired primary key values selected, narrow down the needed section by selecting the page interval as suggested above.

The trouble is for 1,000,000 records and an integer PK you will waste 8MB of precious contiguous pages in tempdb, and of you have 50 users it will translate in 1/2 gigabyte of this space *per request*. You can reduce the usage by half by adding identity as a common primary key, not sure if this is an option. What more can be said. A more robust server based solution would be off line. Let me know if you are interested.

As an aside, you should not be using client side index, as implied in the initial question. Switch to the

1 keyset cursor type
2 use recordset to position on the record needed

But, if it escaped your eye - this will entail, albeit behind the scenes, the very same work I just described above. Yes, it will be slightly more robust due to internal routines and no catalog used to mark the used pages.. the point is this is sub - optimal.

-- cheers.
Avatar of plq

ASKER

Thanks

I thought about the temp table thing and its just not going to scale well. I really something like oracle's "rownum" concept where you can simply say

    select * from (select ... complex sql statement ....) where rownum between 201 and 400

the queries are all adhoc, very complex sql (many joins) and can be sorted by any field so relying on existing ids can't work.

The recordsets are all aduseclient with adopenforwardonly

I had one idea which is very simple and actually sounding rather good:

    If a user wants records 601 to 800, then I only need to select the top 800, so I can use the top predicate for that.

The means that most of the queries will scale well even though I'm selecting 800 rows instead of 200, thats much better than selecting 1,000,000.

Avatar of BillAn1
BillAn1

You can do it with 2 TOPs -

SEELCT * FROM (
SELECT TOP 100 * FROM
(SELECT TOP 400 * FROM MyTable ORDER BY SortCol)
ORDER BY SortCol DESC
) ORDER BY SortCol

this will get rows 1-400, then sort these in descending order, return the top 100 of these, i.e. rows 400-300, then resort them back again as 300-400

ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of plq

ASKER

That syntax isn't supported in sql ??
Yes it is - I just ran it OK- what exact SQL are you running? What error are you getting?
Avatar of plq

ASKER

Oops sorry you are right. Small mistake on my part.

So moving on... the problem with your select will be that the sortcol(s) always have to include a value unique to each record, otherwise there's no guarantee you'll get the right 200 rows if rows 1 to 200 have the same value in that col as rows 201 to 400. I might be able to work around that by adding a pk at the end of the order by clause...
yep, if you want to return to a table and get the same data (in the same order) you will need ot have some column(s) to uniquely sort the data, that's true.
well,

the drawback of top predicate is it does limit the "visible" part of the data selected.
In terms of scaling this is worse then temp table. adUseClient should not be a trouble as long as you set CusorType  to adOpenKeyset.

-- cheers
Avatar of plq

ASKER

>> the drawback of top predicate is it does limit the "visible" part of the data selected.

I didn't understand want you meant there. Do you mean that you can no longer see the total recordcount of the query ? I.e. you cant now say
          "20000 records selected, rows 201 to 400 are shown"
? That could be a problem for smaller installations who want to see the full range selected. For 10,000+ record installations they would prefer the scalability I think.

>> adOpenKeyset

Are you saying that adOpenKeyset is lower impact that adOpenForwardOnly ?

                rs = xcn.OpenRecordset(sSql, interop.adodb.tlb.CursorTypeEnum.adOpenForwardOnly)
                rs.PageSize = lPageSize
                rs.AbsolutePage = lPage
                For lRecord = 1 To rs.PageSize
                      ... data to xml

maybe my assertion that 1,000,000 rows are coming back to the server is wrong ?

>> Scalability.

I think you are right for the following reason. When you do that three level select scenario sql will have to write two temporary tables, one for each inner select, as part of the query plan. Although, at least the temp tables will only exist while a single query plan is executing.


I am erring towards my own idea of just taking the upper row number and using "top" for that. Most people will only look at the first few pages in this app. The fact that it is so simple, and requires little retesting, count for a lot here. The problem is not being able to present the total number of rows selected to the end user (unless I run the query twice, once for a count and again for the data).

You can return them both the count easily...

either add an Output parameter @RowCount or use a return value. Then add the following code to the bottom of your SQL sproc...

SELECT @RowCount = Count(*) FROM whatever...             This will set the output parameter, or you could then RETURN @RowCount if you want to use a return value.

A third option is to just add the following code at the end...
SELECT Count(*) FROM Whatever... and have two recordsets being passed back. There is a method (.NextRecordset or something similar) on the connection object that you can use to move to the next recordset.

Regards,

s46.
Thinking about it, you could just return the top 1000, and then add a message to the user..

'more than 1,000 rows were returned by your query. Please refine your criteria and try again'

or comething similar. As has been mentioned here already (and I think you confirmed it in your last post) nobody is actually going to read through the majority of the returned data, so why bother chewing up bandwidth displaying it?

s46.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of plq

ASKER

I'm actually going to stick with using TOP and let the mid tier take off the first lot of columns. By selecting the top 801 when users require rows 601 to 800 I can tell whether or not to show a "Next >>" button in the client.

I've accepted the answer that I thought would be most useful to other developers needing this.

thanks for all your help