Solved

"Recordset" Paging in Sql Server

Posted on 2004-09-17
15
981 Views
Last Modified: 2012-08-13
In a three tiered app we use the middle tier to implement recordset paging using standard ADO recordset positioning.

Each user can see 200 rows at a time, and they have First, Next, Prev, Last buttons to move to the next set of rows.

The problem is that if we select 1,000,000 rows the whole lot comes down to the application server.

Therefore I want to implement paging on the sql server instead. I don't mind running the query again, its the bandwidth on the resultset thats the problem.

I know we can do

      select top 200 ......

but how can we do

      select mid(startrow, numberofrows) .....

0
Comment
Question by:plq
  • 5
  • 4
  • 3
  • +1
15 Comments
 
LVL 9

Assisted Solution

by:solution46
solution46 earned 100 total points
ID: 12090779
I've not come across an easy way of doing this yet...

My best suggestion would be...
1. add a '@startrow' and '@endrow' parameter to the sproc
2. build a temp table to hold the data you want to output
3. open a cursor of the full rscordset and generate a counter.
4. loop through the recordset until counter = @startrow
5. loop through the recordset until counter = @endrow, this tim einserting each row into the temp table
6. stop the cursor
7. select * from the temp table.

This approach has two of my pet hates (cursors and temp tables) and I'm sure there has to be a neater method. However, if nobody posts a neater solution, this will at least work.

One thing you need to watch out for when paging like this: each 'select page' is entirely separate from the others and additions to the 'top' of the resultset could result in the same data being displayed on consecutive pages if the user moes through them in order.

Hope this helps,

s46.
0
 
LVL 9

Expert Comment

by:miron
ID: 12090855
slightly more robust would be to use a common primary key in conjunction with temptable, identity, and clustered index. Once the rows with the desired primary key values selected, narrow down the needed section by selecting the page interval as suggested above.

The trouble is for 1,000,000 records and an integer PK you will waste 8MB of precious contiguous pages in tempdb, and of you have 50 users it will translate in 1/2 gigabyte of this space *per request*. You can reduce the usage by half by adding identity as a common primary key, not sure if this is an option. What more can be said. A more robust server based solution would be off line. Let me know if you are interested.

As an aside, you should not be using client side index, as implied in the initial question. Switch to the

1 keyset cursor type
2 use recordset to position on the record needed

But, if it escaped your eye - this will entail, albeit behind the scenes, the very same work I just described above. Yes, it will be slightly more robust due to internal routines and no catalog used to mark the used pages.. the point is this is sub - optimal.

-- cheers.
0
 
LVL 8

Author Comment

by:plq
ID: 12091010
Thanks

I thought about the temp table thing and its just not going to scale well. I really something like oracle's "rownum" concept where you can simply say

    select * from (select ... complex sql statement ....) where rownum between 201 and 400

the queries are all adhoc, very complex sql (many joins) and can be sorted by any field so relying on existing ids can't work.

The recordsets are all aduseclient with adopenforwardonly

I had one idea which is very simple and actually sounding rather good:

    If a user wants records 601 to 800, then I only need to select the top 800, so I can use the top predicate for that.

The means that most of the queries will scale well even though I'm selecting 800 rows instead of 200, thats much better than selecting 1,000,000.

0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091073
You can do it with 2 TOPs -

SEELCT * FROM (
SELECT TOP 100 * FROM
(SELECT TOP 400 * FROM MyTable ORDER BY SortCol)
ORDER BY SortCol DESC
) ORDER BY SortCol

this will get rows 1-400, then sort these in descending order, return the top 100 of these, i.e. rows 400-300, then resort them back again as 300-400

0
 
LVL 17

Accepted Solution

by:
BillAn1 earned 200 total points
ID: 12091078
sorry, typo, plus you need to give aliases :

SELECT * FROM (
SELECT TOP 100 * FROM
(SELECT TOP 400 * FROM MyTableSrt ORDER BY SortCol) a1
ORDER BY SortCol DESC
) a2 ORDER BY SortCol
0
 
LVL 8

Author Comment

by:plq
ID: 12091153
That syntax isn't supported in sql ??
0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091197
Yes it is - I just ran it OK- what exact SQL are you running? What error are you getting?
0
What Is Threat Intelligence?

Threat intelligence is often discussed, but rarely understood. Starting with a precise definition, along with clear business goals, is essential.

 
LVL 8

Author Comment

by:plq
ID: 12091233
Oops sorry you are right. Small mistake on my part.

So moving on... the problem with your select will be that the sortcol(s) always have to include a value unique to each record, otherwise there's no guarantee you'll get the right 200 rows if rows 1 to 200 have the same value in that col as rows 201 to 400. I might be able to work around that by adding a pk at the end of the order by clause...
0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091405
yep, if you want to return to a table and get the same data (in the same order) you will need ot have some column(s) to uniquely sort the data, that's true.
0
 
LVL 9

Expert Comment

by:miron
ID: 12092208
well,

the drawback of top predicate is it does limit the "visible" part of the data selected.
In terms of scaling this is worse then temp table. adUseClient should not be a trouble as long as you set CusorType  to adOpenKeyset.

-- cheers
0
 
LVL 8

Author Comment

by:plq
ID: 12094421
>> the drawback of top predicate is it does limit the "visible" part of the data selected.

I didn't understand want you meant there. Do you mean that you can no longer see the total recordcount of the query ? I.e. you cant now say
          "20000 records selected, rows 201 to 400 are shown"
? That could be a problem for smaller installations who want to see the full range selected. For 10,000+ record installations they would prefer the scalability I think.

>> adOpenKeyset

Are you saying that adOpenKeyset is lower impact that adOpenForwardOnly ?

                rs = xcn.OpenRecordset(sSql, interop.adodb.tlb.CursorTypeEnum.adOpenForwardOnly)
                rs.PageSize = lPageSize
                rs.AbsolutePage = lPage
                For lRecord = 1 To rs.PageSize
                      ... data to xml

maybe my assertion that 1,000,000 rows are coming back to the server is wrong ?

>> Scalability.

I think you are right for the following reason. When you do that three level select scenario sql will have to write two temporary tables, one for each inner select, as part of the query plan. Although, at least the temp tables will only exist while a single query plan is executing.


I am erring towards my own idea of just taking the upper row number and using "top" for that. Most people will only look at the first few pages in this app. The fact that it is so simple, and requires little retesting, count for a lot here. The problem is not being able to present the total number of rows selected to the end user (unless I run the query twice, once for a count and again for the data).

0
 
LVL 9

Expert Comment

by:solution46
ID: 12094795
You can return them both the count easily...

either add an Output parameter @RowCount or use a return value. Then add the following code to the bottom of your SQL sproc...

SELECT @RowCount = Count(*) FROM whatever...             This will set the output parameter, or you could then RETURN @RowCount if you want to use a return value.

A third option is to just add the following code at the end...
SELECT Count(*) FROM Whatever... and have two recordsets being passed back. There is a method (.NextRecordset or something similar) on the connection object that you can use to move to the next recordset.

Regards,

s46.
0
 
LVL 9

Expert Comment

by:solution46
ID: 12094808
Thinking about it, you could just return the top 1000, and then add a message to the user..

'more than 1,000 rows were returned by your query. Please refine your criteria and try again'

or comething similar. As has been mentioned here already (and I think you confirmed it in your last post) nobody is actually going to read through the majority of the returned data, so why bother chewing up bandwidth displaying it?

s46.
0
 
LVL 9

Assisted Solution

by:miron
miron earned 200 total points
ID: 12096376
>>>> "the drawback of top predicate is it does limit the "visible" part of the data selected.
>>>> In terms of scaling this is worse then temp table. "

>> I didn't understand want you meant there.


I think this is pretty clear. This is on how sql server actually implements the "top". The only way to "save" resources is to use a range select, using top only "hides" some othe rows selected by sql server.
0
 
LVL 8

Author Comment

by:plq
ID: 12105909
I'm actually going to stick with using TOP and let the mid tier take off the first lot of columns. By selecting the top 801 when users require rows 601 to 800 I can tell whether or not to show a "Next >>" button in the client.

I've accepted the answer that I thought would be most useful to other developers needing this.

thanks for all your help
0

Featured Post

Free Gift Card with Acronis Backup Purchase!

Backup any data in any location: local and remote systems, physical and virtual servers, private and public clouds, Macs and PCs, tablets and mobile devices, & more! For limited time only, buy any Acronis backup products and get a FREE Amazon/Best Buy gift card worth up to $200!

Join & Write a Comment

I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now