Solved

"Recordset" Paging in Sql Server

Posted on 2004-09-17
15
984 Views
Last Modified: 2012-08-13
In a three tiered app we use the middle tier to implement recordset paging using standard ADO recordset positioning.

Each user can see 200 rows at a time, and they have First, Next, Prev, Last buttons to move to the next set of rows.

The problem is that if we select 1,000,000 rows the whole lot comes down to the application server.

Therefore I want to implement paging on the sql server instead. I don't mind running the query again, its the bandwidth on the resultset thats the problem.

I know we can do

      select top 200 ......

but how can we do

      select mid(startrow, numberofrows) .....

0
Comment
Question by:plq
  • 5
  • 4
  • 3
  • +1
15 Comments
 
LVL 9

Assisted Solution

by:solution46
solution46 earned 100 total points
ID: 12090779
I've not come across an easy way of doing this yet...

My best suggestion would be...
1. add a '@startrow' and '@endrow' parameter to the sproc
2. build a temp table to hold the data you want to output
3. open a cursor of the full rscordset and generate a counter.
4. loop through the recordset until counter = @startrow
5. loop through the recordset until counter = @endrow, this tim einserting each row into the temp table
6. stop the cursor
7. select * from the temp table.

This approach has two of my pet hates (cursors and temp tables) and I'm sure there has to be a neater method. However, if nobody posts a neater solution, this will at least work.

One thing you need to watch out for when paging like this: each 'select page' is entirely separate from the others and additions to the 'top' of the resultset could result in the same data being displayed on consecutive pages if the user moes through them in order.

Hope this helps,

s46.
0
 
LVL 9

Expert Comment

by:miron
ID: 12090855
slightly more robust would be to use a common primary key in conjunction with temptable, identity, and clustered index. Once the rows with the desired primary key values selected, narrow down the needed section by selecting the page interval as suggested above.

The trouble is for 1,000,000 records and an integer PK you will waste 8MB of precious contiguous pages in tempdb, and of you have 50 users it will translate in 1/2 gigabyte of this space *per request*. You can reduce the usage by half by adding identity as a common primary key, not sure if this is an option. What more can be said. A more robust server based solution would be off line. Let me know if you are interested.

As an aside, you should not be using client side index, as implied in the initial question. Switch to the

1 keyset cursor type
2 use recordset to position on the record needed

But, if it escaped your eye - this will entail, albeit behind the scenes, the very same work I just described above. Yes, it will be slightly more robust due to internal routines and no catalog used to mark the used pages.. the point is this is sub - optimal.

-- cheers.
0
 
LVL 8

Author Comment

by:plq
ID: 12091010
Thanks

I thought about the temp table thing and its just not going to scale well. I really something like oracle's "rownum" concept where you can simply say

    select * from (select ... complex sql statement ....) where rownum between 201 and 400

the queries are all adhoc, very complex sql (many joins) and can be sorted by any field so relying on existing ids can't work.

The recordsets are all aduseclient with adopenforwardonly

I had one idea which is very simple and actually sounding rather good:

    If a user wants records 601 to 800, then I only need to select the top 800, so I can use the top predicate for that.

The means that most of the queries will scale well even though I'm selecting 800 rows instead of 200, thats much better than selecting 1,000,000.

0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091073
You can do it with 2 TOPs -

SEELCT * FROM (
SELECT TOP 100 * FROM
(SELECT TOP 400 * FROM MyTable ORDER BY SortCol)
ORDER BY SortCol DESC
) ORDER BY SortCol

this will get rows 1-400, then sort these in descending order, return the top 100 of these, i.e. rows 400-300, then resort them back again as 300-400

0
 
LVL 17

Accepted Solution

by:
BillAn1 earned 200 total points
ID: 12091078
sorry, typo, plus you need to give aliases :

SELECT * FROM (
SELECT TOP 100 * FROM
(SELECT TOP 400 * FROM MyTableSrt ORDER BY SortCol) a1
ORDER BY SortCol DESC
) a2 ORDER BY SortCol
0
 
LVL 8

Author Comment

by:plq
ID: 12091153
That syntax isn't supported in sql ??
0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091197
Yes it is - I just ran it OK- what exact SQL are you running? What error are you getting?
0
[Webinar] Disaster Recovery and Cloud Management

Learn from Unigma and CloudBerry industry veterans which providers are best for certain use cases and how to lower cloud costs, how to grow your Managed Services practice in IaaS clouds, and how to utilize public cloud for Disaster Recovery

 
LVL 8

Author Comment

by:plq
ID: 12091233
Oops sorry you are right. Small mistake on my part.

So moving on... the problem with your select will be that the sortcol(s) always have to include a value unique to each record, otherwise there's no guarantee you'll get the right 200 rows if rows 1 to 200 have the same value in that col as rows 201 to 400. I might be able to work around that by adding a pk at the end of the order by clause...
0
 
LVL 17

Expert Comment

by:BillAn1
ID: 12091405
yep, if you want to return to a table and get the same data (in the same order) you will need ot have some column(s) to uniquely sort the data, that's true.
0
 
LVL 9

Expert Comment

by:miron
ID: 12092208
well,

the drawback of top predicate is it does limit the "visible" part of the data selected.
In terms of scaling this is worse then temp table. adUseClient should not be a trouble as long as you set CusorType  to adOpenKeyset.

-- cheers
0
 
LVL 8

Author Comment

by:plq
ID: 12094421
>> the drawback of top predicate is it does limit the "visible" part of the data selected.

I didn't understand want you meant there. Do you mean that you can no longer see the total recordcount of the query ? I.e. you cant now say
          "20000 records selected, rows 201 to 400 are shown"
? That could be a problem for smaller installations who want to see the full range selected. For 10,000+ record installations they would prefer the scalability I think.

>> adOpenKeyset

Are you saying that adOpenKeyset is lower impact that adOpenForwardOnly ?

                rs = xcn.OpenRecordset(sSql, interop.adodb.tlb.CursorTypeEnum.adOpenForwardOnly)
                rs.PageSize = lPageSize
                rs.AbsolutePage = lPage
                For lRecord = 1 To rs.PageSize
                      ... data to xml

maybe my assertion that 1,000,000 rows are coming back to the server is wrong ?

>> Scalability.

I think you are right for the following reason. When you do that three level select scenario sql will have to write two temporary tables, one for each inner select, as part of the query plan. Although, at least the temp tables will only exist while a single query plan is executing.


I am erring towards my own idea of just taking the upper row number and using "top" for that. Most people will only look at the first few pages in this app. The fact that it is so simple, and requires little retesting, count for a lot here. The problem is not being able to present the total number of rows selected to the end user (unless I run the query twice, once for a count and again for the data).

0
 
LVL 9

Expert Comment

by:solution46
ID: 12094795
You can return them both the count easily...

either add an Output parameter @RowCount or use a return value. Then add the following code to the bottom of your SQL sproc...

SELECT @RowCount = Count(*) FROM whatever...             This will set the output parameter, or you could then RETURN @RowCount if you want to use a return value.

A third option is to just add the following code at the end...
SELECT Count(*) FROM Whatever... and have two recordsets being passed back. There is a method (.NextRecordset or something similar) on the connection object that you can use to move to the next recordset.

Regards,

s46.
0
 
LVL 9

Expert Comment

by:solution46
ID: 12094808
Thinking about it, you could just return the top 1000, and then add a message to the user..

'more than 1,000 rows were returned by your query. Please refine your criteria and try again'

or comething similar. As has been mentioned here already (and I think you confirmed it in your last post) nobody is actually going to read through the majority of the returned data, so why bother chewing up bandwidth displaying it?

s46.
0
 
LVL 9

Assisted Solution

by:miron
miron earned 200 total points
ID: 12096376
>>>> "the drawback of top predicate is it does limit the "visible" part of the data selected.
>>>> In terms of scaling this is worse then temp table. "

>> I didn't understand want you meant there.


I think this is pretty clear. This is on how sql server actually implements the "top". The only way to "save" resources is to use a range select, using top only "hides" some othe rows selected by sql server.
0
 
LVL 8

Author Comment

by:plq
ID: 12105909
I'm actually going to stick with using TOP and let the mid tier take off the first lot of columns. By selecting the top 801 when users require rows 601 to 800 I can tell whether or not to show a "Next >>" button in the client.

I've accepted the answer that I thought would be most useful to other developers needing this.

thanks for all your help
0

Featured Post

Ransomware-A Revenue Bonanza for Service Providers

Ransomware – malware that gets on your customers’ computers, encrypts their data, and extorts a hefty ransom for the decryption keys – is a surging new threat.  The purpose of this eBook is to educate the reader about ransomware attacks.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Backing up an SQL Transaction Log 11 42
Microsoft Azure SQL - create a read only user 2 14
MSSQL: Replace text (typo) 7 33
SSRS troubles 4 20
Nowadays, some of developer are too much worried about data. Who is using data, who is updating it etc. etc. Because, data is more costlier in term of money and information. So security of data is focusing concern in days. Lets' understand the Au…
Introduction SQL Server Integration Services can read XML files, that’s known by every BI developer.  (If you didn’t, don’t worry, I’m aiming this article at newcomers as well.) But how far can you go?  When does the XML Source component become …
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

920 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now