Solved

What are the performance implications of using derived tables?

Posted on 2004-10-09
10
948 Views
Last Modified: 2012-05-05
In the ASP.NET section of experts exchange, I have asked a question about how to get the (for example) 200th through 250th row of a result set in a MSSQL database.  The actual table is very large, and has duplicate key values, so I am struggling with how to page through it.  

I believe that temporary tables (create a table with an identity value) are too high overhead, because the base table has 4 million records.

The following query has been suggested.  I am not familiar with the DERIVED TABLE syntax, and I am not sure what the performance implications would be.  

Can you please comment on how efficient this query would be on a 4M row table, and suggest a better method if you know of one?

Here's the query:

SELECT * FROM(SELECT TOP 50 * FROM (SELECT TOP 250 * FROM (SELECT TOP 250 * FROM tablename ORDER BY field) DERIVEDTBL ORDER BY field DESC) DERIVEDTBL) DERIVEDTBL ORDER BY field

The actual query would have where clauses in it to retreive only rows that match a particular key or key range, but there could be thousands of records returned.  I am trying to avoid a several thousand row result set for performance reasons.

Will the query optimizer figure out what this means and do the right thing, or will it it create two temporary tables, and if so, should I worry about the overhead of doing this?

Thanks,

Kevin
0
Comment
Question by:kevinmackenzie
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 2
  • +1
10 Comments
 
LVL 6

Expert Comment

by:robertjbarker
ID: 12267696
DERIVEDTBL is not a key word. It could just as well be "ABC" and the query you have would act just the same.

The first thing that is going to happen is that "(SELECT TOP 250 * FROM tablename ORDER BY field)" will be executed. From then on you are just working with 250 rows, and that is pretty small.

Getting that 250 rows is the thing to optimize, and the first thing to do there is to make sure "field" is indexed.  If it is not, you will do a full table scan to search through every value of that field.
0
 

Author Comment

by:kevinmackenzie
ID: 12268121
Thank you.  Is this syntax (without the DERIVEDTBL) the most efficient way to do this, or would creating a temporary table (more than one SQL statement) be better, or do you have another suggestion?

In other words, is this the best way to do this from a  performance standpoint, in your opinion?

Kevin
0
 
LVL 34

Expert Comment

by:arbert
ID: 12268879
Have you looked at the option of using ADO.NET paging?  Granted it has its own set of overhead and limitations, but it is an option.
0
Get MySQL database support online, now!

At Percona’s web store you can order your MySQL database support needs in minutes. No hassles, no fuss, just pick and click. Pay online with a credit card.

 

Author Comment

by:kevinmackenzie
ID: 12268891
On ADO.NET paging, yes I have looked at it.  ASP.NET apparently does not completely expose the interface, as the only thing I can find is a Dataset, which reads every row that satisfies the query into memory on the server, which is unacceptable on a large table.  I'm told that ASP offered better services in this regard, perhaps later versions of ASP.NET will catch up.

Thanks for the suggestion.

Kevin
0
 

Author Comment

by:kevinmackenzie
ID: 12268899
Razo,

I also found, when actually trying to use this syntax today, that I couldn't make it work in a complete application without knowing what the total number of records in the table is.  So using it requires two queries (not just two select statements).  The first, a COUNT(*) query, and the second the line shown previously in this question.  Why two?  Because the argument for the TOP clause cannot be a variable. So I have to do the first query, then build the string for the second SQL command.

So, for you database experts out there, how much overhead is there to a COUNT(*) operation?  I suspect it is a complete pass of the database, starting at the first indexed point in the table, and continuing to the upper bound of the WHERE clause.  That starts to make this method unacceptable...

Kevin
0
 
LVL 34

Accepted Solution

by:
arbert earned 500 total points
ID: 12268962
Take a quick look at the MS document on .NET paging (Sorry if you've already looked at it--not trying to waste your time).  Notice, in the document, they have a stored proc that runs database side to get the result set.  The only downside is the dynamic SQL.  However, I don't think that's any worse than running a query twice (once for the count, and again to use the count):

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenethowto05.asp

Brett
0
 
LVL 34

Expert Comment

by:arbert
ID: 12268963
(oh ya, you're right, the future is going to be much better with SQL2005)
0
 
LVL 6

Expert Comment

by:robertjbarker
ID: 12269244
What I meant was that the identifier for the intermediate results could be anything, for example:

SELECT * FROM(SELECT TOP 50 * FROM (SELECT TOP 250 * FROM (SELECT TOP 250 * FROM tablename ORDER BY field) ABC ORDER BY field DESC) ABC) ABC ORDER BY field

The identifiers still have to be there, or there is a syntax error.

Anyway, I am not an ASP kind of guy. But from an SQL viewpoint, I can't think of a better way to do what you want, as long as you have an index defined for each "field" you intend to use.
0
 
LVL 8

Expert Comment

by:plq
ID: 12269316
Something I found on another thread which might be useful for this

select top 250 *, identity(int, 1,1) 'fred'
into #temp
from asset

select * from #temp where fred > 200

drop table #temp


this returns rows 201 to 250


Personally I just use 'top' and cut out records 1 to 200 using rs.absoluteposition
0
 

Author Comment

by:kevinmackenzie
ID: 12271082
Arbert,

I had not found that help file, and it is exactly what I need.

For anyone else looking into this problem, the solution I eventually decided to use is located here:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27297

and is referred to as "Jeff's solution".

Here is his pseudo code:

-- for each column in your sort, you need a variable to hold
-- the "starting values". In our case, we need two:
declare @startingDate datetime;
declare @startingID int;

-- again, we want to returns resutls from row @a to row @b:
declare @a int;
declare @b int;

set @a = 200 -- start at row 200
set @b = 250 -- end at row 250

-- get the starting date and starting ID to return results:
set rowcount @a
select @StartingDate = DateCol, @startingID = ID
from yourtable
order by DateCol ASC,ID ASC

-- find out how many rows to return, and set the rowcount:
set @b = @b - @a
set rowcount @b

-- now return the results:
select * from yourtable
where
DateCol > @StartingDate OR
(DateCol = @StartingDate AND ID > @StartingID)
order by DateCol ASC, ID ASC

-- clean up:
set rowcount 0
 
0

Featured Post

Optimize your web performance

What's in the eBook?
- Full list of reasons for poor performance
- Ultimate measures to speed things up
- Primary web monitoring types
- KPIs you should be monitoring in order to increase your ROI

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

622 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question