Solved

What are the performance implications of using derived tables?

Posted on 2004-10-09
10
944 Views
Last Modified: 2012-05-05
In the ASP.NET section of Experts Exchange, I have asked a question about how to get (for example) the 200th through 250th rows of a result set in an MSSQL database.  The actual table is very large and has duplicate key values, so I am struggling with how to page through it.

I believe that temporary tables (creating a table with an identity value) carry too much overhead, because the base table has 4 million records.

The following query has been suggested.  I am not familiar with the DERIVED TABLE syntax, and I am not sure what the performance implications would be.  

Can you please comment on how efficient this query would be on a 4M row table, and suggest a better method if you know of one?

Here's the query:

SELECT * FROM(SELECT TOP 50 * FROM (SELECT TOP 250 * FROM (SELECT TOP 250 * FROM tablename ORDER BY field) DERIVEDTBL ORDER BY field DESC) DERIVEDTBL) DERIVEDTBL ORDER BY field

The actual query would have WHERE clauses in it to retrieve only rows that match a particular key or key range, but there could be thousands of records returned.  I am trying to avoid a several-thousand-row result set for performance reasons.

Will the query optimizer figure out what this means and do the right thing, or will it create two temporary tables, and if so, should I worry about the overhead of doing that?

Thanks,

Kevin
Comment
Question by:kevinmackenzie
10 Comments
 
LVL 6

Expert Comment

by:robertjbarker
ID: 12267696
DERIVEDTBL is not a keyword. It could just as well be "ABC" and the query you have would behave exactly the same.

The first thing that is going to happen is that "(SELECT TOP 250 * FROM tablename ORDER BY field)" will be executed. From then on you are just working with 250 rows, and that is pretty small.

Getting those 250 rows is the thing to optimize, and the first step is to make sure "field" is indexed.  If it is not, SQL Server will do a full table scan, reading every value of that field.
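For example, an index on the sort column (hypothetical names here; substitute your real table and column) lets the innermost "SELECT TOP 250 ... ORDER BY field" read just the first 250 index entries instead of sorting all 4M rows:

```sql
-- Hypothetical names: index the ORDER BY column so the innermost
-- "SELECT TOP 250 * FROM tablename ORDER BY field" can walk the index
-- and stop after 250 entries, rather than scan and sort the whole table.
CREATE INDEX IX_tablename_field ON tablename (field);
```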
 

Author Comment

by:kevinmackenzie
ID: 12268121
Thank you.  Is this syntax (without the DERIVEDTBL) the most efficient way to do this, or would creating a temporary table (more than one SQL statement) be better, or do you have another suggestion?

In other words, is this the best way to do this from a performance standpoint, in your opinion?

Kevin
 
LVL 34

Expert Comment

by:arbert
ID: 12268879
Have you looked at the option of using ADO.NET paging?  Granted, it has its own overhead and limitations, but it is an option.

 

Author Comment

by:kevinmackenzie
ID: 12268891
On ADO.NET paging, yes, I have looked at it.  ASP.NET apparently does not completely expose the interface, as the only thing I can find is a DataSet, which reads every row that satisfies the query into memory on the server, which is unacceptable on a large table.  I'm told that classic ASP offered better services in this regard; perhaps later versions of ASP.NET will catch up.

Thanks for the suggestion.

Kevin
 

Author Comment

by:kevinmackenzie
ID: 12268899
Razo,

I also found, when actually trying to use this syntax today, that I couldn't make it work in a complete application without knowing the total number of records in the table.  So using it requires two queries (not just two SELECT statements): first a COUNT(*) query, and then the query shown above.  Why two?  Because the argument to the TOP clause cannot be a variable, so I have to run the first query and then build the string for the second SQL command.
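A sketch of that two-step approach, with hypothetical table and column names (on SQL Server 2000 the TOP value has to be spliced into the SQL text; TOP (@variable) only arrives in SQL Server 2005):

```sql
-- Hypothetical sketch of the two-query approach described above.
DECLARE @total int, @b varchar(10), @sql varchar(1000)

-- Query 1: count the matching rows, so we know how far we can page.
SELECT @total = COUNT(*) FROM tablename
SET @b = CAST(CASE WHEN @total < 250 THEN @total ELSE 250 END AS varchar(10))

-- Query 2: built as a string, because TOP cannot take a variable here.
SET @sql = 'SELECT * FROM (SELECT TOP 50 * FROM (SELECT TOP ' + @b
         + ' * FROM tablename ORDER BY field) a ORDER BY field DESC) b'
         + ' ORDER BY field'
EXEC (@sql)
```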

So, for you database experts out there, how much overhead is there in a COUNT(*) operation?  I suspect it is a complete pass of the table, starting at the first indexed point and continuing to the upper bound of the WHERE clause.  That starts to make this method unacceptable...

Kevin
 
LVL 34

Accepted Solution

by:
arbert earned 500 total points
ID: 12268962
Take a quick look at the MS document on .NET paging (sorry if you've already looked at it--not trying to waste your time).  Notice that, in the document, they have a stored proc that runs database-side to get the result set.  The only downside is the dynamic SQL.  However, I don't think that's any worse than running a query twice (once for the count, and again to use the count):

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnpag/html/scalenethowto05.asp
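In the same spirit as that document, here is a minimal sketch of a server-side paging procedure (hypothetical proc, table, and column names; dynamic SQL is used because of the TOP-variable limitation):

```sql
-- Hypothetical sketch of a database-side paging proc: one round trip,
-- dynamic SQL because SQL Server 2000's TOP will not accept a variable.
CREATE PROCEDURE GetPage
    @startRow int,   -- first row wanted (1-based)
    @endRow   int    -- last row wanted
AS
BEGIN
    DECLARE @sql varchar(1000)
    SET @sql = 'SELECT * FROM ('
             + ' SELECT TOP ' + CAST(@endRow - @startRow + 1 AS varchar(10))
             + ' * FROM (SELECT TOP ' + CAST(@endRow AS varchar(10))
             + ' * FROM tablename ORDER BY field) a ORDER BY field DESC'
             + ') b ORDER BY field'
    EXEC (@sql)
END
```

Calling EXEC GetPage 201, 250 would then return one 50-row page without the client ever seeing the rest of the result set.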

Brett
 
LVL 34

Expert Comment

by:arbert
ID: 12268963
(oh ya, you're right, the future is going to be much better with SQL2005)
 
LVL 6

Expert Comment

by:robertjbarker
ID: 12269244
What I meant was that the identifier for the intermediate results could be anything, for example:

SELECT * FROM(SELECT TOP 50 * FROM (SELECT TOP 250 * FROM (SELECT TOP 250 * FROM tablename ORDER BY field) ABC ORDER BY field DESC) ABC) ABC ORDER BY field

The identifiers still have to be there, or there is a syntax error.

Anyway, I am not an ASP kind of guy. But from an SQL viewpoint, I can't think of a better way to do what you want, as long as you have an index defined for each "field" you intend to use.
 
LVL 8

Expert Comment

by:plq
ID: 12269316
Something I found on another thread which might be useful for this (note that you would also want an ORDER BY on the first SELECT, so the TOP 250 is deterministic):

select top 250 *, identity(int, 1,1) 'fred'
into #temp
from asset

select * from #temp where fred > 200

drop table #temp


This returns rows 201 to 250.


Personally I just use 'top' and cut out records 1 to 200 using rs.absoluteposition
 

Author Comment

by:kevinmackenzie
ID: 12271082
Arbert,

I had not found that help file, and it is exactly what I need.

For anyone else looking into this problem, the solution I eventually decided to use is located here:

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=27297

and is referred to as "Jeff's solution".

Here is his pseudo code:

-- for each column in your sort, you need a variable to hold
-- the "starting values". In our case, we need two:
declare @startingDate datetime;
declare @startingID int;

-- again, we want to return results from row @a to row @b:
declare @a int;
declare @b int;

set @a = 200 -- start at row 200
set @b = 250 -- end at row 250

-- get the starting date and starting ID for the page boundary:
set rowcount @a
select @startingDate = DateCol, @startingID = ID
from yourtable
order by DateCol ASC, ID ASC

-- find out how many rows to return, and set the rowcount:
set @b = @b - @a
set rowcount @b

-- now return the results:
select * from yourtable
where
DateCol > @startingDate OR
(DateCol = @startingDate AND ID > @startingID)
order by DateCol ASC, ID ASC

-- clean up:
set rowcount 0
 
