Link to home
Start Free TrialLog in
Avatar of Stef Merlijn
Stef MerlijnFlag for Netherlands

asked on

Performance of SQL statement

Hi,

Following structure I use to load records per page.
As it seems using "NOT IN" gives not the best performance, so I'm looking for an alternative.
Maybe via "NOT EXISTS" or "ROW_NUMBER() OVER ( ORDER BY IDField ) AS RowNumber"

What would be the best approach to replace the code below?

  With ADOQuery1 do
  begin
    Close;
    SQL.Clear;
    SQL.Add('SELECT TOP ' + IntToStr(GetRecords) + ' ' + Columnnames + ' FROM DBO.' + ViewName);
    SQL.Add('WHERE EIGENBEDRIJFNR = ' + IntToStr(vEigenBedrijfnr));
    SQL.Add('  AND ISVERWIJDERD = ' + IntToStr(ToonVerwijderd));
    SQL.Add('  AND (' + IDVeld + ' NOT IN ');
    SQL.Add('        (SELECT TOP ' + IntToStr(SkipRecords) + ' ' + IDField + ' FROM DBO.' + ViewName);
    SQL.Add('         WHERE ISVERWIJDERD = 0');
                      if (MyFilter <> '') then
                        SQL.Add(MyFilter);
    SQL.Add('         ORDER BY ' + OrderByFields + '))');
    if (MyFilter <> '') then
      SQL.Add(MyFilter);
    SQL.Add('ORDER BY ' + OrderByFields);
    Open;
    First;
  end;

Open in new window

Avatar of Vitor Montalvão
Vitor Montalvão
Flag of Switzerland image

Issue isn't with the NOT IN statement.
Are the fields used in the WHERE clause indexed?
How big is the table/view? And ORDER BY in big tables can take long to execute.
SOLUTION
Avatar of ste5an
ste5an
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Stef Merlijn

ASKER

@Vitor:
Are the fields used in the WHERE clause indexed?
Not many are indexed. There are quite a lot of fields in the table (+125). A user can filter or order the resultset on any of the fields?
How big is the table/view?
The table has in general 5k-10k records with +125 fields (not all fields are available for the enduser)

@St5an:
What has ROW_NUMBER to do with that question?
This could be used as an alternative way for paging. Something like:
WITH RelatiesTotaal ( Relatienr, Helenaam, IsVerwijderd, RowNumber ) 
  AS ( SELECT R1.Relatienr, R1.Helenaam, R1.IsVerwijderd,   
       ROW_NUMBER() OVER ( ORDER BY R1.Relatienr ) AS RowNumber 
       FROM dbo.Relaties AS R1                                
       WHERE R1.IsVerwijderd = 0
     ) 
     SELECT TOP (@RecordsPerPage)  
       R2.Relatienr, R2.Helenaam, R2.IsVerwijderd, R2.RowNumber 
    FROM RelatiesTotaal AS R2                                    
    WHERE R2.RowNumber BETWEEN ((@SkipPages * @RecordsPerPage) + 1) 
                           AND ((@SkipPages + 1) * @RecordsPerPage) 

Open in new window

What is the purpose of your TOP's?
Yes, it's solely used for paging.
What SQL Server version do you use?
New customers are all on SQL Server 2014 Express, but quite some existing customers are still on 2008 R2 Express
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Whay do you add MyFilter twice? this is more like a logic and is not related to the question.
Use ROW_NUMBER(), it's better than TOP.

btw, why don't you do the paging in your application?
check this:
Pagination Query, using the  OFFSET and FETCH   clauses  (new model, used from the SQL Server 2012+);

http://social.technet.microsoft.com/wiki/contents/articles/23811.paging-a-query-with-sql-server.aspx
@Zberteoc:
Why do you add MyFilter twice?
This because I need to exclude XX records from the same set of records, So both need to have the same filter.

@Vitor:
Eventually the issue is that you're using a View and that view might be a join of more tables?
Yes, this is the case. The view contains quite some joins.
Has those tables the proper PK and FK?
Yes they have.
And are those keys at least indexed?
Each PK has it's own index. The FK most of the time isn't indexed.
Is there a way to add the paging function into SQL Server 2008 R2?
Maybe the function can be added as a new function in the database???
Hi,

I did some testing, but there doesn't seem to be much difference between the two solution, even without having added some indexes for now.
If I had to say which one is slightly faster I would probably choose the "NOT IN"-query
DECLARE @RecordsToGet int
      , @CurrentPage int
	  , @Start int
	  , @End int
SELECT @RecordsToGet= 5000
     , @CurrentPage         = 1
SELECT @Start                = (@RecordsToGet * @CurrentPage) - @RecordsToGet
     , @End                  = (@Start + @RecordsToGet)
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY Functie, Bedrijfsnaam) AS RowNumber
	      FROM dbo.vwRelatiesTotaal
          WHERE IsVerwijderd = 0
            AND Voornaam LIKE '%e%'
	 ) AS tRelatiesTotaal
WHERE RowNumber > @Start AND RowNumber <= @End
ORDER BY Functie, Bedrijfsnaam
GO

Open in new window

DECLARE @GetRecords int
      , @SkipRecords int
SELECT @GetRecords = 5000, @SkipRecords = 0
SELECT TOP (@GetRecords) * FROM dbo.vwRelatiesTotaal
WHERE  IsVerwijderd = 0
  AND Voornaam LIKE '%e%'
AND Relatienr NOT IN (SELECT TOP (@SkipRecords) Relatienr 
                      FROM dbo.vwRelatiesTotaal
					  WHERE  IsVerwijderd = 0
                       AND Voornaam LIKE '%e%'
					  ORDER BY Functie, Bedrijfsnaam
					 )
ORDER BY Functie, Bedrijfsnaam

Open in new window

SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
you may need to try to use CTE, preload all data. use ,Net, etc
check
Comparing performance for different SQL Server paging methods
https://www.mssqltips.com/sqlservertip/2696/comparing-performance-for-different-sql-server-paging-methods/

try

Efficient Paging with Repeater and SQL Server 2005 / 2008
/optimized paging solution using the ASP.NET Repeater /
http://www.codeproject.com/Articles/301291/Efficient-Paging-with-Repeater-and-SQL-Server
Each PK has it's own index. The FK most of the time isn't indexed.
Here you go. That's a good starting point so please create the indexes for each single FK and then check again for the query performance.
The PK's are currently all NON-clustered indexes.
The FK's are currently not indexed at all.

What changes do I need to make exactly?
- Change all PK's from NON-Clustered to Clustered indexes?
- Should indexes for FK be NON-Clustered?
There's no rule that says that a PK should be clustered. Being clustered means that your data will be physically order as the cluster index and that's why you can have only a cluster index by table so choose it wisely.
FK's should be indexed since they are used in JOIN clauses.
- Change all PK's from NON-Clustered to Clustered indexes?
No. There is often an reason for this. Especially as a default primary key is created as clustered index. Thus without more information: Don't do this.

 - Should indexes for FK be Clustered or NON-Clustered?
The clustered index defines the physical table structure. Thus a clustered index in a table should support the main usage of that table. And it should consists of columns such that the key is unique, narrow and static. Foreign key columns are normally not narrow. So, it should be non-clustered.

But: Just creating an index for each foreign key does not generally improve query performance. Cause often more columns are required leading to an additional key lookup or - cause using the fk index and the key lookup can be an expensive operation - even in using table (pk) scans.

Imho: Don't add indices for foreign key relations just for fun.

Performance optimization is a pretty complex process. You should not start with that specific query. Gather a overview of your expensive (slow) queries first:

SELECT TOP 25
        GETDATE() AS snapshot_datetime ,
        RANK() OVER ( ORDER BY total_logical_reads + total_logical_writes DESC, sql_handle, statement_start_offset ) AS row_no ,
        creation_time ,
        last_execution_time ,
        total_worker_time / 1000 AS total_worker_time ,
        total_logical_reads ,
        total_logical_writes ,
        execution_count ,
        total_logical_reads + total_logical_writes AS total_logical ,
        CASE WHEN sql_handle IS NULL THEN ' '
             ELSE ( SUBSTRING(st.text, ( qs.statement_start_offset + 2 ) / 2,
                              ( CASE WHEN qs.statement_end_offset = -1 THEN LEN(CONVERT(NVARCHAR(MAX), st.text)) * 2
                                     ELSE qs.statement_end_offset
                                END - qs.statement_start_offset ) / 2) )
        END AS query_text ,
        st.text AS sql_text ,
        DB_NAME(st.dbid) AS database_name ,
        st.objectid AS object_id ,
        qp.query_plan ,
        cp.cacheobjtype ,
        cp.objtype ,
        qs.plan_handle
FROM    sys.dm_exec_query_stats qs
        INNER JOIN sys.dm_exec_cached_plans cp ON cp.plan_handle = qs.plan_handle
        CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
        CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE   total_logical_reads + total_logical_writes > 0
ORDER BY total_logical DESC;

Open in new window


Does your query appear in this list?

Use the added actual execution plan. It sometimes contains an index suggestion.
yes, create clusters indexes

< Change all PK's from NON-Clustered to Clustered indexes?>
yes, if it is a good candidate

if you have PK as Non clustered
it is possible your DB was built on sql 7 when PK by default was created as Non Clustered..
There are almost no cases when you'd need to have PK as non clustered
it is possible - it could be your case ..
Still need clustered index for  table (s)  specially big tables.
BTW: how big your tables?
do you have indexes on the query used tables?
you may need to add not just clustered but "for search improvement" Non Cutered indexes

---

in any case -- start with DB maintenances :  run DB reindex, update stats, check DB....
it can help your query performance
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@St5an: Does your query appear in this list?
Not at first, but when I ran it again it was. The executionplan is huge, but values seem to be normal.
Is it useful to share it here? I could export it to *.XML or *.sqlplan?
Is it useful to share it here? I could export it to *.XML or *.sqlplan?
Yes and if possible as .sqlplan file.
@Vitor: Please see attachment.
MyExecutionplan.sqlplan
I just gave a fast look and could already see some table scans.
Also, your query is returning so many columns that I couldn't get the full SQL command. Do you really need all those columns to be returned?
Vitor is right.

You have many heap (non-clustered) tables.
The most expensive operation is the triple join of the Landen table.
Then you have some RID lookups, which are also expensive.

So I would start by looking at all involved tables and create clustered indices where they are missing.

When possible: Post also the entire SQL statement.
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@Ste5an:
The most expensive operation is the triple join of the Landen table.
Is there another way to accomplish this? As far as I can see the only way would be to store addresses in a separate table (basically normalizing the data a bit more).
Then you have some RID lookups, which are also expensive.
What are these and what can be done about it?

@Geert Gruwez:
What components for the visual and what components for the database?
- DevExpress QuantumGrid in GridMode (with sorting capabilities)
- ADO query
And for performance reasons ... use bind variables ! every query your app gives to the database has to be processed as it's a new query
I don't understand that the Format-way will increase performance as I think changing the parameters will result in the database having to process it as new query anyway. Can you clearify why the format will perform better?
Get rid of the order by in the query
The ORDER BY is used to allow sorting data in the grid. The fields in the ORDER BY are changed when a user wants to sort any (or multiple) column(s). The same goes for filtering data.
I cannot tell this. Cause it originates from the used queries. But let's face it: imho there are a lot of different points to tackle..

What are these and what can be done about it?
So I would start by looking at all involved tables and create clustered indices where they are missing.
Devexpress sorts the data in memory if you click a column header
 Filtering can also be in memory

No real point in asking the db to sort first
Why do you use a top n subselect without sorting ?
 It will seem as random records are removed from the selection

I'd make the not in optional too

There are not many bindvariables here
Your optional where calaise should have them too
The ORDER BY is used to allow sorting data in the grid. The fields in the ORDER BY are changed when a user wants to sort any (or multiple) column(s).
Having a grid it's always better to use the grid object for sorting the data rather the SQL Server engine. You'll save plenty of time only with this little change.
He has an ORDER BY. It's optional and buried in the SQL concatenation.
@Geert Gruwez:
Devexpress sorts the data in memory if you click a column header
Not in Gridmode as it can only sort and filter records that are actually loaded into the grid.

Why do you use a top n subselect without sorting ?
I don't, but if it not there in some code in this post it is forgotten.
Why do you use a top n subselect without sorting ?
I don't
Uhm ! Yes you do !

you even use parameter SkipRecords to skip a number of random records
Delphiwizard, any feedback for us?
Cheers
@Geert Gruwez:
Uhm ! Yes you do !. No I don't :-)
Please see also my initial post. "MyFilter"  and "Order By" are used at both sides.
  With ADOQuery1 do
  begin
    Close;
    SQL.Clear;
    SQL.Add('SELECT TOP ' + IntToStr(GetRecords) + ' ' + Columnnames + ' FROM DBO.' + ViewName);
    SQL.Add('WHERE EIGENBEDRIJFNR = ' + IntToStr(vEigenBedrijfnr));
    SQL.Add('  AND ISVERWIJDERD = ' + IntToStr(ToonVerwijderd));
    SQL.Add('  AND (' + IDVeld + ' NOT IN ');
    SQL.Add('        (SELECT TOP ' + IntToStr(SkipRecords) + ' ' + IDField + ' FROM DBO.' + ViewName);
    SQL.Add('         WHERE ISVERWIJDERD = 0');
                      if (MyFilter <> '') then
                        SQL.Add(MyFilter);
    SQL.Add('         ORDER BY ' + OrderByFields + '))');
    if (MyFilter <> '') then
      SQL.Add(MyFilter);
    SQL.Add('ORDER BY ' + OrderByFields);
    Open;
    First;
  end;

Open in new window

now i see it !
4 eyes principle ... next time i'll ask the one next to me to confirm my eyesight is not fooling me
Delphiwizard, do you still need help with this question?
Basically performance of different SQL solution do no differ very much, even after adding some indexes. Therefore I've decided to go with DevArt SDAC components. Also I wil implement using servermode for DevExpress Quantumgrid.