Solved

MS SQL views are slow if they have many columns, regardless of the number in select statement calling the view

Posted on 2008-10-31
13
847 Views
Last Modified: 2012-05-05
It appears that a large number of columns in a view in MS SQL affects performance even if the query, when run against the view, contains only a couple of columns.  I've tested the exact same query against the tables directly and it takes a couple of seconds.  Select the same several columns from a view joining the tables with all columns, using the same criteria and it takes 2 minutes.  

 Why is this and any way to speed it up?

The joined view results in several hundred columns and about 50000 records.  Most of the columns are empty varchar fields.  I know the easy answer would be to run the select statement directly against the tables but there are a bunch of existing architectural reasons related to the app that would make that a problem.

Thanks for any feedback on this.
0
Comment
Question by:greymattersolutions
  • 6
  • 3
13 Comments
 
LVL 9

Expert Comment

by:jamesgu
Comment Utility
do you have a text field on the view or other binary, image... column(s),

you may need to provide more information on the table schema, view schema...
0
 

Author Comment

by:greymattersolutions
Comment Utility
Below are two queries that illustrate the performance difference.  It's a bit tought to replicate as the difference in speed grows incrementally as the size of the tables grow.  With a few thousand total records, there is no appreciable speed difference.  As the tables get into the tens of thousands, the speed of the vCustomerEvents deteriorates dramatically compared to calling the sub views directly.

I've attached a create statement for the relevant tables.  I know it's not elegant but the structure has evolved based on changing requirements and is actually working well given the potential cost of restructuring.
/* Using the vCustomerEvents view it takes 1.5 minutes to return 1700 rows */
 

SELECT EventNum, EventNum, AssignedTo, OpenText7 , CreateDate , ReminderStatus , OpenText10 , Source, Type , Name , EventDate , EventStatus , OpenText13  FROM vCustomerEvents WHERE ( (Type <> 'Admin Adjustment') AND (Type <> 'Admin. Adjustment') AND (Type <> 'Admin. Various') AND (Type <> 'B-Day') AND (Type <> 'BOD meeting') AND (Type <> 'Budget') AND (Type <> 'Budget team meeting') AND (Type <> 'Building maintenance') AND (Type <> 'Clerical Support') AND (Type <> 'Data entry') AND (Type <> 'Finance') AND (Type <> 'Finance & HR') AND (Type <> 'Fundraiser') AND (Type <> 'Grant Reporting') AND (Type <> 'Sick day') AND (Type <> 'Vacation')) AND (((AssignedTo = 91.00) OR (AssignedTo = 77.00) OR (AssignedTo = 92.00) OR (AssignedTo = 93.00) OR (AssignedTo = 94.00) OR (AssignedTo = 182.00) OR (AssignedTo = 90.00) OR (AssignedTo = 89.00) OR (AssignedTo = 135.00) OR (AssignedTo = 95.00)) AND ((CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END >= {d '2008-09-01'})) AND ((CASE WHEN IsDate(CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END) = 1 THEN CAST(CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END AS datetime) ELSE 0 END <= {d '2008-09-30'}))) Order By Type, AssignedTo, EventDate ASC 
 

/* Without the view, takes 14 seconds to return 1700 rows */
 

SELECT EventNum, EventNum, t.AssignedTo, t.OpenText7 , t.CreateDate , ReminderStatus , t.OpenText10 , Source, Type , Name , EventDate , t.Status , t.OpenText13  

FROM         dbo.vCustomers RIGHT OUTER JOIN

 vEventRecure_Staff t ON dbo.vCustomers.CustomerAllNum = t.CustomerAllNum

 WHERE ( (Type <> 'Admin Adjustment') AND (Type <> 'Admin. Adjustment') AND (Type <> 'Admin. Various') AND (Type <> 'B-Day') AND (Type <> 'BOD meeting') AND (Type <> 'Budget') AND (Type <> 'Budget team meeting') AND (Type <> 'Building maintenance') AND (Type <> 'Clerical Support') AND (Type <> 'Data entry') AND (Type <> 'Finance') AND (Type <> 'Finance & HR') AND (Type <> 'Fundraiser') AND (Type <> 'Grant Reporting') AND (Type <> 'Sick day') AND (Type <> 'Vacation')) AND (((t.AssignedTo = 91.00) OR (t.AssignedTo = 77.00) OR (t.AssignedTo = 92.00) OR (t.AssignedTo = 93.00) OR (t.AssignedTo = 94.00) OR (t.AssignedTo = 182.00) OR (t.AssignedTo = 90.00) OR (t.AssignedTo = 89.00) OR (t.AssignedTo = 135.00) OR (t.AssignedTo = 95.00)) AND ((CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END >= {d '2008-09-01'})) AND ((CASE WHEN IsDate(CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END) = 1 THEN CAST(CASE WHEN IsDate(EventDate) = 1 THEN CAST(EventDate AS datetime) ELSE 0 END AS datetime) ELSE 0 END <= {d '2008-09-30'}))) 

Order By Type, t.AssignedTo, EventDate ASC 

Open in new window

0
 

Author Comment

by:greymattersolutions
Comment Utility
Here is the create statement for the tables and views...
Large-View-Performance.txt
0
 

Author Comment

by:greymattersolutions
Comment Utility
Increasing the point value on this, mainly because it will take someone a while to review the tables and view, the're a bit involved :-)
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
From a quick look at your tables, it is not surprising it is taking a while, your tables are screaming to be normalized.  Also, you do realize that your Create tables violate the 8060 maximum in SQL Server 2000 and only allowed in SQL Server 2005 at some performance cost.
From BOL:
<quote>
Surpassing the 8,060-byte row-size limit might affect performance because SQL Server 2005 Database Engine still maintains a limit of 8 KB per page. When a combination of varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns exceeds this limit, the Database Engine moves the record column with the largest width to another page in the ROW_OVERFLOW_DATA allocation unit, while maintaining a 24-byte pointer on the original page. Moving large records to another page occurs dynamically as records are lengthened based on update operations. Update operations that shorten records may cause records to be moved back to the original page in the IN_ROW_DATA allocation unit. Also, querying and performing other select operations, such as sorts or joins on large records that contain row-overflow data slows processing time, because these records are processed synchronously instead of asynchronously.

Therefore, when you design a table with multiple varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns, consider the percentage of rows that are likely to flow over and the frequency with which this overflow data is likely to be queried. If there are likely to be frequent queries on many rows of row-overflow data, consider normalizing the table so that some columns are moved to another table. This can then be queried in an asynchronous JOIN operation.
</quote>

On a side note I would recommend you lose those multiple Type <> 'Something',Type <> 'Something else', etc and AssignedTo = Something, AssignedTo = SomethingElse, etc. and instead place them in table and index appropriately.  This will be far more efficient (and as a bonus far more maintainable).

Let me know, if you want to pursue that angle.
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 

Author Comment

by:greymattersolutions
Comment Utility
Hi acperkins, thanks for responding.

I hear you about normalizing, unfortunately, that's not really an option for those OpenText fields right now.  Your last point about the multiple criteria used, do you mean putting Type etc in a seperate table and linking with a key field?

The reference you quoted above doesn't explain why a View takes 140 seconds and the exact same sql, run directly on the table, takes 5 seconds.  Any ideas why that would be the case?  

Is the View executed differently?  My assumption is that when executing the view, MSSQL caches the entire row, even if those columns were not called in the select statement calling the view.  That would seem odd though, so I want to make sure there is not something else I'm missing or a setting that can change the execution.

Thanks
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
>>do you mean putting Type etc in a seperate table and linking with a key field?<<
Yes.  Also the AssignedTo field as well should go in in its own table.

>>Any ideas why that would be the case?  <<
No idea.
0
 

Author Comment

by:greymattersolutions
Comment Utility
It appears that this behavior has been fixed in SQL 2008.  
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 500 total points
Comment Utility
So I guess the correct answer to any question on SQL Server is this "this behavior has been fixed in SQL 2008." :)
0
 

Author Comment

by:greymattersolutions
Comment Utility
Based on all of our testing, the number of views and the indexing have nothing to do with the problem.  It is specifically related to the number of fields in the final view.  Again, running the same SQL directly takes a couple of seconds, running the view takes minutes.  Reducing the number of fields being returned eliminates the problem.  Since it doesn't seem to be a commonly understood issue, we'll accept "migrate to 2008" as the way to go :-)

Thanks for the help.
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Introduction Hopefully the following mnemonic and, ultimately, the acronym it represents is common place to all those reading: Please Excuse My Dear Aunt Sally (PEMDAS). Briefly, though, PEMDAS is used to signify the order of operations (http://en.…
Introduction: When running hybrid database environments, you often need to query some data from a remote db of any type, while being connected to your MS SQL Server database. Problems start when you try to combine that with some "user input" pass…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now