Solved

What's the best way to optimise queries on a single table? SQL Server

Posted on 2013-11-14
15
486 Views
Last Modified: 2013-11-18
Whilst I am an experienced in databases, and am not familiar with targeted optimization techniques. I will in due course need to learn(!) however I really only have one point which needs optimizing, and I'm guessing is the simplest of all cases, querying a single table. Here is the table structure:

CREATE TABLE [dbo].[web_site_styles](
      [ws_id] [int] NOT NULL,
      [website_style] [varchar](60) NULL,
      [style_value] [varchar](60) NULL
) ON [PRIMARY]

.. and I am querying it like this:

SELECT website_style, style_value WHERE ws_id=234

ws_id is NOT unique, and this query should return multiple rows (around 200). The table can contain several million rows, and is not updated frequently so think it is an ideal candidate for indexing? Can someone provide me with all the ways I can optimize the querying of this table. I am not only interested in database structure improvements, would stored procedures help here for example? I am literally calling the above SQL with a single parameter for the ws_id column.

Thanks in advance
0
Comment
Question by:paddycobbett
  • 5
  • 5
  • 3
  • +2
15 Comments
 
LVL 32

Expert Comment

by:Big Monty
Comment Utility
here's a good article covering query optimization that may shed some light on what you're trying to do:

http://technet.microsoft.com/en-us/magazine/2007.11.sqlquery.aspx
0
 
LVL 34

Expert Comment

by:Brian Crowe
Comment Utility
CREATE NONCLUSTERED INDEX IX_web_site_styles_ws_id
ON web_site_styles(ws_id)
INCLUDE (website_style, style_value)
0
 
LVL 26

Expert Comment

by:Zberteoc
Comment Utility
While the ws_id is not unique it is recommendable to have a unique value as PK on your table I would suggest this:

-- create a new table with an identity column
select 
	identity(int, 1,1) as web_site_styles_id, 
	ws_id,  
	website_style, 
	style_value 
into [dbo].[web_site_styles_new]
from [dbo].[web_site_styles]
go

-- add a PK to the new table on the identity column
ALTER TABLE [web_site_styles_new]
ADD PRIMARY KEY (web_site_styles_id) 
GO

-- add a nonclustered index on ws_id
create nonclustered index ix_web_site_styles_ws_id on web_site_styles_new(ws_id) 
	include  (website_style, style_value)
GO

-- rename original table to _old
sp_rename web_site_styles, web_site_styles_old
GO

-- rename the new table to original table name
sp_rename web_site_styles_new, web_site_styles
GO

-- later if anything OK:
-- drop table web_site_styles_old

Open in new window

0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 450 total points
Comment Utility
Cluster the table first on ws_id, 100%, NO question about it.

You can add an identity after that column to make it unique if you really want to, but it's not necessary.

Is ws_id a naturally, continually ascending value (at least 95% of the time)?  If so, you're pretty much good.  Rebuild the table when it does eventually get too fragmented, as with any other table.

If ws_id is not naturally ascending, you may need to rebuild the table more often.

Optimally, you'll have Enterprise Edition and be able to rebuild ONLINE.
0
 
LVL 1

Author Comment

by:paddycobbett
Comment Utility
Thanks for all the comments from everyone, just wanted some clarification on the last one. The ws_id will always be ascending, i.e any new records will be an increment of the last few records so (from i've been reading) it does make sense for it to be clustered since new rows should always physically reside after old rows. To repeat, ws_id is NOT unique, although their may be many rows with the same ws_id (added at the same time). I don't like the idea of having to ever rebuild the table, however does my circumstance suggest it won't ever be necessary? Even if I delete rows?

I don't like the idea of adding a unique column since it provides no logical relationship to anything else, however if it improves performance then I would consider it.. however your comment suggests it won't.

Given my comments, please reassure me that this is the way to go.. or if anyone else could suggest why it is not the best for my case.

Thanks a million again!
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
An IDENTITY column will add overhead for no real value.  Again, I recommend against it.

SQL Server will handle the duplicate key values itself.  You just specify the logical key you want:

-- uncomment "ONLINE = ON," if you are on Enterprise+ Edition:

CREATE CLUSTERED INDEX web_site_styles__CL ON dbo.web_site_styles ( ws_id ) WITH ( FILLFACTOR = 99, /* ONLINE = ON,*/ SORT_IN_TEMPDB = ON ) ON [PRIMARY]
0
 
LVL 26

Expert Comment

by:Zberteoc
Comment Utility
"An IDENTITY column will add overhead for no real value.  Again, I recommend against it."

How do you explain then the general recommendation to HAVE a unique PK on evry table?
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 450 total points
Comment Utility
As a general recommendation.  In this specific case, IDENTITY is not necessary and, as it always does, obviously causes INSERTs to be forcibly serialized/synchronous.

I 100% guarantee that performance will be vastly better for this table with "ws_id" as the clustering key vs. a dopey IDENITY column.

If you really want identity as PK here, fine, add it, but only as a nonclustered index (or just don't index it at all; there's no danger to that if you never back up the identity value).

As I've stated many times before, the single biggest myth about performance is that an identity should be the "default" clustering key.  Wrong -- the clustering key is the most critical aspect of table performance, and thus should never be "defaulted", but instead based on careful analysis of that specific table's needs.

Since in this case ws_id itself is sequential, it's absolutely ridiculous to add an artificial identity as the clustering key.

The "general rule" overweights INSERT vs SELECT activity too.  INSERT occurs only once; generally rows are SELECTed 10 times or more after being INSERTed.  Especially given the capability of online rebuilds, it makes far sense to pick the best clustering key for overall performance rather than fixating on the INSERT overhead.
0
 
LVL 26

Expert Comment

by:Zberteoc
Comment Utility
Scott, I don't argue but a PK has to be unique. In this case there is no unique value unless you make a full columns composite key, which actually doesn't guarantee uniqueness.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
So what?  A table's not required to have a PK.  Yes, relational theory "demands" a PK, but this is the real world; a manufactured PK is not worth the overhead here.  Why harm performance that much just to be able to say that "the table has a PK"?
0
 
LVL 26

Expert Comment

by:Zberteoc
Comment Utility
How an identity PK harms the performance?
0
 
LVL 1

Author Comment

by:paddycobbett
Comment Utility
Whilst I have stated that I am not a database expert, I am a software expert, and for me logically, I never introduce relationships which have no logical relationship. I totally support having tables with unique IDs when the table describes whole entities which need to be referenced, however this table actually holds grouped elements, which are logically associated with a "whole" which has a PK, which is the web_site_style table (as opposed to web_site_styles table), which also has the ws_id column however in this case it is unique. I would be surprised to find out that adding additional columns with no logical relationships would help, since I would never need to join on that column. I might for example say:

select * from web_site_style ws
join web_site_styles wss on (ws.ws_id=wss.ws_id)
on ws.ws_id=123

Surely I don't need a PK on the web_site_styles table in this case? I am not in my realm so can't be 100% sure of anything!
0
 
LVL 26

Assisted Solution

by:Zberteoc
Zberteoc earned 50 total points
Comment Utility
The PK on a column is not there for the relation purpose only but also to ensure that a row is uniquely defined.

What if you wanted to modify website_style or style_value column? Is the combination of the 3, including ws_id, unique? It should be but then every time you will have to do un update you will have to filter on all 3 columns and probably to create an index on all 3 as well, which it will be the PK in that case.

If you don't need/want a PK on that table then a nonclustered index on ws_id is all you need. I don't see what's the dilemma with performance here.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
>> logically, I never introduce relationships which have no logical relationship. I totally support ... unique IDs ... which need to be referenced <<

[Emphasis added.]  Exactly correct.

Too many people have absorbed this myth that basically all tables should have an identity, esp. if you can't assign a PK up front otherwise.  Again, that the most damaging myth in table design.



>> I don't see what's the dilemma with performance here. <<

1) Identity values must be assigned sequentially.  That forces all INSERTs to be done sequentially, preventing simultaneous INSERTs into the table.

2) All processing a computer does has overhead, including generating and assigning a unique # to every row.
0
 
LVL 1

Author Comment

by:paddycobbett
Comment Utility
Zberteoc, you make a good point that if I needed to update individual web_site_style entries then it would be great to update against a primary key, however I don't. The table will have a bunch of entries added together with the same ws_id, or deleted together referenced by the same ws_id. The way I see it is that is holds a collection of elements with no particular order. Thanks a lot for all the comments to my question, has been interesting!
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Introduction SQL Server Integration Services can read XML files, that’s known by every BI developer.  (If you didn’t, don’t worry, I’m aiming this article at newcomers as well.) But how far can you go?  When does the XML Source component become …
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
Via a live example, show how to setup several different housekeeping processes for a SQL Server.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now