What's the best way to optimise queries on a single table? SQL Server

Whilst I am an experienced in databases, and am not familiar with targeted optimization techniques. I will in due course need to learn(!) however I really only have one point which needs optimizing, and I'm guessing is the simplest of all cases, querying a single table. Here is the table structure:

CREATE TABLE [dbo].[web_site_styles](
      [ws_id] [int] NOT NULL,
      [website_style] [varchar](60) NULL,
      [style_value] [varchar](60) NULL

.. and I am querying it like this:

SELECT website_style, style_value WHERE ws_id=234

ws_id is NOT unique, and this query should return multiple rows (around 200). The table can contain several million rows, and is not updated frequently so think it is an ideal candidate for indexing? Can someone provide me with all the ways I can optimize the querying of this table. I am not only interested in database structure improvements, would stored procedures help here for example? I am literally calling the above SQL with a single parameter for the ws_id column.

Thanks in advance
Who is Participating?
Scott PletcherConnect With a Mentor Senior DBACommented:
Cluster the table first on ws_id, 100%, NO question about it.

You can add an identity after that column to make it unique if you really want to, but it's not necessary.

Is ws_id a naturally, continually ascending value (at least 95% of the time)?  If so, you're pretty much good.  Rebuild the table when it does eventually get too fragmented, as with any other table.

If ws_id is not naturally ascending, you may need to rebuild the table more often.

Optimally, you'll have Enterprise Edition and be able to rebuild ONLINE.
Big MontySenior Web Developer / CEO of ExchangeTree.org Commented:
here's a good article covering query optimization that may shed some light on what you're trying to do:

Brian CroweDatabase AdministratorCommented:
CREATE NONCLUSTERED INDEX IX_web_site_styles_ws_id
ON web_site_styles(ws_id)
INCLUDE (website_style, style_value)
A proven path to a career in data science

At Springboard, we know how to get you a job in data science. With Springboard’s Data Science Career Track, you’ll master data science  with a curriculum built by industry experts. You’ll work on real projects, and get 1-on-1 mentorship from a data scientist.

While the ws_id is not unique it is recommendable to have a unique value as PK on your table I would suggest this:

-- create a new table with an identity column
	identity(int, 1,1) as web_site_styles_id, 
into [dbo].[web_site_styles_new]
from [dbo].[web_site_styles]

-- add a PK to the new table on the identity column
ALTER TABLE [web_site_styles_new]
ADD PRIMARY KEY (web_site_styles_id) 

-- add a nonclustered index on ws_id
create nonclustered index ix_web_site_styles_ws_id on web_site_styles_new(ws_id) 
	include  (website_style, style_value)

-- rename original table to _old
sp_rename web_site_styles, web_site_styles_old

-- rename the new table to original table name
sp_rename web_site_styles_new, web_site_styles

-- later if anything OK:
-- drop table web_site_styles_old

Open in new window

paddycobbettAuthor Commented:
Thanks for all the comments from everyone, just wanted some clarification on the last one. The ws_id will always be ascending, i.e any new records will be an increment of the last few records so (from i've been reading) it does make sense for it to be clustered since new rows should always physically reside after old rows. To repeat, ws_id is NOT unique, although their may be many rows with the same ws_id (added at the same time). I don't like the idea of having to ever rebuild the table, however does my circumstance suggest it won't ever be necessary? Even if I delete rows?

I don't like the idea of adding a unique column since it provides no logical relationship to anything else, however if it improves performance then I would consider it.. however your comment suggests it won't.

Given my comments, please reassure me that this is the way to go.. or if anyone else could suggest why it is not the best for my case.

Thanks a million again!
Scott PletcherSenior DBACommented:
An IDENTITY column will add overhead for no real value.  Again, I recommend against it.

SQL Server will handle the duplicate key values itself.  You just specify the logical key you want:

-- uncomment "ONLINE = ON," if you are on Enterprise+ Edition:

CREATE CLUSTERED INDEX web_site_styles__CL ON dbo.web_site_styles ( ws_id ) WITH ( FILLFACTOR = 99, /* ONLINE = ON,*/ SORT_IN_TEMPDB = ON ) ON [PRIMARY]
"An IDENTITY column will add overhead for no real value.  Again, I recommend against it."

How do you explain then the general recommendation to HAVE a unique PK on evry table?
Scott PletcherConnect With a Mentor Senior DBACommented:
As a general recommendation.  In this specific case, IDENTITY is not necessary and, as it always does, obviously causes INSERTs to be forcibly serialized/synchronous.

I 100% guarantee that performance will be vastly better for this table with "ws_id" as the clustering key vs. a dopey IDENITY column.

If you really want identity as PK here, fine, add it, but only as a nonclustered index (or just don't index it at all; there's no danger to that if you never back up the identity value).

As I've stated many times before, the single biggest myth about performance is that an identity should be the "default" clustering key.  Wrong -- the clustering key is the most critical aspect of table performance, and thus should never be "defaulted", but instead based on careful analysis of that specific table's needs.

Since in this case ws_id itself is sequential, it's absolutely ridiculous to add an artificial identity as the clustering key.

The "general rule" overweights INSERT vs SELECT activity too.  INSERT occurs only once; generally rows are SELECTed 10 times or more after being INSERTed.  Especially given the capability of online rebuilds, it makes far sense to pick the best clustering key for overall performance rather than fixating on the INSERT overhead.
Scott, I don't argue but a PK has to be unique. In this case there is no unique value unless you make a full columns composite key, which actually doesn't guarantee uniqueness.
Scott PletcherSenior DBACommented:
So what?  A table's not required to have a PK.  Yes, relational theory "demands" a PK, but this is the real world; a manufactured PK is not worth the overhead here.  Why harm performance that much just to be able to say that "the table has a PK"?
How an identity PK harms the performance?
paddycobbettAuthor Commented:
Whilst I have stated that I am not a database expert, I am a software expert, and for me logically, I never introduce relationships which have no logical relationship. I totally support having tables with unique IDs when the table describes whole entities which need to be referenced, however this table actually holds grouped elements, which are logically associated with a "whole" which has a PK, which is the web_site_style table (as opposed to web_site_styles table), which also has the ws_id column however in this case it is unique. I would be surprised to find out that adding additional columns with no logical relationships would help, since I would never need to join on that column. I might for example say:

select * from web_site_style ws
join web_site_styles wss on (ws.ws_id=wss.ws_id)
on ws.ws_id=123

Surely I don't need a PK on the web_site_styles table in this case? I am not in my realm so can't be 100% sure of anything!
ZberteocConnect With a Mentor Commented:
The PK on a column is not there for the relation purpose only but also to ensure that a row is uniquely defined.

What if you wanted to modify website_style or style_value column? Is the combination of the 3, including ws_id, unique? It should be but then every time you will have to do un update you will have to filter on all 3 columns and probably to create an index on all 3 as well, which it will be the PK in that case.

If you don't need/want a PK on that table then a nonclustered index on ws_id is all you need. I don't see what's the dilemma with performance here.
Scott PletcherSenior DBACommented:
>> logically, I never introduce relationships which have no logical relationship. I totally support ... unique IDs ... which need to be referenced <<

[Emphasis added.]  Exactly correct.

Too many people have absorbed this myth that basically all tables should have an identity, esp. if you can't assign a PK up front otherwise.  Again, that the most damaging myth in table design.

>> I don't see what's the dilemma with performance here. <<

1) Identity values must be assigned sequentially.  That forces all INSERTs to be done sequentially, preventing simultaneous INSERTs into the table.

2) All processing a computer does has overhead, including generating and assigning a unique # to every row.
paddycobbettAuthor Commented:
Zberteoc, you make a good point that if I needed to update individual web_site_style entries then it would be great to update against a primary key, however I don't. The table will have a bunch of entries added together with the same ws_id, or deleted together referenced by the same ws_id. The way I see it is that is holds a collection of elements with no particular order. Thanks a lot for all the comments to my question, has been interesting!
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.