Link to home
Start Free TrialLog in
Avatar of paddycobbett
paddycobbett

asked on

What's the best way to optimise queries on a single table? SQL Server

Whilst I am an experienced in databases, and am not familiar with targeted optimization techniques. I will in due course need to learn(!) however I really only have one point which needs optimizing, and I'm guessing is the simplest of all cases, querying a single table. Here is the table structure:

CREATE TABLE [dbo].[web_site_styles](
      [ws_id] [int] NOT NULL,
      [website_style] [varchar](60) NULL,
      [style_value] [varchar](60) NULL
) ON [PRIMARY]

.. and I am querying it like this:

SELECT website_style, style_value WHERE ws_id=234

ws_id is NOT unique, and this query should return multiple rows (around 200). The table can contain several million rows, and is not updated frequently so think it is an ideal candidate for indexing? Can someone provide me with all the ways I can optimize the querying of this table. I am not only interested in database structure improvements, would stored procedures help here for example? I am literally calling the above SQL with a single parameter for the ws_id column.

Thanks in advance
Avatar of Big Monty
Big Monty
Flag of United States of America image

here's a good article covering query optimization that may shed some light on what you're trying to do:

http://technet.microsoft.com/en-us/magazine/2007.11.sqlquery.aspx
CREATE NONCLUSTERED INDEX IX_web_site_styles_ws_id
ON web_site_styles(ws_id)
INCLUDE (website_style, style_value)
While the ws_id is not unique it is recommendable to have a unique value as PK on your table I would suggest this:

-- create a new table with an identity column
select 
	identity(int, 1,1) as web_site_styles_id, 
	ws_id,  
	website_style, 
	style_value 
into [dbo].[web_site_styles_new]
from [dbo].[web_site_styles]
go

-- add a PK to the new table on the identity column
ALTER TABLE [web_site_styles_new]
ADD PRIMARY KEY (web_site_styles_id) 
GO

-- add a nonclustered index on ws_id
create nonclustered index ix_web_site_styles_ws_id on web_site_styles_new(ws_id) 
	include  (website_style, style_value)
GO

-- rename original table to _old
sp_rename web_site_styles, web_site_styles_old
GO

-- rename the new table to original table name
sp_rename web_site_styles_new, web_site_styles
GO

-- later if anything OK:
-- drop table web_site_styles_old

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of paddycobbett
paddycobbett

ASKER

Thanks for all the comments from everyone, just wanted some clarification on the last one. The ws_id will always be ascending, i.e any new records will be an increment of the last few records so (from i've been reading) it does make sense for it to be clustered since new rows should always physically reside after old rows. To repeat, ws_id is NOT unique, although their may be many rows with the same ws_id (added at the same time). I don't like the idea of having to ever rebuild the table, however does my circumstance suggest it won't ever be necessary? Even if I delete rows?

I don't like the idea of adding a unique column since it provides no logical relationship to anything else, however if it improves performance then I would consider it.. however your comment suggests it won't.

Given my comments, please reassure me that this is the way to go.. or if anyone else could suggest why it is not the best for my case.

Thanks a million again!
An IDENTITY column will add overhead for no real value.  Again, I recommend against it.

SQL Server will handle the duplicate key values itself.  You just specify the logical key you want:

-- uncomment "ONLINE = ON," if you are on Enterprise+ Edition:

CREATE CLUSTERED INDEX web_site_styles__CL ON dbo.web_site_styles ( ws_id ) WITH ( FILLFACTOR = 99, /* ONLINE = ON,*/ SORT_IN_TEMPDB = ON ) ON [PRIMARY]
"An IDENTITY column will add overhead for no real value.  Again, I recommend against it."

How do you explain then the general recommendation to HAVE a unique PK on evry table?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Scott, I don't argue but a PK has to be unique. In this case there is no unique value unless you make a full columns composite key, which actually doesn't guarantee uniqueness.
So what?  A table's not required to have a PK.  Yes, relational theory "demands" a PK, but this is the real world; a manufactured PK is not worth the overhead here.  Why harm performance that much just to be able to say that "the table has a PK"?
How an identity PK harms the performance?
Whilst I have stated that I am not a database expert, I am a software expert, and for me logically, I never introduce relationships which have no logical relationship. I totally support having tables with unique IDs when the table describes whole entities which need to be referenced, however this table actually holds grouped elements, which are logically associated with a "whole" which has a PK, which is the web_site_style table (as opposed to web_site_styles table), which also has the ws_id column however in this case it is unique. I would be surprised to find out that adding additional columns with no logical relationships would help, since I would never need to join on that column. I might for example say:

select * from web_site_style ws
join web_site_styles wss on (ws.ws_id=wss.ws_id)
on ws.ws_id=123

Surely I don't need a PK on the web_site_styles table in this case? I am not in my realm so can't be 100% sure of anything!
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
>> logically, I never introduce relationships which have no logical relationship. I totally support ... unique IDs ... which need to be referenced <<

[Emphasis added.]  Exactly correct.

Too many people have absorbed this myth that basically all tables should have an identity, esp. if you can't assign a PK up front otherwise.  Again, that the most damaging myth in table design.



>> I don't see what's the dilemma with performance here. <<

1) Identity values must be assigned sequentially.  That forces all INSERTs to be done sequentially, preventing simultaneous INSERTs into the table.

2) All processing a computer does has overhead, including generating and assigning a unique # to every row.
Zberteoc, you make a good point that if I needed to update individual web_site_style entries then it would be great to update against a primary key, however I don't. The table will have a bunch of entries added together with the same ws_id, or deleted together referenced by the same ws_id. The way I see it is that is holds a collection of elements with no particular order. Thanks a lot for all the comments to my question, has been interesting!