Solved

Index fragmentation

Posted on 2010-11-08
845 Views
Last Modified: 2012-05-10
We have a table with a number of indexes, including a primary key (int). The table rows are being updated a lot. Why would the primary key become fragmented? It's running at 61% and has been as high as 98% fragmentation. I can understand why the other indexes would become fragmented, but why the PK? Do you have any suggestions on how the fragmentation can be minimised? We need the indexes to speed up the read part of the process.
Any pointers would be most welcome.....
Question by:naexpert
61 Comments
 
LVL 6

Expert Comment

by:subhashpunia
PK is fragmented because of page splitting. For more information, please check this article:

http://blog.sqlauthority.com/2010/01/12/sql-server-fragmentation-detect-fragmentation-and-eliminate-fragmentation/
 
LVL 16

Expert Comment

by:EvilPostIt
It all very much depends on the data your primary key is composed of. For example, if you are using INT IDENTITY(1,1) as your primary key and the clustered index is on top of it, the index is unlikely to fragment, whereas if you were using a GUID then new records could potentially be inserted anywhere in the table, which would cause page splits and fragmentation - unless you are using NEWSEQUENTIALID().

Could you post the create scripts for your table / indexes? It will be easier to determine why they are fragmenting.

Thanks
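The contrast above can be sketched as two alternative key designs (the table and column names below are illustrative, not from the poster's schema):

```sql
-- Ever-increasing key: new rows always land at the end of the
-- clustered index, so inserts cause little fragmentation.
CREATE TABLE dbo.DemoIdentity (
    id int IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    payload nvarchar(100) NULL
);

-- GUID key: NEWID() values land anywhere in the index and force
-- page splits; NEWSEQUENTIALID() avoids this by generating
-- ever-increasing GUIDs (it is only usable in a DEFAULT constraint).
CREATE TABLE dbo.DemoGuid (
    id uniqueidentifier NOT NULL
        CONSTRAINT DF_DemoGuid_id DEFAULT NEWSEQUENTIALID()
        PRIMARY KEY CLUSTERED,
    payload nvarchar(100) NULL
);
```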
 

Author Comment

by:naexpert
Hi,

Here are the create scripts for the table. I have not included all the fields in the table.


SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[EMailValidatorResults](
      [id] [int] IDENTITY(1,1) NOT NULL,
      [BeenProcessed] [bit] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [BeingProcessed] [bit] NULL,
 CONSTRAINT [PK_EMailValidatorResults_id] PRIMARY KEY CLUSTERED
(
      [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]

GO



CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_BeenProcessed] ON [dbo].[EMailValidatorResults]
(
      [BeenProcessed] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO

USE [DQEmail]
GO

CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_BeingProcessed] ON [dbo].[EMailValidatorResults]
(
      [BeingProcessed] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO


CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_NextCheckTime] ON [dbo].[EMailValidatorResults]
(
      [NextCheckTimeUTC] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO


 
LVL 16

Expert Comment

by:EvilPostIt
Which indexes are getting fragmented?
 

Author Comment

by:naexpert
PK_EMailValidatorResults_id 61%
BeenProcessed 79%
BeingProcessed 74%
NextCheckTimeUTC 81%
 
LVL 16

Expert Comment

by:EvilPostIt
OK, so when you say that a lot of table rows are being updated, which data types are these updates taking place against, and how do the values change?

Also, how many records are in your table? If there are not many, this may not even be an issue.
 

Author Comment

by:naexpert
Hi,

There are currently 1.2 million rows in the table.

PK_EMailValidatorResults_id is an INT - doesn't change.
BeenProcessed is a BIT - starts as 0 and changes to 1.
BeingProcessed is a BIT - starts as 0, changes to 1, and then back to 0.
NextCheckTimeUTC is a DATETIME - gets updated, and is then set to NULL when processing of the row is completed.
 
LVL 16

Expert Comment

by:EvilPostIt
Are there any shrink operations scheduled to happen on this database?
 

Author Comment

by:naexpert
No, just a weekly backup of the database.
 
LVL 16

Expert Comment

by:EvilPostIt
And the database definitely isn't set to AUTO_SHRINK?
 
LVL 16

Expert Comment

by:EvilPostIt
Could you run the attached query and post the results please?

select index_type_desc, index_id, index_depth, avg_fragmentation_in_percent, fragment_count, avg_fragment_size_in_pages
from sys.dm_db_index_physical_stats(db_id(), null, null, null, null)
where object_name(object_id) = 'EMailValidatorResults'
 

Author Comment

by:naexpert
Here are the results from your query.

index_type_desc,index_id,index_depth,avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages
CLUSTERED INDEX,1,4,2.31009794590235,23360,18.2603595890411
CLUSTERED INDEX,1,1,0,NULL,NULL
NONCLUSTERED INDEX,3,3,0.0656167979002625,68,22.4117647058824
NONCLUSTERED INDEX,4,3,1.31764705882353,84,25.297619047619
NONCLUSTERED INDEX,5,3,0,75,45.7066666666667
NONCLUSTERED INDEX,6,3,0,52,51.2692307692308
NONCLUSTERED INDEX,7,3,0.0340097494615123,221,39.9140271493213
NONCLUSTERED INDEX,8,3,0,5,987.2
NONCLUSTERED INDEX,9,3,19.3474264705882,856,5.08411214953271
NONCLUSTERED INDEX,10,3,0,3,888.666666666667
NONCLUSTERED INDEX,11,3,8.67383512544803,474,8.82911392405063
NONCLUSTERED INDEX,21,3,0.112443778110945,33,80.8484848484848
 
LVL 16

Expert Comment

by:EvilPostIt
Only one of your indexes is fragmented more than 2%. You want to be looking at the avg_fragmentation_in_percent column to see how fragmented your indexes are. Is this where you got your fragmentation percentages from before?
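A variant of the DMV query above that flags which indexes might warrant attention; the 5% / 30% cut-offs are the commonly cited guidance for reorganize vs. rebuild, not numbers from this thread:

```sql
-- For each index on the table, report fragmentation and a suggested action.
-- 'LIMITED' scan mode reads only the parent-level pages, so it is cheap.
select object_name(s.object_id) as table_name,
       i.name as index_name,
       s.avg_fragmentation_in_percent,
       case when s.avg_fragmentation_in_percent >= 30 then 'REBUILD'
            when s.avg_fragmentation_in_percent >= 5  then 'REORGANIZE'
            else 'OK' end as suggested_action
from sys.dm_db_index_physical_stats(db_id(), null, null, null, 'LIMITED') s
inner join sys.indexes i
        on i.object_id = s.object_id and i.index_id = s.index_id
where object_name(s.object_id) = 'EMailValidatorResults';
```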
 
LVL 16

Expert Comment

by:EvilPostIt
I have run a test of this based upon your db structure and data change rate stated above, and have been unable to get these indexes to fragment more than 2%-3%. I even inserted 1 million rows and updated 100k of them twice within a loop. You did mention that you had removed some of the other columns.

Unless these other columns are blank to begin with and then populated, or change width, I can't see any reason why a lot of fragmentation would happen.
 

Author Comment

by:naexpert
Sorry for the misleading numbers - between our conversation I dropped the indexes and re-created them, which is why the percentages are so low. Prior to doing this they were as stated before:
PK_EMailValidatorResults_id 61%
BeenProcessed 79%
BeingProcessed 74%
NextCheckTimeUTC 81%
 
LVL 16

Expert Comment

by:EvilPostIt
Please could you read my previous post and comment? I haven't been able to get these indexes to fragment. Do you mind telling me what data types the other columns are, and how those columns are updated?

Thanks
 
LVL 16

Expert Comment

by:EvilPostIt
Why are you dropping and re-creating the indexes? The better way to do this (depending on fragmentation) would be to use ALTER INDEX [INDEX NAME or ALL] ON [TABLE NAME] REBUILD.
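A sketch of the rebuild syntax being suggested, using the table and index names from the scripts above (the fill factor value is illustrative):

```sql
-- Rebuild every index on the table in one statement,
-- re-applying an 80% fill factor to leave room for updates.
ALTER INDEX ALL ON dbo.EMailValidatorResults
REBUILD WITH (FILLFACTOR = 80);

-- Or rebuild just the clustered primary key:
ALTER INDEX PK_EMailValidatorResults_id ON dbo.EMailValidatorResults
REBUILD WITH (FILLFACTOR = 80);
```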
 
LVL 51

Expert Comment

by:Mark Wills
There is an "excellent" article:

http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_691-Managing-Fragmentation-for-the-Accidental-DBA.html

Those nonclustered indexes are going to be all over the place; they seem to be the type that are regularly updated, and/or have very little uniqueness, so they might not even be doing much.

If those BIT columns are used in conjunction with other data, then you might consider adding more columns to those indexes...

So, why is the PK being fragmented? Read the article :)
 

Author Comment

by:naexpert
Hi 'Evil',

I don't normally drop the indexes. I did it this one time because I thought that dropping them before defragging the hard drive might be prudent. Anyway, as it happens, the data hard drive was hardly fragmented, so that was okay.

There are 10 non-clustered indexes plus the clustered primary key, which is an int.

Here are the columns in the table.

      [id] [int] IDENTITY(1,1) NOT NULL,
      [extID] [nvarchar](250) NOT NULL,
      [EmA1] [nvarchar](318) NOT NULL,
      [BPro] [bit] NOT NULL,
      [NCorr] [bit] NOT NULL,
      [PSyn] [bit] NOT NULL,
      [DNaerver] [bit] NOT NULL,
      [HsMver] [bit] NOT NULL,
      [VBerver] [bit] NOT NULL,
      [Isal] [bit] NOT NULL,
      [CreationTimeUTC] [datetime] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [SName] [nvarchar](150) NULL,
      [SerReonses] [nvarchar](4000) NULL,
      [Resode] [int] NULL,
      [Resonse] [nvarchar](512) NULL,
      [Local] [nvarchar](64) NOT NULL,
      [Doin] [nvarchar](254) NOT NULL,
      [Catery] [nvarchar](64) NOT NULL,
      [List1] [bit] NOT NULL,
      [List2] [bit] NOT NULL,
      [DurationMSec] [int] NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [Source] [nvarchar](255) NOT NULL,
      [Analysis] [nvarchar](255) NULL,
      [ExcDetected] [bit] NOT NULL,
      [ExcDump] [nvarchar](4000) NULL,
      [RetCnt] [int] NOT NULL,
      [BeingProcessed] [bit] NULL,
      [ProcessingStartTimeUTC] [datetime] NULL,
      [Resolution] [nvarchar](255) NULL,
      [Title1] [nvarchar](100) NULL,
      [First1] [nvarchar](100) NULL,
      [Last1] [nvarchar](100) NULL,
      [Url] [nvarchar](250) NULL,
      [Index] [int] NULL,
      [IndexLength] [int] NULL,
      [NameCong1] [bit] NULL,
      [InitialCong] [bit] NULL,
      [NameCong2] [bit] NULL,
      [InitialCong2] [bit] NULL,
      [Expansion] [bit] NULL,
      [Web] [bit] NULL,
      [Duplicate] [bit] NULL,
      [ResultCode] [int] NULL,
      [ResultCodeDesc] [nvarchar](250) NULL,
      [Entity] [nchar](2) NULL,
      [ResolutionCode] [int] NULL,
      [kTransactions] [int] NULL,
      [ResultScore] [int] NULL,
      [ClientCode] [nchar](4) NULL,
      [JobUid] [uniqueidentifier] NULL,
      [ECR] [nchar](32) NULL,
      [CorporateName] [nvarchar](250) NULL,
      [SICCode] [nvarchar](50) NULL,
      [IsOrg] [bit] NULL,
      [PK_Reference] [int] NULL,
      [IsMaster] [bit] NULL,
      [IsGeneric] [bit] NULL,
      [IndiscriminateResult] [bit] NULL,
      [Code1] [nchar](5) NULL,
      [CodeDesc1] [nvarchar](500) NULL,
      [IsSpap] [bit] NULL,

The process uses bog-standard update statements in the code.

Thanks so much for taking the time to look into this for me.
 
LVL 16

Expert Comment

by:EvilPostIt
OK, I have a few questions now.

1) Is the clustered index built exactly as you stated above (just the id field)?
2) During the process of updating, after the initial insert, does the length of any of these nvarchar / varchar fields change?
3) How bothered about space are you......?
 
LVL 51

Expert Comment

by:Mark Wills
If the hard drive was fragmented, what did you do next?

Did you clean up the database - doing a full backup, reindex, set size - that type of thing?

Is it set to autogrow (of course it is), but is it continually trying to grow itself, or has the size of the database been established to minimise autogrow?

When was the last time you did a reindex?

Do you run regular maintenance plans - like a nightly backup and a weekly index check?

Do you have the time window to facilitate the above (or is it used 24x7, making life "difficult")?

Did you read my article yet?
 

Author Comment

by:naexpert
Hi Mark,

Thanks for your post regarding your 'excellent' article - very good reading. I can now see why setting the initial size of the database is so important. I must be missing something fundamental, though. If the identity column was created when the table was created, I would have thought that the clustered index would not become fragmented, but it does. Why is that?
 

Author Comment

by:naexpert
Hi 'Evil',

1) Is the clustered index built exactly as you stated above (just the id field)? Yes.
2) During the process of updating, after the initial insert, does the length of any of these nvarchar / varchar fields change? No.
3) How bothered about space are you......? Not too bothered.

Thanks again for your help.
 
LVL 16

Accepted Solution

by:
EvilPostIt earned 250 total points
Here is what I think is happening with your table.

As the nonclustered indexes don't have a huge amount of uniqueness, they will be getting fragmented very easily. BIT columns only have 2 possible values and therefore will be shifting about every time they change. The datetime one less so, although from what you have said it is left NULL in the end (which I assume is the case for most rows after being processed). Nonclustered indexes are effectively separate from the table itself, and therefore the only thing that can affect their fragmentation is a change to the ordering of the data you have indexed (in your case, the individual BIT columns).

The clustered index is the one I was finding a bit strange. That was until you gave me the other columns. Here is some background on the SQL Server storage engine.....

SQL Server stores data in pages. These pages are 8KB in size (thus the 8K-ish row limit, unless you are using blobs). When you insert data, these pages are populated with the row data. The way that clustered indexes work is that the index applies an order to the table's actual data, which means that a clustered index is the table itself, ordered in the manner you have specified. This is different from nonclustered indexes, as above.

So let's say, for example, each of your rows has the potential to be 4000 bytes (a rough estimate from glancing at your create table statement), but the actual data upon insertion is 1000 bytes.

That means you will have roughly 8 rows per page. Now let's say that you update one of these rows, add additional data to the nvarchar / varchar columns, and make one of the records 2000 bytes. This means you are now trying to fit 9000 bytes into an 8192-byte page. Obviously this is not possible, and what happens now is called a page split. Your nice pretty single page with the data all aligned nicely has to be split into 2 pages, with the data divided between them. This is where the fragmentation happens, as the new pages are allocated by the storage engine (if you are interested in how this happens, look up SQL Server PFS pages in Google).

I think you can guess why I was asking about changes to the NVARCHAR / VARCHAR columns now.

The reason I was asking "How bothered about space are you?" is as follows. If you were to change your NVARCHAR / VARCHAR columns to NCHAR / CHAR, this would overcome the issue. The reason is that when CHAR / NCHAR data is inserted, the maximum size for the column is allocated in the row upon insertion. Therefore it does not matter if the length of the data changes, as the width of the row within the page will not change - it has already been allocated. But, as with every solution, there is a downside. The downside here is that you will be taking up more space with blank data, i.e. a CHAR(1000) holding the value 'HELLO' would mean there are 995 characters of empty padding. Although this then goes back to database design and using the correct data type lengths.

Phew!!!!

Hope this helps.
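The fixed-width trade-off described above can be sketched side by side (the table and column names are illustrative, not from the poster's schema):

```sql
-- Variable-width: the row grows on UPDATE, which can force page splits
-- once the page no longer has room for the widened row.
CREATE TABLE dbo.DemoVariable (
    id int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    payload nvarchar(1000) NULL
);

-- Fixed-width: NCHAR(1000) is allocated in full at INSERT time, so a
-- later UPDATE never widens the row - at the cost of padded space.
CREATE TABLE dbo.DemoFixed (
    id int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    payload nchar(1000) NULL
);
```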
 
LVL 16

Expert Comment

by:EvilPostIt
Just read your post responding to my questions. OK, well, that idea is out of the window then. Back to the drawing board.
 

Author Comment

by:naexpert
Hi 'Evil',

I misunderstood the middle question; for some reason I was thinking of the field length, not the insertion length.

Your question was: during the process of updating, after the initial insert, does the length of any of these nvarchar / varchar fields change? The answer is yes, as we are updating some nvarchar fields with a lot of data. So your previous explanation does make sense.

We really don't want to re-engineer the software processes that update this table.

Thanks again.
 
LVL 16

Expert Comment

by:EvilPostIt
This isn't really a re-engineer, it's more of a performance tweak, as all you are effectively changing is the way the data is stored at the page level.

VARCHAR(200) -> CHAR(200)
NVARCHAR(200) -> NCHAR(200)

Due to the nature of your application, the fragmentation will keep happening, and with higher volumes of data going through it may just keep getting worse.

That said, what you could do is rebuild the index regularly in a quiet period with, say, an 80% fill factor. That may help.
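A sketch of the data type change being suggested. The column picked here (Analysis, from the create script) is purely illustrative; test on a copy first, since ALTER COLUMN on a wide, busy table rewrites rows and takes a schema-modification lock:

```sql
-- Change a variable-width column to fixed-width so updates
-- no longer widen the row (was nvarchar(255)).
ALTER TABLE dbo.EMailValidatorResults
ALTER COLUMN Analysis nchar(255) NULL;
```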
 
LVL 51

Expert Comment

by:Mark Wills
Nice posts EvilPostIt, and a warm welcome to EE, by the way :)

Just following on a bit... A clustered index does not guarantee that there will be no fragmentation. A rebuild will try to use contiguous pages and eliminate fragmentation, but there are quite a few different scenarios at play...

EvilPostIt talks about large data. With a few bytes of overhead, those 1000 bytes might consume 1001 bytes, or there is a fill factor reserving some empty space, so instead of 8 rows per page there are only seven, with gaps of empty space in between, available for use by almost anything else. So another small row comes along and fits neatly into that space. And so it goes...

It is very difficult when working with such potentially large rows. If they are initially kind of empty and then updated with some volume, a row may no longer fit into its current allocation, so a pointer is left to where the new data now lives. A clustered index keeps the data on the leaf nodes, so guess what - it is now fragmented because of a simple update.

The table is a big long table with things like code1 and codedesc1 - so I'm guessing it is used for some kind of data warehouse?

Are the columns really Unicode, i.e. nvarchar?
 
LVL 51

Expert Comment

by:Mark Wills
Also, if you could please spare a moment to consider some of the previous questions, we might be able to come up with some good strategies for you...
 
LVL 16

Expert Comment

by:EvilPostIt
Thanks Mark,

Wasn't sure if you were going to be the type to pick apart my post wherever possible, as I'm sure there are a few misunderstandings there :) Thanks for not doing so.

 
LVL 16

Expert Comment

by:EvilPostIt
With regard to fill factor, bear in mind that it is only applied at index creation / rebuild time.
 
LVL 51

Expert Comment

by:Mark Wills
@EvilPostIt - I enjoy seeing more experts who know what they are talking about... It is more about collaboration and helping the "asker". And the occasional point or three, of course *laughing*
 

Author Comment

by:naexpert
Hi Mark,

Here are the answers to your questions:

If the hard drive was fragmented, what did you do next? I dropped all the indexes on the database/table in question, stopped all the SQL services, then used Windows 2008's own defrag tool. As it happened, there was 7% fragmentation, which I don't think is too bad. I carried out the defrag anyway, restarted the SQL services, then re-created all the indexes, the clustered index being the first one created. I have also changed the recovery model from Full to Simple, thinking that the logs won't get so big.

Did you clean up the database - doing a full backup, reindex, set size - that type of thing? The last backup was Sunday just gone. I re-created all indexes with a fill factor of 80. Set size? Do you mean the database size?

Is it set to autogrow, or has the size of the database been established to minimise autogrow? It was set to autogrow (file growth 10MB and max file size unrestricted). I have since increased the file growth to 100MB - not sure if this will help at all?

When was the last time you did a reindex? Before I dropped and re-created the indexes, they were rebuilt on Monday. We are having to do this every other day, as some of the indexes are running up to 80-90% fragmented.

Do you run regular maintenance plans - like a nightly backup and a weekly index check? I currently just have a full database backup plan in place, which runs every Sunday. I don't have any other maintenance going on, as we were doing that ad hoc during the week.

Do you have the time window to facilitate the above? When we are running our processes we want to keep any maintenance to a minimum.

Did you read my article yet? Yes, it was very good, thanks.
 
LVL 16

Expert Comment

by:EvilPostIt
You may also want to have a look at partition alignment. This may speed up your system a bit, if you have not already come across it and you are not using Server 2008.

http://support.microsoft.com/kb/929491
 

Author Comment

by:naexpert
Hi, we're using Windows Server 2008 R2 Enterprise and SQL 2008, but thanks for the KB anyway...
 
LVL 16

Expert Comment

by:EvilPostIt
Cool cool.
 
LVL 16

Expert Comment

by:EvilPostIt
Any ideas what you want to do regarding this, then?
 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 250 total points
(Thanks for reading - and I just got notification that you voted too - much appreciated.)

Yep, "set size" means the database size. If we know how much the database grows, then ideally we set the size of the database (to reserve the disk space) to accommodate that growth. It doesn't matter too much if that is slightly more generous to begin with, but it does take some monitoring, and you need to look at it over a complete "cycle", i.e. everything that gets added, up to the time you run a purge or archiving routine.

Once you have set the size, you then normally defragment the disk so that "big disk file" is in a clean state.

Then you can work on the internals. That means reindexing everything, not just the one table; you want it to be as clean as possible, then try to keep it that way.

Having volatile data subject to updates and changes is always going to be difficult with some of those indexes and fragmentation - it could well be the nature of the beast... How did you decide what those indexes needed to be?

Things like a single BIT as an index might not be achieving everything you think. It can be a bit deceptive (pardon the pun), because if there is insufficient variation of data within the index, the query optimizer might well determine that there is no point using the index and default to a scan anyway.

Sometimes having more indexes than can reasonably be used does more harm than good. So we really should look at those indexes. Then we also need to ask whether fragmentation hurts (i.e. lots of seeks and not many scans).

Seeing as you have restarted the machine recently and rebuilt the indexes, it would be interesting to see the usage so far of some of those indexes:

select object_name(s.object_id), i.name, s.*
from sys.dm_db_index_usage_stats s
inner join sys.indexes i on s.object_id = i.object_id and i.index_id = s.index_id
where object_name(s.object_id) = 'EMailValidatorResults'

select object_name(s.object_id), i.name, s.*
from sys.dm_db_index_operational_stats (db_id(), 0, NULL, 0) s
inner join sys.indexes i on s.object_id = i.object_id and i.index_id = s.index_id
where object_name(s.object_id) = 'EMailValidatorResults'

Are there entries in the second one that don't appear in the first one? If yes, then those indexes are not actually being used. Maybe they have not undergone all the queries yet... but it warrants closer inspection.

If they are in the first one, then what are the user_seeks, scans and lookups like? Are the indexes actually being used? Are they all scans (in which case fragmentation can hurt)?

In terms of data types, changing to compatible data types typically won't hurt your processes at all - updates still happen, inserts still happen, and the processes are none the wiser...

But there are some interesting things that can happen with Unicode data (i.e. nvarchar() and nchar()): for a start, it takes twice the space. Secondly, if comparing it to varchar data, the varchar data might have to be converted to Unicode first. It really does become fairly important to understand collations and Unicode data - and hopefully there is no SQL collation to be seen. If you are linking Unicode to a non-Unicode lookup, it can hurt performance considerably, especially if that lookup uses a SQL collation (hence the new recommendations to only use Windows collations).

Hmmm... Would you mind doing a "create" script? In SSMS, right-click on the table, Script Table As..., then save the results and post them here?
 

Author Comment

by:naexpert
Hi Mark,

Here is the create script.


USE [Test]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[EMailValidatorResults](
      [id] [int] IDENTITY(1,1) NOT NULL,
      [extID] [nvarchar](250) NOT NULL,
      [EmailAddress] [nvarchar](318) NOT NULL,
      [BeenProcessed] [bit] NOT NULL,
      [InputsNeededCorrection] [bit] NOT NULL,
      [Syntax] [bit] NOT NULL,
      [NameServer] [bit] NOT NULL,
      [MailServer] [bit] NOT NULL,
      [MailServer1] [bit] NOT NULL,
      [IsR] [bit] NOT NULL,
      [CreationTimeUTC] [datetime] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [ServerName] [nvarchar](150) NULL,
      [Responses] [nvarchar](4000) NULL,
      [ResponseCode] [int] NULL,
      [Response1] [nvarchar](512) NULL,
      [Lpart] [nvarchar](64) NOT NULL,
      [Dmn] [nvarchar](254) NOT NULL,
      [Ctgry] [nvarchar](64) NOT NULL,
      [List1] [bit] NOT NULL,
      [List2] [bit] NOT NULL,
      [DurationMSec] [int] NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [Source] [nvarchar](255) NOT NULL,
      [Analysis] [nvarchar](255) NULL,
      [ExDet] [bit] NOT NULL,
      [ExDmp] [nvarchar](4000) NULL,
      [RCnt] [int] NOT NULL,
      [BProc] [bit] NULL,
      [PrStartTimeUTC] [datetime] NULL,
      [Rtion] [nvarchar](255) NULL,
      [Title1] [nvarchar](100) NULL,
      [First1] [nvarchar](100) NULL,
      [Last1] [nvarchar](100) NULL,
      [Url] [nvarchar](250) NULL,
      [Index1] [int] NULL,
      [Index2] [int] NULL,
      [FNCon] [bit] NULL,
      [FICon] [bit] NULL,
      [LNCon] [bit] NULL,
      [LICon] [bit] NULL,
      [FIHyp] [bit] NULL,
      [FIIEx] [bit] NULL,
      [HWSR] [bit] NULL,
      [IDup] [bit] NULL,
      [RCode] [int] NULL,
      [RCodeDesc] [nvarchar](250) NULL,
      [EyType] [nchar](2) NULL,
      [RtCode] [int] NULL,
      [NkTrs] [int] NULL,
      [RtSc] [int] NULL,
      [CCo] [nchar](4) NULL,
      [JUi] [uniqueidentifier] NULL,
      [ECR] [nchar](32) NULL,
      [CteName] [nvarchar](250) NULL,
      [SCode] [nvarchar](50) NULL,
      [IsOrg] [bit] NULL,
      [Ref] [int] NULL,
      [IMas] [bit] NULL,
      [IsG] [bit] NULL,
      [IateResult] [bit] NULL,
      [PResponseCode] [nchar](5) NULL,
      [PResponseCodeDesc] [nvarchar](500) NULL,
      [Isdf1] [bit] NULL,
 CONSTRAINT [PK_EMailValidatorResults_id] PRIMARY KEY CLUSTERED
(
      [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]

GO

 

Author Comment

by:naexpert
Hi Mark,

I ran your scripts, and all the rows that appear in the second set of results also appear in the first set.

Out of the 11 indexes:
3 have 0 user_seeks. Does this mean that I can delete these indexes, as they are not being used?
The other 8 indexes all have user_seeks against them, and user_scans are all 0.
The PK has 252,759 user_lookups and 42,041,201 user_seeks.

Hope this makes some sense to you?
 
LVL 16

Expert Comment

by:EvilPostIt
With regard to the 0-user-seek indexes, do they have 0 against their other columns too, i.e. scans etc.?

If so, then these indexes do not seem to be used at all.

With regards to your primary key, that looks good.

Earlier in the discussion you mentioned that there are 3 distinct steps in a record's lifespan in this database:

Insert, then update during processing, then update when finished (or that may just be how I have interpreted it).

Could you post the where clauses for these statements? We may be able to work out some better indexes for you.
 

Author Comment

by:naexpert
Hi,

2 of the indexes have 0 in seeks, scans and lookups. The 3rd index just has 5 scans. I'm guessing that I can remove these 3 indexes?

Glad about the primary key!

The process we use is:

Insert new rows of data. Most fields start off as either NULL or 0.
Process the rows through our internal application, updating bit fields with 0 or 1 and updating nvarchar fields with data of differing lengths.
The bit fields can switch from 1 to 0 and 0 to 1 again as the row is going through the processes.
The updates are in code; I'll ask the developer if he can provide me with the where clauses.

Thanks so much for sticking with this.
 

Author Comment

by:naexpert
I've deleted the 3 indexes that were not being used.
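For reference, an unused nonclustered index can be dropped like this. The index name here is one from the scripts earlier in the thread, purely as an example; confirm against your own usage stats before dropping anything:

```sql
-- Remove a nonclustered index that the usage DMVs show is never used.
DROP INDEX IX_EMailValidatorResults_BeenProcessed
ON dbo.EMailValidatorResults;
```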
 
LVL 16

Expert Comment

by:EvilPostIt
No probs for sticking with it - it's always interesting to sort out issues, as it keeps your brain in check.....
 
LVL 16

Expert Comment

by:EvilPostIt
I was having a think about the original insert statements and I had a bit of a brainwave. You said that upon insert a lot of these columns are set to NULL.

As long as these NULLs are not used in a where clause, and the data isn't updated by concatenating the original value with a new one, what you could do is work out the average length of these varchar & nvarchar columns and then set a default value of blank spaces x average length. This would effectively pad the field out, ready for data to be updated at a later date, and would minimise the pages expanding and page splitting.

I know it's probably one of my less conventional ideas, but it might be worth a go. You could also test this in another environment by backing up the database, restoring it, and replaying a trace.

Let me know if you think this is a good direction.

Just for my interest, are you a DBA?
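A sketch of the padding-default idea above. The column (Analysis) and the 200-character length are illustrative only; whether this helps depends on how the application writes the column, and it assumes ANSI_PADDING ON so the trailing spaces are retained:

```sql
-- Pre-allocate ~200 characters of space at INSERT time, so the later
-- UPDATE overwrites in place instead of widening the row.
ALTER TABLE dbo.EMailValidatorResults
ADD CONSTRAINT DF_EMailValidatorResults_Analysis
DEFAULT (REPLICATE(N' ', 200)) FOR Analysis;
```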
 

Author Comment

by:naexpert
Hi 'Evil' (I'm sure you're not!)

I'm a 'jack of all trades': I carry out IT admin, tech support, and basic DBA work, i.e. backing up, restoring etc., for a very small software company.

Cheers.
 
LVL 16

Expert Comment

by:EvilPostIt
So an accidental DBA then ;)
 

Author Comment

by:naexpert
Hi 'Evil',

Your suggestion might be worth thinking about; I'll discuss it with the developer and see if it's viable.
 

Author Comment

by:naexpert
Yes, absolutely - an accidental DBA, nice title!
 
LVL 16

Expert Comment

by:EvilPostIt
I would test it thoroughly though, to make sure it doesn't break stuff (a nice technical term, I know).
 

Author Comment

by:naexpert
I use that technical term quite a lot! Thanks for your help so far...
 
LVL 51

Expert Comment

by:Mark Wills
G'day - I'mmmm baaaack (hmmm, that doesn't look the way it sounded).

I was hoping to see the indexes before you dropped any of them :)

Please run:
select object_name(i.object_id) as table_name
, i.name as index_name
, i.type_desc
, case when is_unique = 1 then 'Yes' else 'No' end as unique_key
, case when is_primary_key = 1 then 'Yes' else 'No' end as primary_key
, c.key_ordinal
, a.name as column_name
, type_name(a.user_type_id)+case when type_name(a.user_type_id) like '%char%' then '('+convert(varchar,a.max_length)+')'
                                 when type_name(a.user_type_id) = 'decimal' then '('+convert(varchar,a.precision)+','+convert(varchar,isnull(a.scale,0))+')'
                                 else '' end as column_type
, case when is_identity = 1 then 'Yes' else 'No' end as identity_column
, case when is_included_column = 1 then 'Yes' else 'No' end as included_column
from sys.indexes i
inner join sys.index_columns c on i.object_id = c.object_id and i.index_id = c.index_id
inner join sys.columns a on c.object_id = a.object_id and c.column_id = a.column_id
inner join sys.objects o on i.object_id = o.object_id and o.type_desc like '%USER%'
where object_name(i.object_id) = 'EMailValidatorResults'
order by 1,2,3,4,5,6


Now the index stats - we normally use the 0's as an indicator to ask a few more questions, like "have all normal processes happened on this table?" There might be a periodical report (for example) that has yet to run and might hit some of those other indexes. But no major problem; if worst comes to worst, then just recreate the indexes.

OK, let's also check collations while we are here...

select serverproperty('collation')
SELECT   TABLE_NAME as tablename
       , COLUMN_NAME as columnname
       , serverproperty('collation') as server_collation
       , case when databasepropertyex(table_catalog, 'CollationName') = serverproperty('collation') then ''
              else databasepropertyex(table_catalog, 'CollationName')
         end  as database_collation
       , case when COLLATION_NAME = databasepropertyex(table_catalog, 'CollationName') then ''
              when COLLATION_NAME = serverproperty('collation') then ''
              else COLLATION_NAME
         end  as column_collation
FROM information_schema.columns
WHERE table_name = 'EMailValidatorResults'
and serverproperty('collation') not in (databasepropertyex(table_catalog, 'CollationName'), collation_name)

One more question... do columns have to be NVARCHAR() or NCHAR() ?  Do you have a unicode environment ?
0
 

Author Comment

by:naexpert
Comment Utility
Here are the results from the 1st query
table_name¬index_name¬type_desc¬unique_key¬primary_key¬key_ordinal¬column_name¬column_type¬identity_column¬included_column
EMailValidatorResults¬IX_EMailValidatorResults_BeenProcessed¬NONCLUSTERED¬No¬No¬1¬BeenProcessed¬bit¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_BeingProcessed¬NONCLUSTERED¬No¬No¬1¬BeingProcessed¬bit¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_ClientCode¬NONCLUSTERED¬No¬No¬1¬ClientCode¬nchar(8)¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_EntityType¬NONCLUSTERED¬No¬No¬1¬EntityType¬nchar(4)¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_JobUID¬NONCLUSTERED¬No¬No¬1¬JobUid¬uniqueidentifier¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_NextCheckTime¬NONCLUSTERED¬No¬No¬1¬NextCheckTimeUTC¬datetime¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_ResultCode¬NONCLUSTERED¬No¬No¬1¬ResultCode¬int¬No¬No
EMailValidatorResults¬PK_EMailValidatorResults_id¬CLUSTERED¬Yes¬Yes¬1¬id¬int¬Yes¬No

Results from 2nd query
Latin1_General_CI_AS

Regarding Unicode, we are operating with unicode data and moving forward, we will be using more of it.

I have set up a reorganise index job on the server that runs once a day, this seems to be working nicely keeping the fragmentation down to a reasonable level.
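In case it helps anyone reading later, the core of the job is just something along these lines (run per table, or loop over the fragmented ones):

```sql
-- Reorganise all indexes on the hot table. REORGANIZE is an online
-- operation, so it can run daily without taking long blocking locks
-- the way a full REBUILD can.
ALTER INDEX ALL ON dbo.EMailValidatorResults REORGANIZE;
```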

Shall we wrap this up and put it down to a learning experience, for me that is! I'm happy to split the points between the both of you as you have both been so helpful, what do you think?
0
 
LVL 16

Expert Comment

by:EvilPostIt
Comment Utility
I'm happy with that. BTW let me know what you decide to go with.
0
 
LVL 51

Expert Comment

by:Mark Wills
Comment Utility
Good - Windows collation - important for Unicode.

Recommend looking into IX_EMailValidatorResults_BeenProcessed and IX_EMailValidatorResults_BeingProcessed - check the queries that use those two and find some additional predicates used in those queries to improve selectivity (a single bit column only has two values, so on its own it makes a poor index key).

Maybe ResultCode as well.
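Something along these lines - the second key column here is just an assumption, pick whatever predicate those queries actually filter on:

```sql
-- Sketch only: widen the single-bit index with a more selective column
-- that appears alongside it in the WHERE clauses. DROP_EXISTING replaces
-- the current index of the same name in one operation.
CREATE NONCLUSTERED INDEX IX_EMailValidatorResults_BeenProcessed
    ON dbo.EMailValidatorResults (BeenProcessed, NextCheckTimeUTC)
    WITH (DROP_EXISTING = ON);
```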

The character based indexes are already NCHAR() which is a good thing.

And do the backup each night too.... normally three maintenance plans - full backup, trans log backup (if needed), and maintenance (like indexing).

And yep, can call it quits if you are reasonably happy with the answers you have received so far...
0
 
LVL 16

Expert Comment

by:EvilPostIt
Comment Utility
Let's not get started on Maintenance Plans vs Custom Jobs.... ;)
0
 
LVL 51

Expert Comment

by:Mark Wills
Comment Utility
Oh, and keep those statistics updated with frequently changing data... (but that is a different question now - more to do with performance *laughing*)
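For example, a scheduled:

```sql
-- Refresh statistics on the hot table so the optimiser keeps accurate
-- row-count estimates as the data churns. FULLSCAN reads every row;
-- drop it for a sampled (cheaper) update on very large tables.
UPDATE STATISTICS dbo.EMailValidatorResults WITH FULLSCAN;
```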
0
 

Author Closing Comment

by:naexpert
Comment Utility
I've split the points because both 'Evil' and Mark were so helpful. I've learnt some new tricks during this process so thanks again you two.
0
 
LVL 16

Expert Comment

by:EvilPostIt
Comment Utility
Oh yeah, nearly forgot - have a look into filtered indexes, these may also help you.....

http://technet.microsoft.com/en-us/library/cc280372.aspx
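For instance - just a sketch (needs SQL Server 2008+, the index name is made up, and you'd want to test it against your actual query plans):

```sql
-- Filtered index covering only the rows still awaiting processing.
-- It stays small and cheap to maintain because rows drop out of it
-- as soon as BeenProcessed flips to 1.
CREATE NONCLUSTERED INDEX IX_EMailValidatorResults_Unprocessed
    ON dbo.EMailValidatorResults (NextCheckTimeUTC)
    WHERE BeenProcessed = 0;
```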
0
 

Author Comment

by:naexpert
Comment Utility
Thanks 'evil', could be very useful, cheers, naexpert.
0
 

Expert Comment

by:MrVault
Comment Utility
Had to leave a comment - I just read this whole thing and have to say I'm blown away that you two helped him so much without ever getting your hands on the server. As a fellow "Accidental DBA" I KNOW there are many tweaks our company needs to make to our SQL environment and I wish you two were sitting here to help. Kudos to both of you for helping a stranger out like this.
0
