naexpert

asked on

Index fragmentation

We have a table with a number of indexes, including a primary key (int). The table rows are being updated a lot. Why would the primary key become fragmented? It is currently at 61% fragmentation and has been as high as 98%. I can understand why the other indexes would become fragmented, but why the PK? Do you have any suggestions on how the fragmentation can be minimised? We need the indexes to speed up the read part of the process.
Any pointers would be most welcome.....
subhashpunia

The PK is fragmented because of page splitting. For more information, please check this article:

http://blog.sqlauthority.com/2010/01/12/sql-server-fragmentation-detect-fragmentation-and-eliminate-fragmentation/
EvilPostIt
It all very much depends on the data your primary key is composed of. For example, if you are using INT IDENTITY(1,1) as your primary key and the clustered index sits on top of it, the index is less likely to fragment. If you were using a GUID instead, new records could potentially be inserted anywhere in the table, which would cause page splits and fragmentation, unless you are using NEWSEQUENTIALID().

Could you post the create scripts for your table and indexes? That will make it easier to determine why they are fragmenting.

Thanks
naexpert (ASKER)

Hi,

Here are the create scripts for the table. I have not included all the fields in the table.


SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[EMailValidatorResults](
      [id] [int] IDENTITY(1,1) NOT NULL,
      [BeenProcessed] [bit] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [BeingProcessed] [bit] NULL,
 CONSTRAINT [PK_EMailValidatorResults_id] PRIMARY KEY CLUSTERED
(
      [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]

GO



CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_BeenProcessed] ON [dbo].[EMailValidatorResults]
(
      [BeenProcessed] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO

USE [DQEmail]
GO

CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_BeingProcessed] ON [dbo].[EMailValidatorResults]
(
      [BeingProcessed] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO


CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_NextCheckTime] ON [dbo].[EMailValidatorResults]
(
      [NextCheckTimeUTC] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
GO


Which indexes are getting fragmented?
PK_EMailValidatorResults_id 61%
BeenProcessed 79%
BeingProcessed 74%
NextCheckTimeUTC 81%

OK, so when you say that a lot of the table rows are being updated, which data types are these updates taking place against, and how do they change?

Also how many records are in your table? If there are not many then this may not even be an issue.
Hi,

There are currently 1.2 million rows in the table.

PK_EMailValidatorResults_id is an INT - doesn't change.
BeenProcessed is a BIT - starts as 0 and changes to 1.
BeingProcessed is a BIT - starts as 0, changes to 1, then back to 0.
NextCheckTimeUTC is a DATETIME - gets updated, then nulled when processing on the row is completed.
Are there any shrink operations scheduled to happen on this database?
No, just a weekly backup of database.
and the database definitely isn't set to AUTO_SHRINK.
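For anyone wanting to double-check that setting, a minimal sketch (works on SQL Server 2005 and later):

```sql
-- Returns 1 if AUTO_SHRINK is enabled for the current database, 0 if not
SELECT DB_NAME() AS database_name,
       DATABASEPROPERTYEX(DB_NAME(), 'IsAutoShrink') AS is_auto_shrink;
```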
Could you run the attached query and post the results, please?

select index_type_desc, index_id, index_depth, avg_fragmentation_in_percent,
       fragment_count, avg_fragment_size_in_pages
from sys.dm_db_index_physical_stats(db_id(), null, null, null, null)
where object_name(object_id) = 'EMailValidatorResults'


Here are the results from your query.

index_type_desc,index_id,index_depth,avg_fragmentation_in_percent,fragment_count,avg_fragment_size_in_pages
CLUSTERED INDEX,1,4,2.31009794590235,23360,18.2603595890411
CLUSTERED INDEX,1,1,0,NULL,NULL
NONCLUSTERED INDEX,3,3,0.0656167979002625,68,22.4117647058824
NONCLUSTERED INDEX,4,3,1.31764705882353,84,25.297619047619
NONCLUSTERED INDEX,5,3,0,75,45.7066666666667
NONCLUSTERED INDEX,6,3,0,52,51.2692307692308
NONCLUSTERED INDEX,7,3,0.0340097494615123,221,39.9140271493213
NONCLUSTERED INDEX,8,3,0,5,987.2
NONCLUSTERED INDEX,9,3,19.3474264705882,856,5.08411214953271
NONCLUSTERED INDEX,10,3,0,3,888.666666666667
NONCLUSTERED INDEX,11,3,8.67383512544803,474,8.82911392405063
NONCLUSTERED INDEX,21,3,0.112443778110945,33,80.8484848484848
Only one of your indexes is fragmented more than 2%. You want to be looking at the avg_fragmentation_in_percent column to see how fragmented your indexes are. Is this where you got your fragmentation percentages from before?
I have run a test based upon the DB structure and data change rate you stated above, and have been unable to get these indexes to fragment more than 2-3%. I even inserted 1 million rows and updated 100k of them twice within a loop. You did mention that you had removed some of the other columns.

Unless those other columns are blank to begin with and then populated, or change width, I can't see any reason why a lot of fragmentation would happen.
Sorry for the misleading numbers. During our conversation I dropped the indexes and re-created them; that's why the percentages are so low. Prior to doing this they were as stated before:
PK_EMailValidatorResults_id 61%
BeenProcessed 79%
BeingProcessed 74%
NextCheckTimeUTC 81%
Please could you read my previous post and comment? I haven't been able to get these indexes to fragment. Do you mind telling me what data types the other columns are, and how those columns are updated?

Thanks
Why are you dropping and re-creating the indexes? The better way to do this (depending on fragmentation) would be to use ALTER INDEX [INDEX NAME or ALL] ON [TABLE NAME] REBUILD.
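Against the table posted above, that might look like the following (a sketch only; the FILLFACTOR value mirrors the 80 already used in the create scripts):

```sql
-- Rebuild all indexes on the table in one statement,
-- reapplying the 80% fill factor from the original definitions
ALTER INDEX ALL ON [dbo].[EMailValidatorResults]
REBUILD WITH (FILLFACTOR = 80);

-- Or target just the clustered primary key
ALTER INDEX [PK_EMailValidatorResults_id]
ON [dbo].[EMailValidatorResults] REBUILD;
```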
There is an "excellent" article

https://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_691-Managing-Fragmentation-for-the-Accidental-DBA.html

Those nonclustered indexes are going to be all over the place; they seem to be the type that are regularly updated and/or have very little uniqueness, so they might not even be doing much.

If those BIT columns are used in conjunction with other data, then you might consider adding more columns to those indexes...

So, why is the PK being fragmented ? Read the article :)
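To illustrate the "adding more to those indexes" idea, a hypothetical composite index. The choice of NextCheckTimeUTC as the second key column is an assumption for the example; in practice, pick whatever the real queries filter or sort on:

```sql
-- Hypothetical: widen the low-selectivity bit index so it is more useful.
-- Replace NextCheckTimeUTC with whichever column the queries actually use.
CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_BeenProcessed_NextCheck]
ON [dbo].[EMailValidatorResults] ([BeenProcessed] ASC, [NextCheckTimeUTC] ASC)
WITH (FILLFACTOR = 80);
```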
Hi 'Evil',

I don't normally drop the indexes. I did it this one time because I thought that dropping them before defragmenting the hard drive might be prudent. As it happens, the data hard drive was hardly fragmented, so that was okay.

There are 10 non-clustered indexes plus the clustered primary key, which is an int.

Here are the columns in the table.

      [id] [int] IDENTITY(1,1) NOT NULL,
      [extID] [nvarchar](250) NOT NULL,
      [EmA1] [nvarchar](318) NOT NULL,
      [BPro] [bit] NOT NULL,
      [NCorr] [bit] NOT NULL,
      [PSyn] [bit] NOT NULL,
      [DNaerver] [bit] NOT NULL,
      [HsMver] [bit] NOT NULL,
      [VBerver] [bit] NOT NULL,
      [Isal] [bit] NOT NULL,
      [CreationTimeUTC] [datetime] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [SName] [nvarchar](150) NULL,
      [SerReonses] [nvarchar](4000) NULL,
      [Resode] [int] NULL,
      [Resonse] [nvarchar](512) NULL,
      [Local] [nvarchar](64) NOT NULL,
      [Doin] [nvarchar](254) NOT NULL,
      [Catery] [nvarchar](64) NOT NULL,
      [List1] [bit] NOT NULL,
      [List2] [bit] NOT NULL,
      [DurationMSec] [int] NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [Source] [nvarchar](255) NOT NULL,
      [Analysis] [nvarchar](255) NULL,
      [ExcDetected] [bit] NOT NULL,
      [ExcDump] [nvarchar](4000) NULL,
      [RetCnt] [int] NOT NULL,
      [BeingProcessed] [bit] NULL,
      [ProcessingStartTimeUTC] [datetime] NULL,
      [Resolution] [nvarchar](255) NULL,
      [Title1] [nvarchar](100) NULL,
      [First1] [nvarchar](100) NULL,
      [Last1] [nvarchar](100) NULL,
      [Url] [nvarchar](250) NULL,
      [Index] [int] NULL,
      [IndexLength] [int] NULL,
      [NameCong1] [bit] NULL,
      [InitialCong] [bit] NULL,
      [NameCong2] [bit] NULL,
      [InitialCong] [bit] NULL,
      [Expansion] [bit] NULL,
      [Web] [bit] NULL,
      [Duplicate] [bit] NULL,
      [ResultCode] [int] NULL,
      [ResultCodeDesc] [nvarchar](250) NULL,
      [Entity] [nchar](2) NULL,
      [ResolutionCode] [int] NULL,
      [kTransactions] [int] NULL,
      [ResultScore] [int] NULL,
      [ClientCode] [nchar](4) NULL,
      [JobUid] [uniqueidentifier] NULL,
      [ECR] [nchar](32) NULL,
      [CorporateName] [nvarchar](250) NULL,
      [SICCode] [nvarchar](50) NULL,
      [IsOrg] [bit] NULL,
      [PK_Reference] [int] NULL,
      [IsMaster] [bit] NULL,
      [IsGeneric] [bit] NULL,
      [IndiscriminateResult] [bit] NULL,
      [Code1] [nchar](5) NULL,
      [CodeDesc1] [nvarchar](500) NULL,
      [IsSpap] [bit] NULL,

The process uses bog-standard UPDATE statements in the code.

Thanks so much for taking the time to look into this for me.
Ok I have a few questions now.

1) Is the clustered index built exactly as you stated above (just the id field)?
2) During the process of updating, after the initial insert, does the length of any of these nvarchar/varchar fields change?
3) How bothered about space are you......?
If the hard drive was fragmented, what did you do next?

Did you clean up the database - doing a full backup, reindex, set size - that type of thing?

Is it set to autogrow (of course it is), but is it continually trying to grow itself, or has the size of the database been established to minimize autogrow?

When was the last time you did a reindex?

Do you run regular maintenance plans - like a nightly backup and a weekly index check?

Do you have the time window to facilitate the above (or is it used 24x7, making life "difficult")?

Did you read my article yet?
Hi Mark,

Thanks for your post regarding your 'Excellent' article - very good reading. I can now see why setting the initial size of the database is so important. I must be missing something fundamental, though. If the identity column was created when the table was created, I would have thought that the clustered index would not become fragmented, but it does. Why is that?
Hi 'Evil'

1) is the clustered index built exactly how you stated above (just has the id field). Yes.
2) During the process of updating after the initial insert does the length of any of these nvarchar / varchar fields change? No.
3) How bothered about space are you......? Not too bothered.

Thanks again for your help.
ASKER CERTIFIED SOLUTION from EvilPostIt (solution text available to members only)
Just read your post responding to my questions. OK, well, that idea is out of the window then. Back to the drawing board.
Hi 'Evil'

I misunderstood the middle question; for some reason I was thinking of the field length, not the insertion length:

Your question was:- During the process of updating after the initial insert does the length of any of these nvarchar / varchar fields change? The answer is yes, as we are updating some nvarchar fields with a lot of data. So your previous explanation does make sense.

We really don't want to re-engineer the software processes that update this table.

Thanks again.
This isn't really a re-engineer; it's more of a tweak for performance, as all you are effectively changing is the way in which the data is stored at the page level.

VARCHAR(200) -> CHAR(200)
NVARCHAR(200) -> NCHAR(200)

Due to the nature of your application, the fragmentation will keep happening, and with higher volumes of data going through it may just keep getting worse.

That said, what you could do is rebuild the indexes regularly in a quiet period with, say, an 80% fill factor. This may help.
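A possible shape for that quiet-period job, as a sketch using the commonly quoted 5%/30% reorganize/rebuild thresholds (adjust for your workload):

```sql
-- Reorganize lightly fragmented indexes, rebuild heavily fragmented ones.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql +
    N'ALTER INDEX ' + QUOTENAME(i.name)
    + N' ON [dbo].[EMailValidatorResults] '
    + CASE WHEN s.avg_fragmentation_in_percent > 30
           THEN N'REBUILD WITH (FILLFACTOR = 80);'
           ELSE N'REORGANIZE;'
      END + CHAR(13) + CHAR(10)
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.EMailValidatorResults'),
         NULL, NULL, 'LIMITED') s
JOIN sys.indexes i
  ON i.object_id = s.object_id AND i.index_id = s.index_id
WHERE s.avg_fragmentation_in_percent > 5
  AND i.name IS NOT NULL;   -- skips any heap entry

EXEC sp_executesql @sql;
```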
Nice posts EvilPostIt, and a warm welcome to EE by the way :)

Just following on a bit... A clustered index does not guarantee that there will be no fragmentation. A rebuild will try to use contiguous pages and eliminate that fragmentation, but there are quite a few different scenarios at play...

EvilPostIt talks about large data: with a few bytes of overhead, those 1000 bytes might consume 1001 bytes, or there is a fill factor reserving some empty space, so instead of eight rows per page there are only seven, with gaps of empty space in between, available for use by almost anything else. So another small row comes along and fits neatly into that space. And so it goes...

It is very difficult when working with such potentially large rows. If they are initially fairly empty and then updated with some volume, a row no longer fits into its current allocation, so SQL Server puts a pointer to where the new data now lives. A clustered index keeps the data on the leaf nodes, so guess what: it is now fragmented because of a simple update.

The table is a big, long table with things like code1 and codedesc1 - so guessing it is used for some kind of data warehouse?

Are the columns really unicode, i.e. nvarchar?



Also, if you could please spare a moment to consider some of the previous questions, might be able to come up with some good strategies for you...
Thanks mark,

Wasn't sure if you were going to be the type to pick apart my post wherever possible, as I'm sure there are a few misunderstandings in there :) Thanks for not doing so.

With regards to fill factor, bear in mind that the fill factor is only maintained at index creation/rebuild time.
@EvilPostIt, enjoy seeing more experts who know what they are talking about... It is more about collaboration and helping the "asker". And the occasional point or three of course *laughing*
Hi Mark,

Here are the answers to your questions:-

If the hard drive was fragmented, what did you do next? I dropped all the indexes on the database/table in question, stopped all the SQL services, and then used Windows 2008's own defrag tool. As it happened, there was only 7% fragmentation, which I don't think is too bad. I carried out the defrag anyway, then restarted the SQL services and re-created all the indexes, clustered index first. I have also changed the recovery model from Full to Simple, thinking that the logs won't get so big.

Did you clean up the database - doing a full backup, reindex, set size - that type of thing? The last backup was Sunday just gone. I re-created all indexes with a fill factor of 80. Set size - do you mean the database size?

Is it set to autogrow, or has the size of the database been established to minimize autogrow? It was set to autogrow (file growth 10 MB and max file size unrestricted). I have since increased the file growth to 100 MB; not sure if this will help at all?

When was the last time you did a reindex? Before I dropped and re-created the indexes, they were rebuilt on Monday. We are having to do this every other day, as some of the indexes are running up to 80-90% fragmented.

Do you run regular maintenance plans - like a nightly backup and a weekly index check? I currently just have a full database backup plan in place, which runs every Sunday. I don't have any other maintenance going on, as we were doing that ad hoc during the week.

Do you have the time window to facilitate the above (or is it used 24x7, making life "difficult")? When we are running our processes, we want to keep any maintenance to a minimum.

Did you read my article yet? Yes, it was very good, thanks.
You may also want to have a look at partition alignment. This may speed up your system a bit if you have not already found it and if you are not using server 2008.

http://support.microsoft.com/kb/929491
Hi, we're using Windows Server 2008 R2 Enterprise and SQL Server 2008, but thanks for the KB anyway...
Cool cool.
Any ideas what you want to do regarding this then?
SOLUTION (text available to members only)
Hi Mark,

Here is the create script.


USE [Test]
GO

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[EMailValidatorResults](
      [id] [int] IDENTITY(1,1) NOT NULL,
      [extID] [nvarchar](250) NOT NULL,
      [EmailAddress] [nvarchar](318) NOT NULL,
      [BeenProcessed] [bit] NOT NULL,
      [InputsNeededCorrection] [bit] NOT NULL,
      [Syntax] [bit] NOT NULL,
      [NameServer] [bit] NOT NULL,
      [MailServer] [bit] NOT NULL,
      [MailServer1] [bit] NOT NULL,
      [IsR] [bit] NOT NULL,
      [CreationTimeUTC] [datetime] NOT NULL,
      [ModifiedTimeUTC] [datetime] NOT NULL,
      [ServerName] [nvarchar](150) NULL,
      [Responses] [nvarchar](4000) NULL,
      [ResponseCode] [int] NULL,
      [Response1] [nvarchar](512) NULL,
      [Lpart] [nvarchar](64) NOT NULL,
      [Dmn] [nvarchar](254) NOT NULL,
      [Ctgry] [nvarchar](64) NOT NULL,
      [List1] [bit] NOT NULL,
      [List2] [bit] NOT NULL,
      [DurationMSec] [int] NULL,
      [NextCheckTimeUTC] [datetime] NULL,
      [Source] [nvarchar](255) NOT NULL,
      [Analysis] [nvarchar](255) NULL,
      [ExDet] [bit] NOT NULL,
      [ExDmp] [nvarchar](4000) NULL,
      [RCnt] [int] NOT NULL,
      [BProc] [bit] NULL,
      [PrStartTimeUTC] [datetime] NULL,
      [Rtion] [nvarchar](255) NULL,
      [Title1] [nvarchar](100) NULL,
      [First1] [nvarchar](100) NULL,
      [Last1] [nvarchar](100) NULL,
      [Url] [nvarchar](250) NULL,
      [Index1] [int] NULL,
      [Index2] [int] NULL,
      [FNCon] [bit] NULL,
      [FICon] [bit] NULL,
      [LNCon] [bit] NULL,
      [LICon] [bit] NULL,
      [FIHyp] [bit] NULL,
      [FIIEx] [bit] NULL,
      [HWSR] [bit] NULL,
      [IDup] [bit] NULL,
      [RCode] [int] NULL,
      [RCodeDesc] [nvarchar](250) NULL,
      [EyType] [nchar](2) NULL,
      [RtCode] [int] NULL,
      [NkTrs] [int] NULL,
      [RtSc] [int] NULL,
      [CCo] [nchar](4) NULL,
      [JUi] [uniqueidentifier] NULL,
      [ECR] [nchar](32) NULL,
      [CteName] [nvarchar](250) NULL,
      [SCode] [nvarchar](50) NULL,
      [IsOrg] [bit] NULL,
      [Ref] [int] NULL,
      [IMas] [bit] NULL,
      [IsG] [bit] NULL,
      [IateResult] [bit] NULL,
      [PResponseCode] [nchar](5) NULL,
      [PResponseCodeDesc] [nvarchar](500) NULL,
      [Isdf1] [bit] NULL,
 CONSTRAINT [PK_EMailValidatorResults_id] PRIMARY KEY CLUSTERED
(
      [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = ON, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 80) ON [PRIMARY]
) ON [PRIMARY]

GO

Hi Mark,

I ran your scripts, and all the rows that appear in the second set of results appear in the first set.

Out of the 11 indexes:
3 have 0 user_seeks. Does this mean I can delete these indexes, as they are not being used?
The other 8 indexes all have user_seeks against them, and user_scans are all 0.
The PK has 252,759 user_lookups but 42,041,201 user_seeks.

Hope this makes some sense to you?
With regard to the 0-user-seek indexes, do they have 0 against their other columns too, i.e. scans etc.?

If those are all 0 as well, then these indexes do not seem to be used at all.

With regards to your primary key, that looks good.

Earlier in the discussion you mentioned that there are 3 distinct steps in a record's lifespan in this database:

Insert, then update during processing, then update when finished (or that may just be how I have interpreted it).

Could you post the WHERE clauses for these statements? We may be able to work out some better indexes for you.
Hi,

2 of the indexes have 0 seeks, scans and lookups. The 3rd index has just 5 scans. I'm guessing I can remove these 3 indexes?

Glad about the primary key!

The process we use is:-

Insert new rows of data. Most of the fields start off as either NULL or 0.
Process the rows through our internal application, updating bit fields with 0 or 1 and nvarchar fields with data of differing lengths.
The bit fields can switch from 1 to 0 and 0 to 1 again as the row goes through the processes.
The updates are in code; I'll ask the developer if he can provide me with the WHERE clauses.

Thanks so much for sticking with this.
I've deleted the 3 indexes that were not being used.
No problem sticking with it; it's always interesting to sort out issues, as it keeps your brain in check...
I was having a think about the original insert statements and had a bit of a brainwave. You said that upon insert a lot of these columns are set to NULL.

As long as these NULLs are not used in a WHERE clause and the data isn't updated by concatenating the original value with a new one, what you could do is work out the average length of these varchar and nvarchar columns and then set a default value of blank spaces x average length. This would effectively pad the field out, ready for data to be updated at a later date, and would minimize pages expanding and page splitting.

I know it's probably one of my less conventional ideas, but it might be worth a go. You could also test this in another environment by backing up the database, restoring it, and replaying a trace.

Let me know if you think this is a good direction.

Just for my interest, are you a DBA?
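For what it's worth, the padding idea might look something like this. It is hypothetical: the column name and the 200-character width are made up for illustration, and ANSI_PADDING must be ON for the trailing blanks to be kept:

```sql
-- Pad the column to roughly its average final length at insert time,
-- so later updates can overwrite in place instead of splitting pages.
ALTER TABLE [dbo].[EMailValidatorResults]
ADD CONSTRAINT [DF_EMailValidatorResults_Responses]
    DEFAULT (REPLICATE(N' ', 200)) FOR [Responses];
```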
Hi 'Evil' (I'm sure you're not!)

I'm a 'jack of all trades': IT admin, tech support, and basic DBA work (backing up, restoring, etc.) for a very small software company.

Cheers.
So an accidental DBA then ;)
Hi 'Evil'

Your suggestion might be worth thinking about; I'll discuss it with the developer and see if it's viable.
Yes, absolutely, an accidental DBA, nice title!
I would test it thoroughly though, to make sure it doesn't break stuff (a nice technical term, I know).
I use that technical term quite a lot! Thanks for your help so far...
G'day, I'mmmm Baaaack (hmmm, that doesn't look the way it sounded)

Was hoping to see the indexes before dropping any of them :)

Please run :
select object_name(i.object_id) as table_name
, i.name as index_name
, i.type_desc
, case when is_unique = 1 then 'Yes' else 'No' end as unique_key
, case when is_primary_key = 1 then 'Yes' else 'No' end as primary_key
, c.key_ordinal
, a.name as column_name
, type_name(a.user_type_id)+case when type_name(a.user_type_id) like '%char%' then '('+convert(varchar,a.max_length)+')'
                                 when type_name(a.user_type_id) = 'decimal' then '('+convert(varchar,a.precision)+','+convert(varchar,isnull(a.scale,0))+')'
                                 else '' end as column_type
, case when is_identity = 1 then 'Yes' else 'No' end as identity_column
, case when is_included_column = 1 then 'Yes' else 'No' end as included_column
from sys.indexes i
inner join sys.index_columns c on i.object_id = c.object_id and i.index_id = c.index_id
inner join sys.columns a on c.object_id = a.object_id and c.column_id = a.column_id
inner join sys.objects o on i.object_id = o.object_id and o.type_desc like '%USER%'
where object_name(i.object_id) = 'EMailValidatorResults'
order by 1,2,3,4,5,6


Now, the index stats: we normally use the 0s as an indicator to ask a few more questions - like "have all normal processes happened on this table?" There might be a periodic report (for example) that has yet to run and would hit some of those other indexes. But no major problem; if worst comes to worst, just recreate the indexes.

OK, let's also check collations while we are here...

select serverproperty('collation')
SELECT   TABLE_NAME as tablename
       , COLUMN_NAME as columnname
       , serverproperty('collation') as server_collation
       , case when databasepropertyex(table_catalog, 'CollationName') = serverproperty('collation') then ''
              else databasepropertyex(table_catalog, 'CollationName')
         end  as database_collation
       , case when COLLATION_NAME = databasepropertyex(table_catalog, 'CollationName') then ''
              when COLLATION_NAME = serverproperty('collation') then ''
              else COLLATION_NAME
         end  as column_collation
FROM information_schema.columns
WHERE table_name = 'EMailValidatorResults'
and serverproperty('collation') not in (databasepropertyex(table_catalog, 'CollationName'), collation_name)

One more question... do columns have to be NVARCHAR() or NCHAR() ?  Do you have a unicode environment ?
Here are the results from the 1st query
table_name¬index_name¬type_desc¬unique_key¬primary_key¬key_ordinal¬column_name¬column_type¬identity_column¬included_column
EMailValidatorResults¬IX_EMailValidatorResults_BeenProcessed¬NONCLUSTERED¬No¬No¬1¬BeenProcessed¬bit¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_BeingProcessed¬NONCLUSTERED¬No¬No¬1¬BeingProcessed¬bit¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_ClientCode¬NONCLUSTERED¬No¬No¬1¬ClientCode¬nchar(8)¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_EntityType¬NONCLUSTERED¬No¬No¬1¬EntityType¬nchar(4)¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_JobUID¬NONCLUSTERED¬No¬No¬1¬JobUid¬uniqueidentifier¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_NextCheckTime¬NONCLUSTERED¬No¬No¬1¬NextCheckTimeUTC¬datetime¬No¬No
EMailValidatorResults¬IX_EMailValidatorResults_ResultCode¬NONCLUSTERED¬No¬No¬1¬ResultCode¬int¬No¬No
EMailValidatorResults¬PK_EMailValidatorResults_id¬CLUSTERED¬Yes¬Yes¬1¬id¬int¬Yes¬No

Results from 2nd query
Latin1_General_CI_AS

Regarding Unicode, we are operating with unicode data and moving forward, we will be using more of it.

I have set up a reorganise-index job on the server that runs once a day; this seems to be working nicely, keeping the fragmentation down to a reasonable level.

Shall we wrap this up and put it down to a learning experience - for me, that is! I'm happy to split the points between the two of you, as you have both been so helpful. What do you think?
I'm happy with that. BTW, let me know what you decide to go with.
Good - windows collation - important for unicode.

I recommend looking into IX_EMailValidatorResults_BeenProcessed and IX_EMailValidatorResults_BeingProcessed - maybe check the queries that use those two and find some additional predicates used in those queries to help uniqueness.

Maybe resultcode as well.

The character based indexes are already NCHAR() which is a good thing.

And do the backup each night too... Normally three maintenance plans: full backup, transaction log backup (if needed), and maintenance (like indexing).

And yep, can call it quits if you are reasonably happy with the answers you have received so far...
Let's not get started on Maintenance Plans vs Custom Jobs... ;)
Oh, and keep those statistics updated with frequently changing data... (but that is a different question now - more to do with performance *laughing*)
I've split the points because both 'Evil' and Mark were so helpful. I've learnt some new tricks during this process so thanks again you two.
Oh yeah, nearly forgot: have a look into filtered indexes; this may also help you...

http://technet.microsoft.com/en-us/library/cc280372.aspx
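As a sketch, a filtered index for this table might target just the unprocessed rows. The key column here is a guess; use whatever the polling query actually orders or filters by:

```sql
-- Index only the rows still waiting to be processed - usually a small
-- fraction of the 1.2M-row table, so cheaper to maintain and scan.
CREATE NONCLUSTERED INDEX [IX_EMailValidatorResults_Pending]
ON [dbo].[EMailValidatorResults] ([NextCheckTimeUTC] ASC)
WHERE [BeenProcessed] = 0;
```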
Thanks 'evil', could be very useful, cheers, naexpert.
Had to leave a comment - I just read this whole thing and have to say I'm blown away that you two helped him so much without ever getting your hands on the server. As a fellow "Accidental DBA" I KNOW there are many tweaks our company needs to make to our SQL environment, and I wish you two were sitting here to help. Kudos to both of you for helping a stranger out like this.