Solved

how does sql insert data into indexes?

Posted on 2011-03-25
15
216 Views
Last Modified: 2012-05-11
We have a high IO database with a table that has a clustered index and 3 non clustered indexes.

when data is inserted into the table (clustered index), does it try to write the columns for each non clustered index at the same time or sequentially? can you point to an article talking about this?

reason I ask is we had a 6 disk raid10 holding all 3 non clustered indexes that had an average disk second per transfer of 120 ms. we separated the set into 3 raid1 sets (2 disks per) and put one non clustered index on each set. the ADs/T for each drive is around 40 ms. I'm wondering if the whole transaction that updates all 3 indexes if it still takes 120ms to complete or if it's done in parallel.

yes I know 40ms is a bad ADs/T to get. Considering we were in the 200's originally, getting down to 40 is peachy for us, though of course we're looking to improve.
0
Comment
Question by:MrVault
  • 6
  • 4
  • 2
  • +1
15 Comments
 
LVL 39

Accepted Solution

by:
lcohan earned 500 total points
ID: 35218982
I guess the articles below explains in detail how they function:

http://www.sqlteam.com/article/sql-server-indexes-the-basics
http://www.sqlskills.com/BLOGS/KIMBERLY/category/Clustering-Key.aspx

As far as performance goes here's what I did for our 24/7 SQL db about 1.7TB in size - split all NONCLUSTERED indexes on separate filegoup/physical file, and changed indexes on very large tables that had the clustered column in it to the INCLUDE. This way we notice inserts/maintenance and IO much better and the queires suffered minimal and not noticeable impact.

For your second quetion I think they are done in parralell so is 40 in total best case scenario not 40+40+40
0
 

Author Comment

by:MrVault
ID: 35232136
I'm not sure I understand what you mean abou tthe "include" thing. How did you measure file IO without each filegroup file being on it's own separate spindle?

any chance you know of a document that goes over the parallel topic?
0
 
LVL 39

Expert Comment

by:lcohan
ID: 35233394
Assume Customers table is CLUSTERED on StoreId so when you create an index you can do it like:

CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid,StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

--OR like below and I'd choose second option:
CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid)
INCLUDE (StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

Of course the most accurate is to have each file on its own spindle however you can measure IO by using Windos Perfmon - Logical Disk counters.

Especialy we have the benefit of using SQL 2008 for index parallel operations please see:
http://msdn.microsoft.com/en-us/library/ms191292.aspx
0
 

Author Comment

by:MrVault
ID: 35233796
with logical disk counters that would assume we partitioned the same volume to separate ones for each file though right?
0
 
LVL 39

Expert Comment

by:lcohan
ID: 35234365
Right - and of course it is not as acurate as having a phisical file on its own RAID/Spindle but you could stil somwhat get an idea of the IO if each file is placed in its own logical even though the logicals are on same raid.
0
 

Author Comment

by:MrVault
ID: 35300241
thanks. doesn't that article though talk about actions against the same index in parallel? I'm trying to see if a key is in multiple indexes that you delete/update/add/remove if it does the action to all indexes at once. maybe I'm misreading it?
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 

Author Comment

by:MrVault
ID: 36279509
just wondering if you have an asnwer to this last question. If there are 4 non clustered indexes on one table, each with varying columns but each all having one of their columns being the same (say StoreID for example) and an operation is done where each index would need to be udpated (add/delete etc), will it try to do those operations sequentially or in parallel? Meaning update index1 first, then 2, 3, and 4, or if they're all on separate disks update them all at the same time.
0
 
LVL 47

Expert Comment

by:dlethe
ID: 37021573
Technically it must be sequentially because of journaling, but real-world, this can happen so quickly (especially with write-back caching) that it can seem it is in parallel.
0
 

Author Comment

by:MrVault
ID: 37021597
Can you explain the journaling piece?
0
 
LVL 68

Expert Comment

by:Qlemo
ID: 37022911
I don't agree that journaling is a reason for changing the indexes sequentially. There is no reason to serialize the update of non-clustered indexes here. Whenever a clustered index is to be updated, that one needs to be done first, since the other indexes contain pointers to that index. If only non-clustered indexes are to be changed, that can be well done in parallel. Whether it is done in parallel or not is unknown by me, but I assume the former.
0
 
LVL 39

Expert Comment

by:lcohan
ID: 37025061
In my opinion you should never split your CLUSTERED index from the table DATA and you should always create them on the same FILEGROUP where your table is located.
NONCLUSTERED indexes - different story. I'll try find Microsoft article to support my first statement.
0
 
LVL 68

Expert Comment

by:Qlemo
ID: 37025172
??? We are speaking of MSSQL server here, where a Clustered Index is automatically determining the physical storage structure of the table, since it is the table.
0
 

Author Comment

by:MrVault
ID: 37063598
But isn't the file where the table is located the main MDF file by default? So if you had several high IO tables, they would all compete. If we pull each clustered index out into it's own file and could put it on separate spindles, then we wouldn't have competing IO at the disk level.
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now