[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

how does sql insert data into indexes?

Posted on 2011-03-25
15
Medium Priority
?
267 Views
Last Modified: 2012-05-11
We have a high IO database with a table that has a clustered index and 3 non clustered indexes.

when data is inserted into the table (clustered index), does it try to write the columns for each non clustered index at the same time or sequentially? can you point to an article talking about this?

reason I ask is we had a 6 disk raid10 holding all 3 non clustered indexes that had an average disk second per transfer of 120 ms. we separated the set into 3 raid1 sets (2 disks per) and put one non clustered index on each set. the ADs/T for each drive is around 40 ms. I'm wondering if the whole transaction that updates all 3 indexes if it still takes 120ms to complete or if it's done in parallel.

yes I know 40ms is a bad ADs/T to get. Considering we were in the 200's originally, getting down to 40 is peachy for us, though of course we're looking to improve.
0
Comment
Question by:MrVault
  • 6
  • 4
  • 2
  • +1
13 Comments
 
LVL 40

Accepted Solution

by:
lcohan earned 2000 total points
ID: 35218982
I guess the articles below explains in detail how they function:

http://www.sqlteam.com/article/sql-server-indexes-the-basics
http://www.sqlskills.com/BLOGS/KIMBERLY/category/Clustering-Key.aspx

As far as performance goes here's what I did for our 24/7 SQL db about 1.7TB in size - split all NONCLUSTERED indexes on separate filegoup/physical file, and changed indexes on very large tables that had the clustered column in it to the INCLUDE. This way we notice inserts/maintenance and IO much better and the queires suffered minimal and not noticeable impact.

For your second quetion I think they are done in parralell so is 40 in total best case scenario not 40+40+40
0
 

Author Comment

by:MrVault
ID: 35232136
I'm not sure I understand what you mean abou tthe "include" thing. How did you measure file IO without each filegroup file being on it's own separate spindle?

any chance you know of a document that goes over the parallel topic?
0
 
LVL 40

Expert Comment

by:lcohan
ID: 35233394
Assume Customers table is CLUSTERED on StoreId so when you create an index you can do it like:

CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid,StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

--OR like below and I'd choose second option:
CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid)
INCLUDE (StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

Of course the most accurate is to have each file on its own spindle however you can measure IO by using Windos Perfmon - Logical Disk counters.

Especialy we have the benefit of using SQL 2008 for index parallel operations please see:
http://msdn.microsoft.com/en-us/library/ms191292.aspx
0
Nothing ever in the clear!

This technical paper will help you implement VMware’s VM encryption as well as implement Veeam encryption which together will achieve the nothing ever in the clear goal. If a bad guy steals VMs, backups or traffic they get nothing.

 

Author Comment

by:MrVault
ID: 35233796
with logical disk counters that would assume we partitioned the same volume to separate ones for each file though right?
0
 
LVL 40

Expert Comment

by:lcohan
ID: 35234365
Right - and of course it is not as acurate as having a phisical file on its own RAID/Spindle but you could stil somwhat get an idea of the IO if each file is placed in its own logical even though the logicals are on same raid.
0
 

Author Comment

by:MrVault
ID: 35300241
thanks. doesn't that article though talk about actions against the same index in parallel? I'm trying to see if a key is in multiple indexes that you delete/update/add/remove if it does the action to all indexes at once. maybe I'm misreading it?
0
 

Author Comment

by:MrVault
ID: 36279509
just wondering if you have an asnwer to this last question. If there are 4 non clustered indexes on one table, each with varying columns but each all having one of their columns being the same (say StoreID for example) and an operation is done where each index would need to be udpated (add/delete etc), will it try to do those operations sequentially or in parallel? Meaning update index1 first, then 2, 3, and 4, or if they're all on separate disks update them all at the same time.
0
 
LVL 47

Expert Comment

by:David
ID: 37021573
Technically it must be sequentially because of journaling, but real-world, this can happen so quickly (especially with write-back caching) that it can seem it is in parallel.
0
 

Author Comment

by:MrVault
ID: 37021597
Can you explain the journaling piece?
0
 
LVL 71

Expert Comment

by:Qlemo
ID: 37022911
I don't agree that journaling is a reason for changing the indexes sequentially. There is no reason to serialize the update of non-clustered indexes here. Whenever a clustered index is to be updated, that one needs to be done first, since the other indexes contain pointers to that index. If only non-clustered indexes are to be changed, that can be well done in parallel. Whether it is done in parallel or not is unknown by me, but I assume the former.
0
 
LVL 40

Expert Comment

by:lcohan
ID: 37025061
In my opinion you should never split your CLUSTERED index from the table DATA and you should always create them on the same FILEGROUP where your table is located.
NONCLUSTERED indexes - different story. I'll try find Microsoft article to support my first statement.
0
 
LVL 71

Expert Comment

by:Qlemo
ID: 37025172
??? We are speaking of MSSQL server here, where a Clustered Index is automatically determining the physical storage structure of the table, since it is the table.
0
 

Author Comment

by:MrVault
ID: 37063598
But isn't the file where the table is located the main MDF file by default? So if you had several high IO tables, they would all compete. If we pull each clustered index out into it's own file and could put it on separate spindles, then we wouldn't have competing IO at the disk level.
0

Featured Post

Free recovery tool for Microsoft Active Directory

Veeam Explorer for Microsoft Active Directory provides fast and reliable object-level recovery for Active Directory from a single-pass, agentless backup or storage snapshot — without the need to restore an entire virtual machine or use third-party tools.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In part one, we reviewed the prerequisites required for installing SQL Server vNext. In this part we will explore how to install Microsoft's SQL Server on Ubuntu 16.04.
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Suggested Courses

830 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question