?
Solved

how does sql insert data into indexes?

Posted on 2011-03-25
15
Medium Priority
?
261 Views
Last Modified: 2012-05-11
We have a high IO database with a table that has a clustered index and 3 non clustered indexes.

when data is inserted into the table (clustered index), does it try to write the columns for each non clustered index at the same time or sequentially? can you point to an article talking about this?

reason I ask is we had a 6 disk raid10 holding all 3 non clustered indexes that had an average disk second per transfer of 120 ms. we separated the set into 3 raid1 sets (2 disks per) and put one non clustered index on each set. the ADs/T for each drive is around 40 ms. I'm wondering if the whole transaction that updates all 3 indexes if it still takes 120ms to complete or if it's done in parallel.

yes I know 40ms is a bad ADs/T to get. Considering we were in the 200's originally, getting down to 40 is peachy for us, though of course we're looking to improve.
0
Comment
Question by:MrVault
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 4
  • 2
  • +1
15 Comments
 
LVL 40

Accepted Solution

by:
lcohan earned 2000 total points
ID: 35218982
I guess the articles below explains in detail how they function:

http://www.sqlteam.com/article/sql-server-indexes-the-basics
http://www.sqlskills.com/BLOGS/KIMBERLY/category/Clustering-Key.aspx

As far as performance goes here's what I did for our 24/7 SQL db about 1.7TB in size - split all NONCLUSTERED indexes on separate filegoup/physical file, and changed indexes on very large tables that had the clustered column in it to the INCLUDE. This way we notice inserts/maintenance and IO much better and the queires suffered minimal and not noticeable impact.

For your second quetion I think they are done in parralell so is 40 in total best case scenario not 40+40+40
0
 

Author Comment

by:MrVault
ID: 35232136
I'm not sure I understand what you mean abou tthe "include" thing. How did you measure file IO without each filegroup file being on it's own separate spindle?

any chance you know of a document that goes over the parallel topic?
0
 
LVL 40

Expert Comment

by:lcohan
ID: 35233394
Assume Customers table is CLUSTERED on StoreId so when you create an index you can do it like:

CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid,StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

--OR like below and I'd choose second option:
CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid)
INCLUDE (StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

Of course the most accurate is to have each file on its own spindle however you can measure IO by using Windos Perfmon - Logical Disk counters.

Especialy we have the benefit of using SQL 2008 for index parallel operations please see:
http://msdn.microsoft.com/en-us/library/ms191292.aspx
0
Optimize your web performance

What's in the eBook?
- Full list of reasons for poor performance
- Ultimate measures to speed things up
- Primary web monitoring types
- KPIs you should be monitoring in order to increase your ROI

 

Author Comment

by:MrVault
ID: 35233796
with logical disk counters that would assume we partitioned the same volume to separate ones for each file though right?
0
 
LVL 40

Expert Comment

by:lcohan
ID: 35234365
Right - and of course it is not as acurate as having a phisical file on its own RAID/Spindle but you could stil somwhat get an idea of the IO if each file is placed in its own logical even though the logicals are on same raid.
0
 

Author Comment

by:MrVault
ID: 35300241
thanks. doesn't that article though talk about actions against the same index in parallel? I'm trying to see if a key is in multiple indexes that you delete/update/add/remove if it does the action to all indexes at once. maybe I'm misreading it?
0
 

Author Comment

by:MrVault
ID: 36279509
just wondering if you have an asnwer to this last question. If there are 4 non clustered indexes on one table, each with varying columns but each all having one of their columns being the same (say StoreID for example) and an operation is done where each index would need to be udpated (add/delete etc), will it try to do those operations sequentially or in parallel? Meaning update index1 first, then 2, 3, and 4, or if they're all on separate disks update them all at the same time.
0
 
LVL 47

Expert Comment

by:David
ID: 37021573
Technically it must be sequentially because of journaling, but real-world, this can happen so quickly (especially with write-back caching) that it can seem it is in parallel.
0
 

Author Comment

by:MrVault
ID: 37021597
Can you explain the journaling piece?
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 37022911
I don't agree that journaling is a reason for changing the indexes sequentially. There is no reason to serialize the update of non-clustered indexes here. Whenever a clustered index is to be updated, that one needs to be done first, since the other indexes contain pointers to that index. If only non-clustered indexes are to be changed, that can be well done in parallel. Whether it is done in parallel or not is unknown by me, but I assume the former.
0
 
LVL 40

Expert Comment

by:lcohan
ID: 37025061
In my opinion you should never split your CLUSTERED index from the table DATA and you should always create them on the same FILEGROUP where your table is located.
NONCLUSTERED indexes - different story. I'll try find Microsoft article to support my first statement.
0
 
LVL 70

Expert Comment

by:Qlemo
ID: 37025172
??? We are speaking of MSSQL server here, where a Clustered Index is automatically determining the physical storage structure of the table, since it is the table.
0
 

Author Comment

by:MrVault
ID: 37063598
But isn't the file where the table is located the main MDF file by default? So if you had several high IO tables, they would all compete. If we pull each clustered index out into it's own file and could put it on separate spindles, then we wouldn't have competing IO at the disk level.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have a large data set and a SSIS package. How can I load this file in multi threading?
In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.
Suggested Courses

770 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question