• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 269
  • Last Modified:

how does sql insert data into indexes?

We have a high IO database with a table that has a clustered index and 3 non clustered indexes.

when data is inserted into the table (clustered index), does it try to write the columns for each non clustered index at the same time or sequentially? can you point to an article talking about this?

reason I ask is we had a 6 disk raid10 holding all 3 non clustered indexes that had an average disk second per transfer of 120 ms. we separated the set into 3 raid1 sets (2 disks per) and put one non clustered index on each set. the ADs/T for each drive is around 40 ms. I'm wondering if the whole transaction that updates all 3 indexes if it still takes 120ms to complete or if it's done in parallel.

yes I know 40ms is a bad ADs/T to get. Considering we were in the 200's originally, getting down to 40 is peachy for us, though of course we're looking to improve.
0
MrVault
Asked:
MrVault
  • 6
  • 4
  • 2
  • +1
1 Solution
 
lcohanDatabase AnalystCommented:
I guess the articles below explains in detail how they function:

http://www.sqlteam.com/article/sql-server-indexes-the-basics
http://www.sqlskills.com/BLOGS/KIMBERLY/category/Clustering-Key.aspx

As far as performance goes here's what I did for our 24/7 SQL db about 1.7TB in size - split all NONCLUSTERED indexes on separate filegoup/physical file, and changed indexes on very large tables that had the clustered column in it to the INCLUDE. This way we notice inserts/maintenance and IO much better and the queires suffered minimal and not noticeable impact.

For your second quetion I think they are done in parralell so is 40 in total best case scenario not 40+40+40
0
 
MrVaultAuthor Commented:
I'm not sure I understand what you mean abou tthe "include" thing. How did you measure file IO without each filegroup file being on it's own separate spindle?

any chance you know of a document that goes over the parallel topic?
0
 
lcohanDatabase AnalystCommented:
Assume Customers table is CLUSTERED on StoreId so when you create an index you can do it like:

CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid,StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

--OR like below and I'd choose second option:
CREATE NONCLUSTERED INDEX [idx_Customers] ON [dbo].[Customers]
(Customerid)
INCLUDE (StoreId)
WITH (FILLFACTOR = 80, ONLINE=ON, SORT_IN_TEMPDB = ON, MAXDOP = 1)  ON [INDEX];

Of course the most accurate is to have each file on its own spindle however you can measure IO by using Windos Perfmon - Logical Disk counters.

Especialy we have the benefit of using SQL 2008 for index parallel operations please see:
http://msdn.microsoft.com/en-us/library/ms191292.aspx
0
What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

 
MrVaultAuthor Commented:
with logical disk counters that would assume we partitioned the same volume to separate ones for each file though right?
0
 
lcohanDatabase AnalystCommented:
Right - and of course it is not as acurate as having a phisical file on its own RAID/Spindle but you could stil somwhat get an idea of the IO if each file is placed in its own logical even though the logicals are on same raid.
0
 
MrVaultAuthor Commented:
thanks. doesn't that article though talk about actions against the same index in parallel? I'm trying to see if a key is in multiple indexes that you delete/update/add/remove if it does the action to all indexes at once. maybe I'm misreading it?
0
 
MrVaultAuthor Commented:
just wondering if you have an asnwer to this last question. If there are 4 non clustered indexes on one table, each with varying columns but each all having one of their columns being the same (say StoreID for example) and an operation is done where each index would need to be udpated (add/delete etc), will it try to do those operations sequentially or in parallel? Meaning update index1 first, then 2, 3, and 4, or if they're all on separate disks update them all at the same time.
0
 
DavidPresidentCommented:
Technically it must be sequentially because of journaling, but real-world, this can happen so quickly (especially with write-back caching) that it can seem it is in parallel.
0
 
MrVaultAuthor Commented:
Can you explain the journaling piece?
0
 
QlemoBatchelor, Developer and EE Topic AdvisorCommented:
I don't agree that journaling is a reason for changing the indexes sequentially. There is no reason to serialize the update of non-clustered indexes here. Whenever a clustered index is to be updated, that one needs to be done first, since the other indexes contain pointers to that index. If only non-clustered indexes are to be changed, that can be well done in parallel. Whether it is done in parallel or not is unknown by me, but I assume the former.
0
 
lcohanDatabase AnalystCommented:
In my opinion you should never split your CLUSTERED index from the table DATA and you should always create them on the same FILEGROUP where your table is located.
NONCLUSTERED indexes - different story. I'll try find Microsoft article to support my first statement.
0
 
QlemoBatchelor, Developer and EE Topic AdvisorCommented:
??? We are speaking of MSSQL server here, where a Clustered Index is automatically determining the physical storage structure of the table, since it is the table.
0
 
MrVaultAuthor Commented:
But isn't the file where the table is located the main MDF file by default? So if you had several high IO tables, they would all compete. If we pull each clustered index out into it's own file and could put it on separate spindles, then we wouldn't have competing IO at the disk level.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

What Kind of Coding Program is Right for You?

There are many ways to learn to code these days. From coding bootcamps like Flatiron School to online courses to totally free beginner resources. The best way to learn to code depends on many factors, but the most important one is you. See what course is best for you.

  • 6
  • 4
  • 2
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now