jay-are
asked on
Execute Task & Deadlock
Hi Experts,
I have one SSIS package for 7 clients. The package needs to run for all 7 at once. They all have an execute task that updates the same table. The update has a where clause with a CompanyID variable. So the task only updates the client it is being executed for. I actually update a few fields, but they are all based on CompanyID so I don't mix anything up.
This works fine when I run them one at a time. When I run all 7 at once I get frequent deadlocks. I've tried creating indexes based on the particular fields I'm updating. I'm not sure if that was the right choice. I've also tried changing the IsolationLevel for the entire package and the specific execute task that is causing the deadlock. I'm unable to find the right combination that will stop the deadlocks. Is this not possible?
How do I get this SSIS to complete without deadlocks?
Using: SQL Server 2008 R2
I have one SSIS package for 7 clients. The package needs to run for all 7 at once. They all have an execute task that updates the same table. The update has a where clause with a CompanyID variable. So the task only updates the client it is being executed for. I actually update a few fields, but they are all based on CompanyID so I don't mix anything up.
This works fine when I run them one at a time. When I run all 7 at once I get frequent deadlocks. I've tried creating indexes based on the particular fields I'm updating. I'm not sure if that was the right choice. I've also tried changing the IsolationLevel for the entire package and the specific execute task that is causing the deadlock. I'm unable to find the right combination that will stop the deadlocks. Is this not possible?
How do I get this SSIS to complete without deadlocks?
Using: SQL Server 2008 R2
Hey,
you can use UPDLock table hint to prevent deadlocks
see below link
http://www.sqlnotes.info/2012/10/10/update-with-updlock/
you can use UPDLock table hint to prevent deadlocks
see below link
http://www.sqlnotes.info/2012/10/10/update-with-updlock/
ASKER
Scott,
I created the index on my temp table based on what I have read. It didn't make any difference, I think it actually hurt performance. I definitely did not create a clustered index. I'm basically throwing changes at this package to see what works. This is my first time trying to execute a package with multiple clients simultaneously. I executed your query.
&
I doubt that will format correctly. Might be easier to copy/paste into excel. This loading table can have as little as 0 rows and as much as 100k rows. Depends on the client. CompanyID is just a way to identify that specific row with a client. There are 7 distinct clientIDs.
Just out of curiousity. The execute task that is causing the problem contains an expression for a few different update statements. Here:
Should I separate these into individual tasks?
I created the index on my temp table based on what I have read. It didn't make any difference, I think it actually hurt performance. I definitely did not create a clustered index. I'm basically throwing changes at this package to see what works. This is my first time trying to execute a package with multiple clients simultaneously. I executed your query.
capture_date Db_Name row_count Table_Name equality_columns inequality_columns included_columns user_seeks user_scans max_days_active unique_compiles last_user_seek last_user_scan avg_total_user_cost avg_user_impact system_seeks system_scans last_system_seek last_system_scan avg_total_system_cost avg_system_impact statement object_id index_handle
2014-07-30 10:38:21.693 CLOC 2127506 gldetailload NULL [TransactionAmount] NULL 1 0 1 1 2014-07-30 02:45:53.213 NULL 25.5610632031296 95.04 0 0 NULL NULL 0 0 [CLOC].[dbo].[gldetailload] 821577965 85731
2014-07-30 10:38:21.693 CLOC 2127506 gldetailload [LastModified] NULL NULL 2858 0 1 5 2014-07-30 02:45:00.073 NULL 24.9434831542073 99.73 0 0 NULL NULL 0 0 [CLOC].[dbo].[gldetailload] 821577965 22002
2014-07-30 10:38:21.693 CLOC 2127506 gldetailload [MADate] NULL NULL 2858 0 1 2652 2014-07-30 02:44:59.673 NULL 27.7363369582885 98.47 0 0 NULL NULL 0 0 [CLOC].[dbo].[gldetailload] 821577965 21998
2014-07-30 10:38:21.693 CLOC 2127506 gldetailload [uid] NULL NULL 2859 0 1 32 2014-07-30 02:45:01.473 NULL 31.0946669895849 98.68 0 0 NULL NULL 0 0 [CLOC].[dbo].[gldetailload] 821577965 22000
&
row_num db_name index_name uniq? table_name index_id row_count key_cols nonkey_cols user_seeks user_scans user_lookups user_updates last_user_seek last_user_scan last_user_lookup last_user_update fk_ref_count max_days_active filegroup_name system_seeks system_scans system_lookups system_updates last_system_seek last_system_scan last_system_lookup last_system_update
15 CLOC NULL N.N gldetailload 0 2127506 NULL NULL 0 11951 3191 34599 NULL 2014-07-30 02:46:42.590 2014-07-30 02:44:59.833 2014-07-30 02:45:53.547 NULL 1 PRIMARY 0 211 0 0 NULL 2014-07-30 02:45:53.363 NULL NULL
15 CLOC IX_gldetailload N.N gldetailload 13 2127506 CompanyNumber, id, aid, uid, MADate, LastModified, TransactionAmount, TransNumber, TranSeqNumber NULL 2874 11439 0 17641 2014-07-30 02:45:02.383 2014-07-30 02:45:53.213 NULL 2014-07-30 02:45:53.213 NULL 1 PRIMARY 0 42 0 0 NULL 2014-07-30 02:42:15.017 NULL NULL
I doubt that will format correctly. Might be easier to copy/paste into excel. This loading table can have as little as 0 rows and as much as 100k rows. Depends on the client. CompanyID is just a way to identify that specific row with a client. There are 7 distinct clientIDs.
Just out of curiousity. The execute task that is causing the problem contains an expression for a few different update statements. Here:
"
update gldetailload set transactionamount = convert(money,transactionamount) where uid is null and companynumber = '" + @[User::CoID] + "';
go
update gldetailload set madate = '" + @[User::MADate] + "' where madate is null and companynumber = '" + @[User::CoID] + "';
go
declare @EC varchar(10)
set @EC = '" + @[User::EnterpriseCode] + "'
declare @CO varchar(10)
set @CO = '" + @[User::CoID] + "'
update gldetailload set uid = + @EC + ':' + @CO + ':' + [accountnumber] + ':' + convert(varchar,[transactionamount]) + ':' + [transnumber] + ':' + [transeqnumber] where uid is null and companynumber = '" + @[User::CoID] + "';
go
update gldetailload set lastmodified = getdate() where lastmodified is null and companynumber = '" + @[User::CoID] + "';
go
"
Should I separate these into individual tasks?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Scott,
I hadn't realized there was that much data in that table already. It is accounting history.
The "id" column shouldn't be in that index actually. I don't use it. I created the index in a hurry and threw in a few columns just in case. This table does not contain an identity field. The field "uid" is what I use as a unique identifier. I use the update statements to build "uid".
I truncated the table and dropped the existing index. I used your code and simply replaced "id" with "uid". I did not create a nonclustered index for "uid".
The ssis package is faster today leading me to another question. I went through around 25 of these temp/load tables yesterday creating nonclustered indexes. Then I started testing the SSIS packages. The server became a little sluggish and I was unable to right click/properties the [cloc] db. Did the indexes need time to build after I created them? Is there some way to query their current progress before I test my package out after creating the indexes you suggested?
I hadn't realized there was that much data in that table already. It is accounting history.
The "id" column shouldn't be in that index actually. I don't use it. I created the index in a hurry and threw in a few columns just in case. This table does not contain an identity field. The field "uid" is what I use as a unique identifier. I use the update statements to build "uid".
I truncated the table and dropped the existing index. I used your code and simply replaced "id" with "uid". I did not create a nonclustered index for "uid".
The ssis package is faster today leading me to another question. I went through around 25 of these temp/load tables yesterday creating nonclustered indexes. Then I started testing the SSIS packages. The server became a little sluggish and I was unable to right click/properties the [cloc] db. Did the indexes need time to build after I created them? Is there some way to query their current progress before I test my package out after creating the indexes you suggested?
>> I used your code and simply replaced "id" with "uid". <<
Whoa. You do not want to do that if the value for "uid" is updated / assigned later. Simply remove the column from the index:
CREATE CLUSTERED INDEX gldetailload__CL ON gldetailload ( CompanyNumber ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON ) ON [PRIMARY]
>> Did the indexes need time to build after I created them? <<
Yes.
>> Is there some way to query their current progress before I test my package out after creating the indexes you suggested? <<
Unless you built them using "ONLINE = ON", either explicitly or implicitly, the entire table will be locked until the index build is completed.
The first index created on any table (other than, perhaps, a purely staging table) should be a clustered index. Then, if needed, add nonclustered index(es).
Whoa. You do not want to do that if the value for "uid" is updated / assigned later. Simply remove the column from the index:
CREATE CLUSTERED INDEX gldetailload__CL ON gldetailload ( CompanyNumber ) WITH ( FILLFACTOR = 96, SORT_IN_TEMPDB = ON ) ON [PRIMARY]
>> Did the indexes need time to build after I created them? <<
Yes.
>> Is there some way to query their current progress before I test my package out after creating the indexes you suggested? <<
Unless you built them using "ONLINE = ON", either explicitly or implicitly, the entire table will be locked until the index build is completed.
The first index created on any table (other than, perhaps, a purely staging table) should be a clustered index. Then, if needed, add nonclustered index(es).
ASKER
Scott,
Thank you for your help. I have made the changes and removed uid from the clustered index (drop & recreate). I also created a nonclustered index for uid.
I will run the package for all clients now to test.
Thank you for your help. I have made the changes and removed uid from the clustered index (drop & recreate). I also created a nonclustered index for uid.
I will run the package for all clients now to test.
Great; let me know how it goes.
ASKER
Scott,
The package fails on the DFT now. The error:
"Cannot insert duplicate key row in object 'dbo.gldetailload' with unique index 'gldetailload__CL'.". This occurs before the Execute task. Is this because the "CompanyNumber" field isn't unique?
The package fails on the DFT now. The error:
"Cannot insert duplicate key row in object 'dbo.gldetailload' with unique index 'gldetailload__CL'.". This occurs before the Execute task. Is this because the "CompanyNumber" field isn't unique?
ASKER
Scott,
I removed the "Unique" check for that index. Package is running now.
I removed the "Unique" check for that index. Package is running now.
ASKER
Two of the seven clients have completed. No locks. The other are larger clients and will take another hour or so to complete.
While I wait to make sure they don't lock I will begin making these index modifications to my other load tables and testing those packages. I will report back and award points later. I appreciate the help!
While I wait to make sure they don't lock I will begin making these index modifications to my other load tables and testing those packages. I will report back and award points later. I appreciate the help!
You're welcome!
ASKER
Resolved the deadlocks!
Fantastic!
Just to be clear, because you'll hear otherwise from many SQL "experts":
The best clustered index is the single most important factor in overall table performance.
Their preferred method instead is to automatically cluster on identity and then build (gazillion) nonclustered "covering indexes" for (virtually) every query after the fact; that is just wrong. Instead, get the clustered index right, then build the limited number of nonclustered indexes that are actually required.
Just to be clear, because you'll hear otherwise from many SQL "experts":
The best clustered index is the single most important factor in overall table performance.
Their preferred method instead is to automatically cluster on identity and then build (gazillion) nonclustered "covering indexes" for (virtually) every query after the fact; that is just wrong. Instead, get the clustered index right, then build the limited number of nonclustered indexes that are actually required.
ASKER
I appreciate you taking the time to explain this to those less-informed (me)!
I need to re-evaluate my other existing indexes I suppose. :)
I need to re-evaluate my other existing indexes I suppose. :)
How does the data in the table relate to CompanyID?
How many distinct values of CompanyID are in the table? How many total rows?
If you want to allow me to get more detailed, please replace the values in bold with your info, run the code below, and post the results. All indexes should always be considered, not one index at a time. Finally, reviewing missing index and index usage stats is the minimum to do before making index modifications.
USE [<your_db_name_here>]
SET DEADLOCK_PRIORITY LOW --probably irrelevant, but just in case
DECLARE @list_missing_indexes bit
DECLARE @table_name_pattern sysname
--NOTE: showing missing indexes can take some time; set to 0 if you don't want to wait.
SET @list_missing_indexes = 1
SET @table_name_pattern = '<your_table_name_here>'
--SET @table_name_pattern = '%' --'%'=all tables.
PRINT 'Started @ ' + CONVERT(varchar(30), GETDATE(), 120)
IF @list_missing_indexes = 1
BEGIN
SELECT
GETDATE() AS capture_date,
DB_NAME(mid.database_id) AS Db_Name,
dps.row_count,
OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
mid.equality_columns, mid.inequality_columns, mid.included_columns,
user_seeks, user_scans, ca1.max_days_active, unique_compiles,
last_user_seek, last_user_scan, avg_total_user_cost, avg_user_impact,
system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact,
mid.statement, mid.object_id, mid.index_handle
FROM sys.dm_db_missing_index_de
CROSS APPLY (
SELECT DATEDIFF(DAY, create_date, GETDATE()) AS max_days_active FROM sys.databases WHERE name = 'tempdb'
) AS ca1
LEFT OUTER JOIN sys.dm_db_missing_index_gr
mig.index_handle = mid.index_handle
LEFT OUTER JOIN sys.dm_db_missing_index_gr
migs.group_handle = mig.index_group_handle
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = mid.object_id AND
dps.index_id IN (0, 1)
--order by
--DB_NAME, Table_Name, equality_columns
WHERE
1 = 1
AND mid.database_id = DB_ID() --only current db
AND OBJECT_NAME(mid.object_id)
--AND mid.object_id IN (OBJECT_ID('<table_name_1>
ORDER BY
--avg_total_user_cost * (user_seeks + user_scans) DESC,
Db_Name, Table_Name, equality_columns, inequality_columns
END --IF
PRINT 'Midpoint @ ' + CONVERT(varchar(30), GETDATE(), 120)
-- list index usage stats (seeks, scans, etc.)
SELECT
ius2.row_num, DB_NAME() AS db_name,
CASE WHEN i.name LIKE ca2.table_name + '%'
THEN '~' + SUBSTRING(i.name, LEN(ca2.table_name) + 1 +
CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 1, 1) = '_' THEN
CASE WHEN SUBSTRING(i.name, LEN(ca2.table_name) + 2, 1) = '_' THEN 2 ELSE 1 END
ELSE 0 END, 200)
ELSE i.name END AS index_name,
CASE WHEN i.is_unique = 0 THEN 'N' ELSE 'Y' END + '.' +
CASE WHEN i.is_primary_key = 0 AND i.is_unique_constraint = 0 THEN 'N' ELSE 'Y' END AS [uniq?],
ca2.table_name,
i.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
dps.row_count,
SUBSTRING(key_cols, 3, 8000) AS key_cols, SUBSTRING(nonkey_cols, 3, 8000) AS nonkey_cols,
ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
fk.Reference_Count AS fk_ref_count,
DATEDIFF(DAY, CASE WHEN o.create_date > ca1.sql_startup_date THEN o.create_date
ELSE ca1.sql_startup_date END, GETDATE()) AS max_days_active,
FILEGROUP_NAME(i.data_spac
ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
o.object_id = i.object_id
CROSS JOIN (
SELECT create_date AS sql_startup_date FROM sys.databases WHERE name = 'tempdb'
) AS ca1
CROSS APPLY (
SELECT OBJECT_NAME(i.object_id/*,
) AS ca2
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal > 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
ic.key_ordinal
FOR XML PATH('')
) AS key_cols (key_cols)
OUTER APPLY (
SELECT
', ' + COL_NAME(object_id, ic.column_id)
FROM sys.index_columns ic
WHERE
ic.key_ordinal = 0 AND
ic.object_id = i.object_id AND
ic.index_id = i.index_id
ORDER BY
COL_NAME(object_id, ic.column_id)
FOR XML PATH('')
) AS nonkey_cols (nonkey_cols)
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
dps.object_id = i.object_id AND
dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stat
ius.database_id = DB_ID() AND
ius.object_id = i.object_id AND
ius.index_id = i.index_id
LEFT OUTER JOIN (
SELECT
database_id, object_id, MAX(user_scans) AS user_scans,
ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+us
FROM sys.dm_db_index_usage_stat
WHERE
database_id = DB_ID()
--AND index_id > 0
GROUP BY
database_id, object_id
) AS ius2 ON
ius2.database_id = DB_ID() AND
ius2.object_id = i.object_id
LEFT OUTER JOIN (
SELECT
referenced_object_id, COUNT(*) AS Reference_Count
FROM sys.foreign_keys
WHERE
is_disabled = 0
GROUP BY
referenced_object_id
) AS fk ON
fk.referenced_object_id = i.object_id
WHERE
i.object_id > 100 AND
i.is_hypothetical = 0 AND
i.type IN (0, 1, 2) AND
o.type NOT IN ( 'IF', 'IT', 'TF', 'TT' ) AND
(
o.name LIKE @table_name_pattern AND
o.name NOT LIKE 'dtprop%' AND
o.name NOT LIKE 'filestream[_]' AND
o.name NOT LIKE 'MSpeer%' AND
o.name NOT LIKE 'MSpub%' AND
--o.name NOT LIKE 'queue[_]%' AND
o.name NOT LIKE 'sys%'
)
--AND OBJECT_NAME(i.object_id /*, DB_ID()*/) IN ('tbl1', 'tbl2', 'tbl3')
ORDER BY
--row_count DESC,
--ius.user_scans DESC,
--ius2.row_num, --user_scans&|user_seeks
db_name, table_name,
-- list clustered index first, if any, then other index(es)
CASE WHEN i.index_id IN (0, 1) THEN 1 ELSE 2 END,
key_cols
PRINT 'Ended @ ' + CONVERT(varchar(30), GETDATE(), 120)