Maven 0001


Occasionally a simple INSERT times out (even though the record gets inserted) and an UPDATE takes 10+ seconds on a table; the rest of the time both operations take less than 1 second.

As mentioned above, I have a situation where simple inserts time out. The table has 500,000 records and is inserted/updated via a stored procedure.

The insert looks like:

insert into table1
(column1,column2,entrydate,updatedate,status)
values ('','',getdate(),getdate(),'active')

The above statement times out.

update table1
set column1=''
where column2='XYZ'

The above statement times out or takes between 5 and 25 seconds.

The strange thing is that it works well 98% of the time, taking less than a second.
Pawan Kumar

Run this command. It seems your statistics are out of date.

 ALTER INDEX ALL ON tableName REBUILD; 


Nakul Vachhrajani
Basic question: Do you have a clustered index on your table?

If the table is a heap (i.e. no clustered index exists on the table), other connections trying to insert/update would lock the entire table and your connection would need to wait to get an exclusive lock to insert/update.

If missing, my recommendation would be to identify a suitable key and create a clustered index on the table.
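As a sketch only (the table name `table1` and an identity column named `id` are assumed from the discussion, not confirmed): if the identity column is the stable, ever-increasing key, it is usually a good clustered index candidate.

```sql
-- Hypothetical names: "table1" and its identity column "id".
-- Clustering on the identity key gives inserts a natural append point
-- and lets updates/selects by id seek instead of scanning the heap.
CREATE UNIQUE CLUSTERED INDEX CIX_table1_id
    ON dbo.table1 (id);
```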
How many inserts/updates are occurring in that table at the same time?
Is there any index on column2 to improve the speed of the update?
Make sure you have an index on the table1 table's column2, and that there are no blocks when you insert/update.
Also, review the log file. SQL Server must log all modifications (DELETE & INSERT & UPDATE). If the log file needs more space, it must grow dynamically, which causes all activity on the db to be paused while the new space is fully formatted.

1) Do you shrink the log file regularly? :: That's a terrible thing!
2) What is the log file size?  
3) What is the log file growth amount? Is it a fixed value or a percent (not good)?
4) How often does the log file expand, and is expansion slow?  You can use "Standard Reports" in SSMS to see autogrowth activity, or I believe sp_blitz will show you that as well, and if it was slow or not.
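For point 4, here is a sketch of reading autogrowth events straight from the default trace (enabled by default on SQL Server; the unit conversions follow the documented trace schema):

```sql
-- Read auto-grow events from the default trace.
-- EventClass 92 = Data File Auto Grow, 93 = Log File Auto Grow.
DECLARE @trace nvarchar(260);

SELECT @trace = path
FROM sys.traces
WHERE is_default = 1;

SELECT StartTime,
       DB_NAME(DatabaseID)    AS DatabaseName,
       FileName,
       Duration / 1000        AS DurationMs,  -- Duration is reported in microseconds
       IntegerData * 8 / 1024 AS GrowthMB     -- IntegerData is the growth in 8 KB pages
FROM sys.fn_trace_gettable(@trace, DEFAULT)
WHERE EventClass IN (92, 93)
ORDER BY StartTime DESC;
```

Frequent growths with long durations around your timeout timestamps would point at autogrowth as the culprit.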
Maven 0001

ASKER

Thanks guys for all your replies; I will answer one by one. My idea is to first understand what is causing the problem and then go for the fix. Thanks again.
@pawan:

Do we still need to rebuild if the insert/update takes less than a second after the timeout? I want to fully understand the situation/symptoms before taking corrective measures.
Below are the timestamps at which it timed out; the rest of the time it took less than 1 second.
2016-10-08 11:21:12.207
2016-10-07 17:38:36.353
2016-10-07 17:32:29.133
2016-10-07 17:32:27.703
2016-10-07 17:32:25.670
2016-10-07 17:32:12.883
2016-10-07 17:21:21.077
2016-10-07 17:05:06.987
2016-10-07 17:05:06.653
2016-10-07 17:05:01.737
2016-10-07 11:22:18.933
2016-10-06 19:52:42.730
All are insert timeouts with no record inserted in the table; there were no update timeouts yesterday.
@Nakul:
Basic question: Do you have a clustered index on your table?
No, there is no clustered index, but there are 6 non-unique, non-clustered indexes on the table.
We have an identity column in this table; a unique key is made from it and used in every select statement.
For example.
We add a new record with the below statement.
insert into table1
(entrydate,updatedate,status) values
(getdate(),getdate(),'active')

Read the identity value:

set @id=@@identity
update table1
set refid='ABC-' + convert(varchar(20),@id)
where id=@id

Now this refid is the key on which various indexes are made, as it is used to fetch/update most of the information.
This table has 90+ columns, updated at various levels in various processes. None takes time; only updates of columns which are part of an index sometimes take time. All the columns are initially null, updated only once, and never replaced.
@victor
How many inserts/updates are occurring in that table at same time?
Max 20-25 inserts/updates per minute, 3-4 concurrent at most.
Is there any index on column2 to improve the speed of the update?
Yes, there is.
@eugene
Make sure you have an index on the table1 table's column2, and that there are no blocks when you insert/update.

We do have an index on column2. Now, please suggest how we can make sure there is no blocking on insert/update.
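A minimal sketch for catching blocking while it happens — run this during one of the timeout windows:

```sql
-- Show requests that are currently blocked, who is blocking them,
-- and the statement each blocked session is waiting on.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text      AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0;
```

If rows show up here at the same moments your inserts time out, the `blocking_session_id` column tells you which session to investigate.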
@scott:

Also, review the log file. SQL Server must log all modifications (DELETE & INSERT & UPDATE). If the log file needs more space, it must grow dynamically, which causes all activity on the db to be paused while the new space is fully formatted.

1) Do you shrink the log file regularly? :: That's a terrible thing!
No, we do not shrink the file.
2) What is the log file size?  
Currently 14 MB
3) What is the log file growth amount?  Is it a fixed value or a percent(not good)?
It's the default setting: 10 percent, unlimited.
4) How often does the log file expand, and is expansion slow?  You can use "Standard Reports" in SSMS to see autogrowth activity, or I believe sp_blitz will show you that as well, and if it was slow or not.
I cannot see such a report; please guide us there.
In this case you should have a clustered index on the ID column. Your table is a heap, which means that updates would be locking a major part (if not the entire) of the table.

Create a clustered index and let us know the changes to performance characteristics.
One more thing: sometimes select statements also give a deadlock 57 error for brief moments.
@nakul:

Is there a way we can recreate the situation on a copy of the database and put it through some tests? As of now it is random, and the structure and indexes are already 4 years old.
Sure. It would be crazy to directly test on production. You should always test changes on a staging environment first.

All your symptoms (the deadlocks, occasional delay in performance, etc) point to the table being locked and insert/updates/selects overlapping with each other.

The clustered index, by definition will prevent unnecessary locking on the entire table.
@Author - If DML operations are working fine, then it means you have updated stats and fragmentation is also under control.

Statistics updates and index maintenance are part of DBA activities, performed weekly/biweekly/daily depending on the requirements.
@pawan

This is what I meant also :)

It happens occasionally and reverts back to normal, so currently I am ruling out this option.

Thanks for your suggestion. I am exploring other options.
So you guys don't have a maintenance plan in place for your DB?

If not, you can embed the Rebuild/Update Stats commands in your stored proc itself if that only takes a few seconds.

Also, when these queries take time, check whether some other process or some long-running queries were there.
Agree with Pawan. Do check your maintenance routines.

But that alone will not resolve your long term issue. The simple reason your updates are taking time is because the queries must be needing scan operations (you can verify this by using the execution plans). In the absence of a clustered index, there is no logical order in the table - which means that if you update row #1 and row #10000, in both cases SQL Server will have to begin from the first row to figure out where the rows are.

Having a clustered index is the only real fix to speeding your system up and eliminating these errors.

I will see if I can get an example of this situation ready to demonstrate.
@pawan

No, we do not have a maintenance plan; we only do an auto backup every 8 hours via a SQL script. We shrink the data file and log file bi-monthly. We do not have a full-time DBA, nor developers with complete DBA skills.

I suspect the long-running queries too, but the duration is too short to actually see what is causing the problem. It happens at random and there is no pattern to recreate, but it happens daily.

I agree with @nakul that a clustered index would speed up the process, but before implementing it I need to be sure of the pros and cons.
Even if you go with a clustered index, you still need to maintain the clustered index and keep stats updated.

The best option you have is to include the Update Stats and Rebuild Index commands in the stored procedure that inserts or updates.

This will give you the same time each time.
I think you have a couple of other problems as well. Why do you need to shrink your DB and log files? Doing so increases fragmentation.

A fragmented heap is a performance killer - do you by any chance see these "sporadic" locking issues around the time you shrink the DBs?
What is your SQL Server version/edition/service pack (should be the latest SP)?
How big is your table (rows/size)?
How big will it be in 1 month / 1 year?
Do you purge/archive data from this table?
etc.

<Please suggest how we can make sure there is no block in insert/update.>

INF: Understanding and resolving SQL Server blocking problems
https://support.microsoft.com/en-us/kb/224453

---




You can try to use Snapshot Isolation in SQL Server; that may help you address blocking/deadlocks in your case without changing your process or code.
(Note: you need to test and make sure this option is good for your process.)
ALTER DATABASE YourDB SET READ_COMMITTED_SNAPSHOT ON
https://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.110).aspx
The current log is only 14MB and has not been shrunk.  Hmm, a bit unusual, but OK.

Change the log file growth setting to 1MB or 2MB, a fixed amount, rather than a percentage. A percentage is often much slower in SQL Server (although I'm not sure why).

Also, let's try pre-allocating extra space to the log file and see if the delays go away for a while.

Step1)
USE <your_db_name>
EXEC sys.sp_helpfile
--Copy the name in the first column for the log file (file#2) into the command below:

Step2)
ALTER DATABASE <your_db_name>
MODIFY FILE ( NAME = <logical_log_file_name_from_Step1>, SIZE = 24MB );
@Author - What kind of access do you have on the DB system? Are you a DB owner?
<we do not have a maintenance plan only>
1st - if your DB is in Full recovery (check DB properties),
set trans log backups (every 30-60 mins)
Transaction Log Backups
https://msdn.microsoft.com/en-us/library/ms191429.aspx

2nd - set up DB maintenance (check the best day/time)
for Reindex, Check DB, Update stats, etc.
more:

Use the Maintenance Plan Wizard
https://msdn.microsoft.com/en-us/library/ms191002.aspx

3rd - see my last post
No, there is no clustered index, but there are 6 non-unique, non-clustered indexes on the table.

This means that each insert will also insert 6 index rows.

We have an identity column in this table; a unique key is made from it and used in every select statement.
That's not the best option. My recommendation is a computed-column solution, where the unique key will be automatically generated, with no need at all for the update command:
ALTER TABLE table1 DROP COLUMN refid;
GO

ALTER TABLE table1 ADD refid AS ('ABC-' + CONVERT(varchar(20), id));


Btw, are there also triggers on this table?


It's the default setting: 10 percent, unlimited.
Never set autogrowth as a percentage. Tomorrow you'll have a 100GB transaction log file and it will grow by 10GB at a time. Best practice is to set a static value (100MB for example), so you know exactly how much it will grow every time.
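A sketch of setting a fixed growth increment (the database and logical file names below are placeholders — get the real logical name from `sys.database_files` first):

```sql
-- Switch the log file from 10% growth to a fixed 100 MB increment.
-- "YourDB" and "YourDB_log" are placeholder names; query sys.database_files
-- in your database for the actual logical name of the log file.
ALTER DATABASE YourDB
MODIFY FILE (NAME = YourDB_log, FILEGROWTH = 100MB);
```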


No, we do not have a maintenance plan; we only do an auto backup every 8 hours via a SQL script. We shrink the data file and log file bi-monthly. We do not have a full-time DBA, nor developers with complete DBA skills.
I hope so, because a DBA that doesn't configure maintenance tasks and shrinks data files is not a good DBA.
The first thing to do ASAP is to create maintenance tasks for reindexing and stop shrinking files. You can use Ola Hallengren's scripts. No need to be a DBA to configure them. Just download and execute the script in your SQL Server instance and then schedule the created jobs.
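For reference, once Ola Hallengren's maintenance solution is installed, a typical index maintenance call looks roughly like this (the parameter values shown are the solution's documented defaults, not something tuned for this particular database):

```sql
-- Reorganize lightly fragmented indexes, rebuild heavily fragmented ones,
-- and update statistics, across all user databases.
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 5,
    @FragmentationLevel2 = 30,
    @UpdateStatistics = 'ALL';
```

The installer also creates Agent jobs wrapping calls like this, so scheduling is mostly a matter of enabling them.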
Thanks guys for all your suggestions. I have got various inputs and all seem logical. After reading all the comments I guess we need to review our backup and maintenance plan.

I will update with more details shortly.
Not just your maintenance plans - revisit your table designs too. Heaps cause unnecessary blocking on your transactions and higher fragmentation of your tables due to row forwarding.
Yes, starting from the table design, indexing, maintenance plan, SQL Server basic settings, etc.
@Nakul

Sure, we are going to do that also. I am collecting all the suggestions and will work out a plan and post it here before implementing it.

@everyone
I am going to share some points on why we shrink log files and data files. Actually, we have put various counters/stats in the same db to track user activity. When we started getting timeouts we started purging the data, removing unnecessary data, which increases the log file size and the free space in the data files. Now we have understood that this is not good practice. We will review it.

Thanks for all the advice.
@pawan

Sure, it seems like a lot of work, but thankfully we now have good people to get advice from :)

@eugene, victor, scott

Thanks guys for your replies; I will revert with more details.
Currently i am going to share some points for why we shrink log files and data files.
There is NO justification for shrinking on a periodic basis. On very rare occasions a shrink is an acceptable operation (during emergencies), but then a full database reindex needs to be applied immediately after the shrink operation.
Btw, you don't have auto-shrink ON, right?
@victor
Thanks for sharing the information; we understand now that it is not needed.
Good.
Has the timeout issue returned?
Yes, it is still timing out. It seems some queries are running slow, which locks the table for some time. Last 2 days' logs:
2016-10-18 16:36:18.667
2016-10-18 16:32:45.700
2016-10-18 16:30:36.780
2016-10-18 16:30:30.720
2016-10-18 16:28:08.820
2016-10-18 16:27:45.047
2016-10-18 16:27:45.000
2016-10-18 16:26:49.187
2016-10-18 16:26:32.823
2016-10-18 16:23:46.607
2016-10-18 16:21:16.850
2016-10-18 16:17:32.140
2016-10-18 16:17:20.820
2016-10-18 16:17:12.003
2016-10-18 16:17:02.720
2016-10-18 16:15:39.557
2016-10-18 16:15:13.833
2016-10-18 16:14:34.417
2016-10-18 16:14:23.967
2016-10-18 16:14:16.753
2016-10-18 16:03:09.620
2016-10-18 15:18:29.153
2016-10-18 14:28:38.340
2016-10-18 14:28:36.253
2016-10-18 12:53:33.177
2016-10-17 18:10:11.320
2016-10-17 18:08:40.713
2016-10-17 18:07:35.583
2016-10-17 18:05:15.530
2016-10-17 17:56:19.597
2016-10-17 17:54:12.313
2016-10-17 17:51:12.927
2016-10-17 17:50:20.313
2016-10-17 17:48:14.893
2016-10-17 17:48:12.613
2016-10-17 17:48:01.577
2016-10-17 17:48:00.210
2016-10-17 17:47:57.643
2016-10-17 17:47:44.417
2016-10-17 17:47:42.173
2016-10-17 17:47:38.813
2016-10-17 17:47:31.527
2016-10-17 17:47:23.203
2016-10-17 17:47:09.177
2016-10-17 17:47:07.550
2016-10-17 16:40:18.363
2016-10-17 16:39:48.177
2016-10-17 16:29:18.570
2016-10-17 16:26:58.083
2016-10-17 16:25:34.640
2016-10-17 16:25:25.157
2016-10-17 16:25:14.143
2016-10-17 16:03:48.987
2016-10-17 15:53:49.737
2016-10-17 15:53:44.573
2016-10-17 15:53:44.233
2016-10-17 15:53:39.633
2016-10-17 15:28:20.650
2016-10-17 14:54:53.070
2016-10-17 14:28:39.603
2016-10-17 12:04:53.787
2016-10-17 11:28:26.393
2016-10-17 10:36:19.137
2016-10-17 10:36:17.860
What are those dates and times? The lock occurrences?
Yes, those are the times when the insert/update timed out.
The latest finding is that the update is very much under control; the insert procedure takes time due to a query which precedes it to check for duplicates. I guess this is what times out the procedure.
Possible, if the duplicate check locks up the entire table.

Again, can you please work to evaluate the addition of appropriate clustered indexes? Also, see if your non-clustered indexes cover the columns on which you perform duplicate checks (e.g. a person's full name, account name, etc). By doing so, you shift the locks from the table to your indexes, making it a much faster operation.
Do you have UNIQUE constraints on the duplicate check columns?
insert procedure takes time due to a query which preceedes to check duplicate
If your table has a Primary Key (PK), this check is not necessary. A PK guarantees the uniqueness of a record, and the only thing you need to do is handle the error:
BEGIN TRY 
    insert into table1 (column1,column2,entrydate,updatedate,status)
    values ('','',getdate(),getdate(),'active')
END TRY 
BEGIN CATCH 
    IF ERROR_NUMBER() = 2627
    BEGIN
        -- Do here what you need to do for duplicate records
    END
END CATCH


This way you'll save a SELECT and the eventual locks associated with it.
@victor:

We are not only checking duplicates but assigning a group number to the record as well.

I think we have narrowed it down to the culprit queries, as the symptoms always show that either the insert does not happen due to the first query timing out, or the subsequent update does not happen due to the second query timing out.

-- Check for column1 and column2 and see if tried within 30 minutes.
-- First select query
select @tempid=id from tab_carts where column1='' and column2='' and datediff(minute,entrydate,getdate())<=30

-- Check whether it exists or not.
if @tempid is null
begin
  -- Since it is not present, generate one.
  insert into table1
  (column1,column2,entrydate,updatedate,status)
  values ('','',getdate(),getdate(),'active')

  set @tempid=@@identity

  -- Second select query
  select @groupcolumn=groupcolumnrefno from tab_carts where column1=@column1 and status='active'

  if @groupcolumn is null
  begin
    set @groupcolumn='BID-' + some uniqueval + convert(varchar(100),@tempid)
  end
end
else
begin
  set @tempid=0
end

update table1
set column11='BID-' + some uniqueval + convert(varchar(100),@tempid),
    groupcolumnrefno=@groupcolumn
where id=@tempid



It times out mostly on the first select query and sometimes on the second query.
Are column1 and column2 indexed? If not, you should create an index on those columns to speed up the SELECT command:
CREATE INDEX idx_tab_carts ON tab_carts (column1, column2)


After creating the index, run the query again to see if the timeouts are gone.
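One additional observation (an assumption based on the procedure posted above, worth verifying against your actual execution plan): the `datediff(minute, entrydate, getdate()) <= 30` predicate wraps the column in a function, so even an index containing `entrydate` cannot be seeked on it. A near-equivalent, sargable form moves the arithmetic to the other side (near-equivalent because `datediff` counts minute boundaries, so edge behavior differs slightly):

```sql
-- Sargable rewrite: the comparison is now directly against the entrydate
-- column, so an index such as (column1, column2, entrydate) can seek
-- instead of scanning.
select @tempid = id
from tab_carts
where column1 = ''
  and column2 = ''
  and entrydate >= dateadd(minute, -30, getdate());
```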
Can you share your execution plan?

The below should be time consuming, as it will lead to a scan. Could you please run the first query and give me the timings?

select @tempid=id from tab_carts where column1='' and column2='' and datediff(minute,entrydate,getdate())<=30  

Also try below..

SET @tempid = convert(varchar(100),@tempid)

--

update table1
 set column11='BID-' + some uniqueval + @tempid,
     groupcolumnrefno=@groupcolumn
 where id=@tempid

--


@pawan/victor

The execution plan completes the query in less than a second but reports a missing index.

The Query Processor estimates that implementing the following index could improve the query cost by 99.9911%.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[table1] ([column1])

and this column1 is common to both the first query and the second query.

But my question is: then why does it work almost all the time and fail only for brief moments every day?

I already have six indexes on the table; is it right to create one more?
In fact there are 8 indexes, made when queries ran slow, on the basis of the execution plan.

I see 1 or 2 are almost similar, with extra include columns. Can they all be combined?
SOLUTION by Nakul Vachhrajani (answer visible to members only)
Do you have a maintenance plan in place for stats and indexes? I think fragmentation will be the issue.

Also create the missing index and see how much your query improves.

CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[table1] ([column1])
@pawan:

No, we do not have maintenance plans for stats and indexes, but for the past few days the select queries on the table have been taking more time for the same queries. Can we do something on this table to make sure everything is in place?

Currently, for the data files: allocated space is 4805 MB and free space is 1035 MB.
Transaction log file: allocated space is 417 MB and free space is 400 MB.

In this condition we generally shrink the files and shrink the db. When we shrink the data files nothing happens, and we have to shrink the db.

While shrinking the db everything usually goes fine, but it gets stuck at least once with some error (a deadlock or something), and on retrying the shrink goes fine.

After that the timeouts generally decrease, then resurface after 15-20 days.

Please guide.
Shrinking increases fragmentation in a database. Why do you need to shrink the DB anyway? If log and differential/full backups are running as expected, shrinking should not be required.

An approximately 4-4.5GB database is not big by any means and with the proper database design, this should be lightning fast.

I request once again - can you please review adding clustered indexes on your tables?
SOLUTION (answer visible to members only)
@pawan

Thanks for the detailed information; I will give the queries a try. What is the best time to run the maintenance queries?

We are not shrinking our db anymore :)
And if you're trying to run the query while the db is being shrunk, that can explain the locks you're facing.
>> What is the best time to run the maintenance queries.

Weekends, or whenever the minimum number of users gets impacted.

Hope it helps.
SOLUTION (answer visible to members only)
Hi,
Any update on this?

Regards,
Pawan
Thanks victor/pawan for your suggestions. We will give Ola Hallengren's solution a try.
Maven, did you use Ola's solution?
If so, did it help?
Hi Author,
Any luck with this?

Regards,
Pawan
Dear pawan/vitor

We found out that most of the time the timeout happens due to slow-running queries. Some queries in the project were taking more than 5 seconds and eventually timing out sometimes, which caused a lock on the table, and then all further operations timed out. We are trying to remove the slow-running queries and the situation is improving. As this is production, we are taking extreme care when replacing queries.

Thanks for all your support.
ASKER CERTIFIED SOLUTION (answer visible to members only)
Thanks everyone for the help! It was indeed a pleasure learning a lot from you :)