rizzalynn
asked on
Locking in SQL server
I have 2 batch programs, 1 is a program that sends email and another one sends fax. They both access a table QUEUE.
In the email sender program, this is what happens in relation to QUEUE. For each record in QUEUE that satisfies criteria:
1) Locks record 1 in QUEUE table
select 1 from QUEUE with (UPDLOCK) where id = 1
2) Process sending out of email
3) Delete record 1 in QUEUE table
delete from QUEUE where id = 1
4) commit transaction (transaction is not auto-commit)
In the fax sender program, similar steps also happen, except that in step 2, we send out fax (of course).
The problem is that sometimes the delete from QUEUE throws out an exception that it is locked. Thereby, re-sending of the emails/fax happens. I am sure that the group of records processed by these programs don't intersect.
Based on my testing, it seems that e.g. when email sender program performs step 1 and then fax sender program performs step 3, fax sender program raises a lock request time out (I perform this testing via simulation in dummy tables of the locking and db operations that happened in the programs).
I would like to know if there is something wrong with the way I lock the records that makes another transaction unable to delete records in the same table. Because the locking statement in the other transaction doesn't throw an error, yet the delete statement in that transaction throws an error.
By the way, here are some information about the database (I'm not sure if they will help):
Read Committed Snapshot is turned on
Snapshot Isolation State is On
In the email sender program, this is what happens in relation to QUEUE. For each record in QUEUE that satisfies criteria:
1) Locks record 1 in QUEUE table
select 1 from QUEUE with (UPDLOCK) where id = 1
2) Process sending out of email
3) Delete record 1 in QUEUE table
delete from QUEUE where id = 1
4) commit transaction (transaction is not auto-commit)
In the fax sender program, similar steps also happen, except that in step 2, we send out fax (of course).
The problem is that sometimes the delete from QUEUE throws out an exception that it is locked. Thereby, re-sending of the emails/fax happens. I am sure that the group of records processed by these programs don't intersect.
Based on my testing, it seems that e.g. when email sender program performs step 1 and then fax sender program performs step 3, fax sender program raises a lock request time out (I perform this testing via simulation in dummy tables of the locking and db operations that happened in the programs).
I would like to know if there is something wrong with the way I lock the records that makes another transaction unable to delete records in the same table. Because the locking statement in the other transaction doesn't throw an error, yet the delete statement in that transaction throws an error.
By the way, here are some information about the database (I'm not sure if they will help):
Read Committed Snapshot is turned on
Snapshot Isolation State is On
ASKER
I did some reading on SQL server and chanced upon sys.dm_tran_locks view. As supplementary info, I got the dm_tran_locks records for the database I have performed upon my testing on. I have 2 transactions, transaction1 which is a code snippet from my java program and transaction2 which are commands I executed in SQL server.
Upon start (nothing executed), there were only 2 entries with resource_type = DATABASE (I got 2 open SQL server screens) *start2 tab in excel file*.
I started my java program (transaction1) in debug mode and debugged in the indicated position *debug_pt in excel file*. There were now 3 entries with resource_type = DATABASE (additional 1 for the connection from the java program), and 1 entry each for resource_type equals to OBJECT, PAGE and KEY.
I run the first two lines in transaction2 *select_pt2 in excel file*. There were now 3 entries with resource_type = DATABASE and 2 entry each for resource_type equals to OBJECT, PAGE and KEY.
Then when I try to execute the delete statement in transaction2, the cursor goes into wait mode *delete_pt2 in excel file*. An additional 3rd entry with resource_type = KEY is added with request_status = WAIT.
Can someone kindly explain what happened? Is there any change that I must implement in my locking to have either both the select and delete in transaction 2 execute or have them both throw an error (assuming the select throws an error, then the delete should not happen)?
Upon start (nothing executed), there were only 2 entries with resource_type = DATABASE (I got 2 open SQL server screens) *start2 tab in excel file*.
I started my java program (transaction1) in debug mode and debugged in the indicated position *debug_pt in excel file*. There were now 3 entries with resource_type = DATABASE (additional 1 for the connection from the java program), and 1 entry each for resource_type equals to OBJECT, PAGE and KEY.
I run the first two lines in transaction2 *select_pt2 in excel file*. There were now 3 entries with resource_type = DATABASE and 2 entry each for resource_type equals to OBJECT, PAGE and KEY.
Then when I try to execute the delete statement in transaction2, the cursor goes into wait mode *delete_pt2 in excel file*. An additional 3rd entry with resource_type = KEY is added with request_status = WAIT.
Can someone kindly explain what happened? Is there any change that I must implement in my locking to have either both the select and delete in transaction 2 execute or have them both throw an error (assuming the select throws an error, then the delete should not happen)?
//transaction1 - java program
public static void main(String args[]) throws Exception{
Connection conn = Utils.getConnection();
try {
conn.setAutoCommit(false);
Utils.query(conn, "select 1 from QUEUE WITH (UPDLOCK) where id2 = 2", null);
conn.commit(); //debug in here *debug_pt*
} finally {
conn.close();
}
}
//transaction2 - MS SQL server
begin tran
select 1 from QUEUE (UPDLOCK) where id1 = 1 --execute until this for select_pt2
delete QUEUE where id2 = 1 --delete_pt2
rollback tran
dm-tran-locks.xls
Hi,
It looks like the SQL is waiting in the delete until the select from Java completes.
When you get to the point where the locks have a wait, then also look at sp_who and dollars to donuts you'll see in the blk column the ID of the java process. If you then go back to java and let it run, then the lock should come from wait to grant, the delete executes and the lock is released.
This is normal behaviour. The fact that your java program is held at a particular break-point is what is confusing things imho.
Anyway, see if what I've suggesed holds water.
Question: Why put the updlock on the java select? I've never needed to worry about adding my own locks in the past ...
HTH
David
It looks like the SQL is waiting in the delete until the select from Java completes.
When you get to the point where the locks have a wait, then also look at sp_who and dollars to donuts you'll see in the blk column the ID of the java process. If you then go back to java and let it run, then the lock should come from wait to grant, the delete executes and the lock is released.
This is normal behaviour. The fact that your java program is held at a particular break-point is what is confusing things imho.
Anyway, see if what I've suggesed holds water.
Question: Why put the updlock on the java select? I've never needed to worry about adding my own locks in the past ...
HTH
David
ASKER
It was a testing to see if indeed an UPDLOCK in one record (Java program) can block a delete operation in another record in the same table in another transaction (SQL server window) and based from the results, it seems like it.
What puzzles me is that the UPDLOCK (locking part) in the other transaction (SQL server window) can get through, while the delete is hindered in that same transaction. I was hoping that if the delete can't be done, then the locking part should have already thrown an error.
In the actual program (fax and email sender), locking the QUEUE record to be deleted is essential as its not acceptable to have the delete of the QUEUE record throw a lock exception after sending out an email/fax, as this will result to sending out of the same email/fax in the next run of the batch program.
I think my lock might be wrong. I just need a lock on QUEUE that will block all other transaction that would try to acquire a similar lock on a record I am already locking, but will allow DML operations on other QUEUE records.
I've tried ROWLOCK, but this doesn't work, it will just allow the delete of the RECORD I've locked in another transaction.
Thanks for the help and I hope this post clears out some things. =)
What puzzles me is that the UPDLOCK (locking part) in the other transaction (SQL server window) can get through, while the delete is hindered in that same transaction. I was hoping that if the delete can't be done, then the locking part should have already thrown an error.
In the actual program (fax and email sender), locking the QUEUE record to be deleted is essential as its not acceptable to have the delete of the QUEUE record throw a lock exception after sending out an email/fax, as this will result to sending out of the same email/fax in the next run of the batch program.
I think my lock might be wrong. I just need a lock on QUEUE that will block all other transaction that would try to acquire a similar lock on a record I am already locking, but will allow DML operations on other QUEUE records.
I've tried ROWLOCK, but this doesn't work, it will just allow the delete of the RECORD I've locked in another transaction.
Thanks for the help and I hope this post clears out some things. =)
Hi,
You could add a timestamp column.
select @ID = ID, @Timestamp = timestamp
from dbo.queue
where ...
update queue
set completed = 'Y'
where ID = @ID and timestamp = @timestammp
-- if another process has updated the row, then the next update will fail as the timestamp has changed. Catch this error, and retry from the top.
HTH
David
You could add a timestamp column.
select @ID = ID, @Timestamp = timestamp
from dbo.queue
where ...
update queue
set completed = 'Y'
where ID = @ID and timestamp = @timestammp
-- if another process has updated the row, then the next update will fail as the timestamp has changed. Catch this error, and retry from the top.
HTH
David
ASKER
Problem is in between the locking and the delete of the QUEUE record, the sending of the fax/email happens. If it was just purely db transactions, then it's easy to rollback them, but it just so happens that we can't rollback sending of email/fax =), so fax/email is sent more than necessary.
That's why I had to rely on good ol' locking, to make sure no one else has a lock on the record before sending out email/fax.
That's why I had to rely on good ol' locking, to make sure no one else has a lock on the record before sending out email/fax.
Hi,
I added another variable called a status. At the top of the procedure update the status to being instatus, with the timestamp that I showed you, and that will block any other processes grabbing that item.
When process is finished, then update status to complete or delete the item. I prefer to update it as completed.
Advantages - reduces the amount of time that locks are held to a minium. Lock is only held for the lenght of the individual updates. Generally the first may have problems, but second should _always_ succeed. Holding a lock for the length of time that you suggest is too long. You will get what you report above, and potentially block everything to a stand-still, or until time-outs occur.
HTH
David
I added another variable called a status. At the top of the procedure update the status to being instatus, with the timestamp that I showed you, and that will block any other processes grabbing that item.
When process is finished, then update status to complete or delete the item. I prefer to update it as completed.
Advantages - reduces the amount of time that locks are held to a minium. Lock is only held for the lenght of the individual updates. Generally the first may have problems, but second should _always_ succeed. Holding a lock for the length of time that you suggest is too long. You will get what you report above, and potentially block everything to a stand-still, or until time-outs occur.
HTH
David
ASKER
Take note that our connection is not in autocommit and we commit only for every record in QUEUE processed.
The background on why we put locks was because before we encountered locking exceptions on delete of QUEUE records. We thought it was other batch programs or perhaps the online screens that caused this. So we locked the record first before processing to prevent access of other connections to that record while the batch program is accessing it so that we can delete the QUEUE record without any problems (thinking that locking in SQL server is as simple as in Oracle SQL). The problem now is the repeating emails/fax still happens so we had to investigate on why. On the investigation, it seems that delete tries to get an Update (U) lock on other records in the table although only one record is to be deleted.
The implementation above implies that I commit the update on the status, timestamp of the QUEUE so other records can see this change in the status, timestamp. The problem is if other connections have held locks in the QUEUE records during my delete/update (this was the original problem, that's why I have resolved to putting locks), then I have to retry deleting/updating the record because an email/fax has already been sent for it. Rollback of the status to "queued" is out of the question as it will bring then result again to sending of email/fax more than once.
With that, I need to know if there's an option to make "delete" operation not acquire locks on records other than for the record it will delete. Because it seems that the problem is with the "delete" operation taking locks on other records in the table.
The background on why we put locks was because before we encountered locking exceptions on delete of QUEUE records. We thought it was other batch programs or perhaps the online screens that caused this. So we locked the record first before processing to prevent access of other connections to that record while the batch program is accessing it so that we can delete the QUEUE record without any problems (thinking that locking in SQL server is as simple as in Oracle SQL). The problem now is the repeating emails/fax still happens so we had to investigate on why. On the investigation, it seems that delete tries to get an Update (U) lock on other records in the table although only one record is to be deleted.
The implementation above implies that I commit the update on the status, timestamp of the QUEUE so other records can see this change in the status, timestamp. The problem is if other connections have held locks in the QUEUE records during my delete/update (this was the original problem, that's why I have resolved to putting locks), then I have to retry deleting/updating the record because an email/fax has already been sent for it. Rollback of the status to "queued" is out of the question as it will bring then result again to sending of email/fax more than once.
With that, I need to know if there's an option to make "delete" operation not acquire locks on records other than for the record it will delete. Because it seems that the problem is with the "delete" operation taking locks on other records in the table.
ASKER CERTIFIED SOLUTION
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
ASKER
Thanks for the reply David. I guess I have to attack the problem from a different angle. I'll try to suggest to my superiors the strategy you suggested. =)
ASKER
I appreciate his patience to explain to me the difference of concurrency handling between Oracle SQL and MS SQL Server.
His idea is a workable idea for me, but I think I'm going to modify it a bit to make it more acceptable to my superiors (they don't like modification in the table structures).
His idea is a workable idea for me, but I think I'm going to modify it a bit to make it more acceptable to my superiors (they don't like modification in the table structures).
Check what locks you are actually applying, and what SQL is escalating those locks to.
Cheers
David