Solved

Is "REPEATABLE READ" / "SERIALIZABLE" required for INSERT ... SELECT ... WHERE NOT EXISTS (...) ?

Posted on 2008-06-24
14
803 Views
Last Modified: 2011-09-20
Hi all,

I have multiple threads at the very same time trying to insert the same record to one table (say TableA) like this:

INSERT INTO TableA (Col1)
SELECT TOP 1 B.Col1 FROM TableB B
WHERE NOT EXISTS (SELECT * FROM TableA A WHERE A.Col1 = B.Col1)

Currently, I use the default isolation level (READ COMMITTED) and found that more than 1 threads are able to insert the same record from TableB into TableA, why?

Do I need to use REPEATABLE READ or SERIALIZABLE instead?

Thank you very much,
Mr_Peerapol
0
Comment
Question by:Mr_Peerapol
  • 6
  • 5
  • 3
14 Comments
 
LVL 60

Expert Comment

by:chapmandew
ID: 21858533
>>and found that more than 1 threads are able to insert the same record from TableB into TableA,

an isolation level isn't going to stop this....it is just going to make the other processes wait before they can insert their data.  So, for your instance, why do you have different threads inserting the same data?
0
 
LVL 25

Author Comment

by:Mr_Peerapol
ID: 21858741
TableB contains available tickets.
TableA contains used tickets (the actual requirement is more complex than this).

Each thread tries to get one available ticket from TableB (by using NOT EXISTS). However during high load, I found that more than 1 thread get the same ticket.

I thought INSERT ... SELECT would be treated as one atomic statement and should not cause the problem I am facing now. Probably, I missed some basic database concept here.

Thank you,
0
 
LVL 60

Accepted Solution

by:
chapmandew earned 400 total points
ID: 21858779
OK, I understand now.  You're reading the data then doing something with it.  Your best bet is to use serializable for the transaction.  It may slow things down a bit as it is the most restrictive in terms of concurrency, but it should prevent these types of problems.  Give this a shot:

set transaction isolation level serializable

INSERT INTO TableA (Col1)
SELECT TOP 1 B.Col1 FROM TableB B
WHERE NOT EXISTS (SELECT * FROM TableA A WHERE A.Col1 = B.Col1)



0
 
LVL 25

Author Comment

by:Mr_Peerapol
ID: 21858856
Thank you very much. I will try it tomorrow when I am at my office (I guess SERIALIZABLE will do the trick).

Would you mind to explain why READ COMMITTED isn't enough for this case and why REPEATABLE READ isn't too?
0
 
LVL 60

Expert Comment

by:chapmandew
ID: 21858870
0
 
LVL 25

Author Comment

by:Mr_Peerapol
ID: 21859019
Thank you very much for a good article.

Do you have a better idea how to do this without SERIALIZABLE? My only requirement is that I cannot add any new column to TableB.

If you mind, I can open a new question.
0
 
LVL 60

Expert Comment

by:chapmandew
ID: 21859061
Do you have any type of column in tableB that you can update when the ticket is reserved?  So, if there is a record there and someone reserves it, mark it so that no other thread can pick that specific record.  Then, you can just take that record and insert into tableA.

Also, using 2000 or 2005?
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 25

Author Comment

by:Mr_Peerapol
ID: 21859164
I am using SQL Server 2000. Actually TableB has a column "CONFIRM_USER" to indicate who "confirms" the ticket. The workflow is that after a thread get an available ticket, a user will confirm it (my program will update CONFIRM_USER column.

I don't want to use this column when I "reserve" a ticket (update it to something like 'reserved by XXX' for example) because other programs also use this column and they expects it to have a value only when the ticket is confirmed.

If this is just the only option, I will need to trade off between performance and big changes.

Thank you very much again.
0
 
LVL 60

Expert Comment

by:chapmandew
ID: 21859277
Its not the only option, but it is an option.

Another would be to put a unique constraint on the col1 table in tableA...at least then there would be no duplicates.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 21859417
I would think a lock hint on table b during the process would be enough, thus serializing only that table, not all of them in the transaction:


INSERT INTO TableA (Col1)
SELECT TOP 1 B.Col1 FROM TableB B WITH (UPDLOCK, HOLDLOCK)
WHERE NOT EXISTS (SELECT * FROM TableA A WHERE A.Col1 = B.Col1)
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 100 total points
ID: 21859432
>>  I use the default isolation level (READ COMMITTED) and found that more than 1 threads are able to insert the same record from TableB into TableA, why? <<

The dual reads are coming from tableB.  All the data on B for the relevant row could be fully committed, and thus available to be read, and it could still be read by multiple transactions simultaneously trying to insert into tableA.  That is, since the data in tableB is already committed, reading it mutliple times does not violate "READ COMMITTED" locking, since multiple reads of the same row can be successfully shared between different tasks.
0
 
LVL 60

Expert Comment

by:chapmandew
ID: 21859455
Yep.  The serializable isolation level is the way to go...but, you could get around it if you just added a unique constraint to your tableA field.
0
 
LVL 25

Author Comment

by:Mr_Peerapol
ID: 21862475
I could not add Unique Constraint on TableA (TICKET_ID, LOCK_STATUS) because multiple records can have the same TICKET_ID and LOCK_STATUS('N') when user rejected the ticket several times. I need some magic constraint for thencolumn TICKET_ID only when LOCK_STATUS is 'Y' ?

SERIALIZABLE is working  for me now, but it causes many DEADLOCKs during high load (more than 100+ concurrent users).
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 21865846
Which is why you should limit seriablizable to as few tables as necessary.
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Join & Write a Comment

Suggested Solutions

In this article—a derivative of my DaytaBase.org blog post (http://daytabase.org/2011/06/18/what-week-is-it/)—I will explore a few different perspectives on which week today's date falls within using Microsoft SQL Server. First, to frame this stu…
Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now