Link to home
Start Free TrialLog in
Avatar of Mr_Peerapol
Mr_Peerapol

asked on

Is "REPEATABLE READ" / "SERIALIZABLE" required for INSERT ... SELECT ... WHERE NOT EXISTS (...) ?

Hi all,

I have multiple threads at the very same time trying to insert the same record to one table (say TableA) like this:

INSERT INTO TableA (Col1)
SELECT TOP 1 B.Col1 FROM TableB B
WHERE NOT EXISTS (SELECT * FROM TableA A WHERE A.Col1 = B.Col1)

Currently, I use the default isolation level (READ COMMITTED) and found that more than 1 threads are able to insert the same record from TableB into TableA, why?

Do I need to use REPEATABLE READ or SERIALIZABLE instead?

Thank you very much,
Mr_Peerapol
Avatar of chapmandew
chapmandew
Flag of United States of America image

>>and found that more than 1 threads are able to insert the same record from TableB into TableA,

an isolation level isn't going to stop this....it is just going to make the other processes wait before they can insert their data.  So, for your instance, why do you have different threads inserting the same data?
Avatar of Mr_Peerapol
Mr_Peerapol

ASKER

TableB contains available tickets.
TableA contains used tickets (the actual requirement is more complex than this).

Each thread tries to get one available ticket from TableB (by using NOT EXISTS). However during high load, I found that more than 1 thread get the same ticket.

I thought INSERT ... SELECT would be treated as one atomic statement and should not cause the problem I am facing now. Probably, I missed some basic database concept here.

Thank you,
ASKER CERTIFIED SOLUTION
Avatar of chapmandew
chapmandew
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you very much. I will try it tomorrow when I am at my office (I guess SERIALIZABLE will do the trick).

Would you mind to explain why READ COMMITTED isn't enough for this case and why REPEATABLE READ isn't too?
Thank you very much for a good article.

Do you have a better idea how to do this without SERIALIZABLE? My only requirement is that I cannot add any new column to TableB.

If you mind, I can open a new question.
Do you have any type of column in tableB that you can update when the ticket is reserved?  So, if there is a record there and someone reserves it, mark it so that no other thread can pick that specific record.  Then, you can just take that record and insert into tableA.

Also, using 2000 or 2005?
I am using SQL Server 2000. Actually TableB has a column "CONFIRM_USER" to indicate who "confirms" the ticket. The workflow is that after a thread get an available ticket, a user will confirm it (my program will update CONFIRM_USER column.

I don't want to use this column when I "reserve" a ticket (update it to something like 'reserved by XXX' for example) because other programs also use this column and they expects it to have a value only when the ticket is confirmed.

If this is just the only option, I will need to trade off between performance and big changes.

Thank you very much again.
Its not the only option, but it is an option.

Another would be to put a unique constraint on the col1 table in tableA...at least then there would be no duplicates.
I would think a lock hint on table b during the process would be enough, thus serializing only that table, not all of them in the transaction:


INSERT INTO TableA (Col1)
SELECT TOP 1 B.Col1 FROM TableB B WITH (UPDLOCK, HOLDLOCK)
WHERE NOT EXISTS (SELECT * FROM TableA A WHERE A.Col1 = B.Col1)
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Yep.  The serializable isolation level is the way to go...but, you could get around it if you just added a unique constraint to your tableA field.
I could not add Unique Constraint on TableA (TICKET_ID, LOCK_STATUS) because multiple records can have the same TICKET_ID and LOCK_STATUS('N') when user rejected the ticket several times. I need some magic constraint for thencolumn TICKET_ID only when LOCK_STATUS is 'Y' ?

SERIALIZABLE is working  for me now, but it causes many DEADLOCKs during high load (more than 100+ concurrent users).
Which is why you should limit seriablizable to as few tables as necessary.