cp30
asked on
Thread safe stored procedure
Hi,
I'm working with SQL 2008 R2.
I need a stored that will look through records in a table, identify the next one that should should be processed, mark that record so that no other calls to the same sp can be given the same record. This procedure will be called by a multi-threaded application so it could be called at the same time by multiple threads so it needs to cater for that.
Here's what I came up with and thought it had worked but I've found that it seems possible for 2 calls to the sp at the same time to both be given the same record back....
Can someone please spot what I've done wrong, or suggest a better way to achieve this functionality?
Many thanks in advance
I'm working with SQL 2008 R2.
I need a stored that will look through records in a table, identify the next one that should should be processed, mark that record so that no other calls to the same sp can be given the same record. This procedure will be called by a multi-threaded application so it could be called at the same time by multiple threads so it needs to cater for that.
Here's what I came up with and thought it had worked but I've found that it seems possible for 2 calls to the sp at the same time to both be given the same record back....
DECLARE @threadId int
BEGIN TRANSACTION
-- Get the next available thread
SELECT TOP 1 @threadId = threadId FROM vwThreads
WHERE timeToProcess < GETDATE()
AND thread_status = 'waiting'
ORDER BY last_pulled DESC
-- Update thread to prevent others from taking
UPDATE tblThreads
SET server_name = @serverName, thread_status = 'pulled',last_pulled = GETDATE()
WHERE threadId = @threadId
COMMIT TRANSACTION
-- Now select the full row of data for our thread to send back to application
SELECT TOP 1 * FROM vwThreads
WHERE threadId = @threadId
Can someone please spot what I've done wrong, or suggest a better way to achieve this functionality?
Many thanks in advance
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi ryanmccauley
Thanks for that, that sounds like a good way to go, didn't realise I could do it all in one statement. I will try to implement this into test system early next week and report back on how successful it has been in curing my problem.
Cheers
Thanks for that, that sounds like a good way to go, didn't realise I could do it all in one statement. I will try to implement this into test system early next week and report back on how successful it has been in curing my problem.
Cheers
ASKER
Hi,
Just trying to implement your suggestion and got a little stuck on how to ensure that the 1 record I get is the record with the oldest last_pulled date, was trying similar to the following but get an error on the order by
Thanks
Just trying to implement your suggestion and got a little stuck on how to ensure that the 1 record I get is the record with the oldest last_pulled date, was trying similar to the following but get an error on the order by
UPDATE TOP (1) tblThreads
SET
server_name = @serverName,
thread_status = 'pulled',
last_pulled = GETDATE(),
@threadId = threadId
WHERE
timeToProcess < GETDATE() AND thread_status = 'waiting'
ORDER BY last_pulled DESC
Thanks
That's a bit tricky - you can't use an ORDER BY in an UPDATE statement.
The way we solved this problem was to put a clustered index on time column in our processing queue table, forcing SQL Server to sort the physical data in ascending order on that key. While I always learned not to rely on the order of results in a statement that didn't include an ORDER BY clause, in practice, this effectively returns the oldest row in the table. It's not an official solution, but since order wasn't really critical for us and was really just a suggestion (best-effort instead of true FIFO), this took care of us.
Officially, the answer is to use a sub-query, like this:
That way, your inner query finds the oldest "Last_Pulled" value in the table (where the row is "waiting", whatever that means in your application) and then your outer query only pulls a single row with that value, so you'll get the oldest row in teh table (or one of, if there are multiples with that same datetime).
I hope this answers your question!
The way we solved this problem was to put a clustered index on time column in our processing queue table, forcing SQL Server to sort the physical data in ascending order on that key. While I always learned not to rely on the order of results in a statement that didn't include an ORDER BY clause, in practice, this effectively returns the oldest row in the table. It's not an official solution, but since order wasn't really critical for us and was really just a suggestion (best-effort instead of true FIFO), this took care of us.
Officially, the answer is to use a sub-query, like this:
UPDATE TOP (1) tblThreads
SET
server_name = @serverName,
thread_status = 'pulled',
last_pulled = GETDATE(),
@threadId = threadId
FROM tblThreads t1
JOIN (select min(last_pulled) as last_pulled from tblThreads
where thread_status = 'waiting') t2
ON t1.last_pulled = t2.last_pulled
WHERE
timeToProcess < GETDATE() AND thread_status = 'waiting'
That way, your inner query finds the oldest "Last_Pulled" value in the table (where the row is "waiting", whatever that means in your application) and then your outer query only pulls a single row with that value, so you'll get the oldest row in teh table (or one of, if there are multiples with that same datetime).
I hope this answers your question!
ASKER
Thanks, I think I will try the subquery approach as don;t think we can add a clustered index on that column. I assume this should still be thread safe even with the subquery as it's all executed under on transaction?
Thanks
Thanks
Correct - it's all executed atomically. I've used both approaches and I don't believe we've ever had an issue where the same record is acted on twice in either approach.
ASKER
Great stuff, thanks for the prompt, thorough and accurate advice.
ASKER
I actually managed to find something that I think may help, but some confirmation would be cool.
From what I've read, I should add WITH (UPDLOCK) to my initial select so that no other thread can get the same row until I have finished the transaction, so this should prevent my issue of 2 (or more) threads being able to pull down the same row (threadId)?
Cheers