We help IT Professionals succeed at work.

Stored Procedure that gives me "dirty data?

jawsdk
jawsdk asked
on
335 Views
Last Modified: 2012-05-06
I have a problem with a stored procedure that gives me "dirty data" - at least that it what I think it must be. I have a small piece of software (a small EXE GUI) running on multiple servers.

What it does is to poll a queue table for new files that should be processed. I have 4 servers and they each poll the same database every 1 second.

The table looks like this:

CREATE TABLE queue_list(
      queue_id INT IDENTITY(1,1) NOT NULL,
      file_name VARCHAR(255) NOT NULL,
      engine_id TINYINT,
      CONSTRAINT [PK_queue_list] PRIMARY KEY CLUSTERED(queue_id)
)


I have a Stored Procedure that is executed at every poll. This will return 1 row, that should be processed on the engine/server that is asking. That row/queue record will be locked at the same time (fetching the queue_id), so I avoid 1 file being processed by multiple servers at the same time. But this is where it goes wrong - because appearently multiple servers ARE some times processing the same file (same queue_id) - but only some times a day this happens, not always.

This is the stored procedure:

CREATE PROCEDURE spGetQueue(
      @EngineID TINYINT
)
AS
SET XACT_ABORT ON
BEGIN TRANSACTION
SET NOCOUNT ON

DECLARE @QueueID INT

/* fetch first record to be processed, oldest first */
SELECT TOP 1 @QueueID = queue_id
FROM queue_list
WHERE engine_id IS NULL
ORDER BY queue_id ASC

/* lock this queue_id, for processing only by this EngineID */
UPDATE queue_list
SET engine_id = EngineID
WHERE queue_id = @QueueID

/* select this queue_id and output for processing by this EngineID */
SELECT file_name
FROM queue_list
WHERE queue_id = @QueueID

COMMIT TRANSACTION
GO

As you can see from above Stored Procedure I use transactions AND "SET XACT_ABORT ON" which I believed locks data until a commit? But I'm quite sure it's here it goes wrong - I just cannot come up with a way to solve it - I'm not very good at the different transaction types. I'm using SQL Server 2000.

Can you please help me. Thank you very much.
Comment
Watch Question

CERTIFIED EXPERT
Awarded 2008
Awarded 2008

Commented:
you can probably set your isolation level to repeatable read and this will correct your problem.
Database Consultant
CERTIFIED EXPERT
Top Expert 2009
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
CERTIFIED EXPERT
Awarded 2008
Awarded 2008
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Thanks for your fast response. But this is where my SQL knowledge is not enough. I can read about many types of locks. Should I use repeatable read or tablockx ? As I have 4 engines executing this tored procedure every second on the same database I need the solution that "locks less" so I avoid too much waiting time. Imagine having 100 engines and the stored procedure execution time will take 100 ms, what would be the best method to use?

Is tablockx only locking this specific queue_id? so if another engine is also executing at the same time, it will not wait for the other stored procedure to end, but simply just select the queue_id right after, because the previous one was locked?

I look forward hearing your comments.

Author

Commented:
BrandonGalderisi and chapmandew. Sorry I didn't read your posts.

I get your idea BrandonGalderisi, but it seems like performance overhead to create table variable and doing updates and inserts, for such stored procedure that is execute every second by 4 engines? I'm always thinking ahead - what if there was 100 engines and they executed this stored procedure every second? Would this sulution still be ok?

chapmandew, can you please send exact code, so I can figure out precisely how you would do this. Thank you very much.
The table variable is the fastest, and easiest way, to capture the single record that you are updating.  It will be no more of a performance hit than a select/update/select.
AneeshDatabase Consultant
CERTIFIED EXPERT
Top Expert 2009

Commented:
>Is tablockx only locking this specific queue_id?
No, that will put an exclusive lock on the table till the end of the transaction, that means no other operations can be performed till the end of that transaction

Author

Commented:
Hi guys
As usual I should just pass on the real stored procedure instead of simplifiying things. I noticed that I actually left out a parameter in the stored procedure that you probably would like to know of. I have the option (1) ONLY to check if there is new files to process or (2) actually check AND lock a queue_id.

I'm a bit in doubt about how I would re-arrange my code in this case?


CREATE PROCEDURE spGetQueue(
	@EngineID TINYINT,
	@OnlyCheckForChanges TINYINT
)
AS
SET XACT_ABORT ON
BEGIN TRANSACTION
SET NOCOUNT ON
 
DECLARE @QueueID INT
 
/* fetch first record to be processed, oldest first */
SELECT TOP 1 @QueueID = queue_id
FROM queue_list
WHERE engine_id IS NULL
ORDER BY queue_id ASC
 
IF @OnlyCheckForChanges = 0 --lock this queue_id for processing
BEGIN
	/* lock this queue_id, for processing only by this EngineID */
	UPDATE queue_list
	SET engine_id = EngineID
	WHERE queue_id = @QueueID
 
	/* select this queue_id and output for processing by this EngineID */
	SELECT file_name
	FROM queue_list
	WHERE queue_id = @QueueID
END
ELSE --just tell that there is queue_id available for processing, but do not lock this yet
BEGIN
	SELECT file_name
	FROM queue_list
	WHERE queue_id = @QueueID
END
 
COMMIT TRANSACTION
GO

Open in new window

CERTIFIED EXPERT
Awarded 2008
Awarded 2008

Commented:
From looking at your proc code, the code submitted by Brandon is the way to go for sure.  By updating a single record and capturing it, you don't have to worry about 2 different SQL statements.

Author

Commented:
Brandon: I can see you are using "UPDATE top (1)" and I seem to read this is a SQL 2005 only feature. I'm using SQL Server 2000 (but somehow EE though it should be removed from their groups). Is there any workaround for SQL Server 2000 ?

Author

Commented:
I also noted right now, that "output inserted" is just as well an SQL Server 2005 function. It seems like I will need to think in another approach. What would your recommendation be for SQL Server 2000 ?

I found that SET ROWCOUNT 1 can be used instead of TOP(1), but now I don't know how to handle "output inserted" for SQL Server 2000 ?

Author

Commented:
Couldn't I use use the following select?

SELECT TOP 1 @QueueID = queue_id
FROM queue_list WITH(ROWLOCK,READPAST)
WHERE engine_id IS NULL
ORDER BY queue_id ASC

I n my head this would lock this 1 row, while reading (and until the transaction is committed). Readpast will allow concurrency when executing this stored procedure many times withnin the same second, because it will not wait for the ROWLOCK to be released, but simply just read the next record instead. Correct or am I missing something?

Author

Commented:
Hi guys
I posted a few ideas yesterday. I would appreciate help and this is quite urgent - I have therefore increased point to 200. I hope you can help me. Thank you very much.

The funny thing is that I don't understand why I'm able to get same QueueID anyway, on 2 servers? I tried the following using my original code:

-- number #1
begin transaction
execute spGetQueue 1
[No commit transaction because I would like to test the scenario where this stored procedure is execute at the exact same time. So I keep the connection open and make a cmiit AFTER I have executed #2 below.]

Then I opened up another QueryAnalyzer window and did the following:
-- number #2
execute spGetQueue 2

But I can see that currently #2 waits just fint until #1 had a commit, so in my test it should impossible actually to get the same QueueID. Then how can it be I'm actually getting same queueOID today, with my 4 servers executing this stored procedure every second?
Well neither of those calls are valid because you aren't specifying a value for @OnlyCheckForChanges which has no default.

If @OnlyCheckForChanges is anything other than 0, then you will always get the same QueueID call after call.

Author

Commented:
Sorry I gave you an example with my original code. I did this with the updated code that has the @OnlyCheckForChanges = 0, which means it performs an update. I think the update makes it lock, so I cannot complete the #2 execution until #1 has been committed, correct?

Author

Commented:
Hi Guys
I never seemed to get an answer to how the "UPDATE top (1)" works for SQL Server 2000 as I'm using? Any ideas on a workaround on this?

Thank you very much.
I'm pretty sure you can't "update top 1" in 2000.  You have to either set rowcount 1 before the update or use a sub-select with a top 1 to limit the results.


The sub-select would look like:
UPDATE top (1) queue_list
SET engine_id = @EngineID
WHERE QueueId = (select top 1 QueueId from Queue_List where engine_id is null)
order by queueid asc 

Open in new window

Author

Commented:
Thanks Brandon. But this doesn't give me the affected QueueID which I need in a SELECT right after. Any idea how to get the affected QueueID of the UPDATE using either "rowcount" or sub-select method in 2000 ?
It should be sufficient to put the begin tran, select, update, commit in.  Did you try aneesh's code?

Author

Commented:
Ok, I will go for that way then. No I didn't want to use aneesh code because it locks the complete table. I tried "WITH" keyword and ROWLOCK, but coultn't get it to work - it gave me an error message about a special transaction isolation level which I needed, but couldn't also not get this to work.

I would love to use ROWLOCK, because I think that will solve my problem. Do you have any experience with this?
You are going to lock, update commit.  So assuming your table is properly indexed, that won't take more than a few ms.

Author

Commented:
But why can't I use rowlock? This would be the ideal situation in a queue table. In a queue it's important my engines gets a record in the queue - but it does not matter which one, just the first one available. So in my head using WITH(ROWLOCK,READPAST)" would be the ideal trick. Have you tried this before?

When ROWLOCK exist, it would be creazy to lock the complete table using TABLOCKX, right?
"error message about a special transaction isolation level which I needed"

Perhaps provide the error message.  

Author

Commented:
You guys leaded me into the area of what to do. I got no perfect solution to my problem, so I have split the points.

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.