?
Solved

Stored Procedure that gives me "dirty data?

Posted on 2009-05-11
25
Medium Priority
?
300 Views
Last Modified: 2012-05-06
I have a problem with a stored procedure that gives me "dirty data" - at least that it what I think it must be. I have a small piece of software (a small EXE GUI) running on multiple servers.

What it does is to poll a queue table for new files that should be processed. I have 4 servers and they each poll the same database every 1 second.

The table looks like this:

CREATE TABLE queue_list(
      queue_id INT IDENTITY(1,1) NOT NULL,
      file_name VARCHAR(255) NOT NULL,
      engine_id TINYINT,
      CONSTRAINT [PK_queue_list] PRIMARY KEY CLUSTERED(queue_id)
)


I have a Stored Procedure that is executed at every poll. This will return 1 row, that should be processed on the engine/server that is asking. That row/queue record will be locked at the same time (fetching the queue_id), so I avoid 1 file being processed by multiple servers at the same time. But this is where it goes wrong - because appearently multiple servers ARE some times processing the same file (same queue_id) - but only some times a day this happens, not always.

This is the stored procedure:

CREATE PROCEDURE spGetQueue(
      @EngineID TINYINT
)
AS
SET XACT_ABORT ON
BEGIN TRANSACTION
SET NOCOUNT ON

DECLARE @QueueID INT

/* fetch first record to be processed, oldest first */
SELECT TOP 1 @QueueID = queue_id
FROM queue_list
WHERE engine_id IS NULL
ORDER BY queue_id ASC

/* lock this queue_id, for processing only by this EngineID */
UPDATE queue_list
SET engine_id = EngineID
WHERE queue_id = @QueueID

/* select this queue_id and output for processing by this EngineID */
SELECT file_name
FROM queue_list
WHERE queue_id = @QueueID

COMMIT TRANSACTION
GO

As you can see from above Stored Procedure I use transactions AND "SET XACT_ABORT ON" which I believed locks data until a commit? But I'm quite sure it's here it goes wrong - I just cannot come up with a way to solve it - I'm not very good at the different transaction types. I'm using SQL Server 2000.

Can you please help me. Thank you very much.
0
Comment
Question by:jawsdk
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 13
  • 7
  • 3
  • +1
25 Comments
 
LVL 60

Expert Comment

by:chapmandew
ID: 24354428
you can probably set your isolation level to repeatable read and this will correct your problem.
0
 
LVL 75

Accepted Solution

by:
Aneesh Retnakaran earned 300 total points
ID: 24354619
try putting a TABLOCKX  on the first statement


CREATE PROCEDURE spGetQueue(
      @EngineID TINYINT
)
AS
SET XACT_ABORT ON
SET NOCOUNT ON
DECLARE @QueueID INT

BEGIN TRANSACTION

/* fetch first record to be processed, oldest first */
SELECT TOP 1 @QueueID = queue_id
FROM queue_list WITH (TABLOCKX)
WHERE engine_id IS NULL
ORDER BY queue_id ASC

/* lock this queue_id, for processing only by this EngineID */
UPDATE queue_list
SET engine_id = EngineID
WHERE queue_id = @QueueID

COMMIT TRANSACTION

/* select this queue_id and output for processing by this EngineID */
SELECT file_name
FROM queue_list
WHERE queue_id = @QueueID

GO
0
 
LVL 39

Assisted Solution

by:BrandonGalderisi
BrandonGalderisi earned 150 total points
ID: 24354660
You can also do an update capture.
CREATE PROCEDURE spGetQueue(
      @EngineID TINYINT
)
AS
SET XACT_ABORT ON
SET NOCOUNT ON 
DECLARE @QueueID INT
DECLARE @queues table
 (QueueID int
 ,fname varchar(50)
 ) 
/* lock this queue_id, for processing only by this EngineID */
UPDATE top (1) queue_list
SET engine_id = @EngineID
output inserted.queueid,inserted.filename into @queues(queueid,fname)
WHERE engine_id is null
order by queueid asc 
select @queueid = queueid from @queues 
/* select this queue_id and output for processing by this EngineID */
SELECT fname as file_name
FROM @queues 

GO

Open in new window

0
Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

 
LVL 60

Assisted Solution

by:chapmandew
chapmandew earned 150 total points
ID: 24354691
I wouldn't recommend the TABLOCKX because it will put an exclusive lock on the table...same as using serializable.  

You could use WITH(READPAST) with your queries so that you don't attempt to read records that are locked.
0
 

Author Comment

by:jawsdk
ID: 24354730
Thanks for your fast response. But this is where my SQL knowledge is not enough. I can read about many types of locks. Should I use repeatable read or tablockx ? As I have 4 engines executing this tored procedure every second on the same database I need the solution that "locks less" so I avoid too much waiting time. Imagine having 100 engines and the stored procedure execution time will take 100 ms, what would be the best method to use?

Is tablockx only locking this specific queue_id? so if another engine is also executing at the same time, it will not wait for the other stored procedure to end, but simply just select the queue_id right after, because the previous one was locked?

I look forward hearing your comments.
0
 

Author Comment

by:jawsdk
ID: 24354794
BrandonGalderisi and chapmandew. Sorry I didn't read your posts.

I get your idea BrandonGalderisi, but it seems like performance overhead to create table variable and doing updates and inserts, for such stored procedure that is execute every second by 4 engines? I'm always thinking ahead - what if there was 100 engines and they executed this stored procedure every second? Would this sulution still be ok?

chapmandew, can you please send exact code, so I can figure out precisely how you would do this. Thank you very much.
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24354849
The table variable is the fastest, and easiest way, to capture the single record that you are updating.  It will be no more of a performance hit than a select/update/select.
0
 
LVL 75

Expert Comment

by:Aneesh Retnakaran
ID: 24355186
>Is tablockx only locking this specific queue_id?
No, that will put an exclusive lock on the table till the end of the transaction, that means no other operations can be performed till the end of that transaction
0
 

Author Comment

by:jawsdk
ID: 24355762
Hi guys
As usual I should just pass on the real stored procedure instead of simplifiying things. I noticed that I actually left out a parameter in the stored procedure that you probably would like to know of. I have the option (1) ONLY to check if there is new files to process or (2) actually check AND lock a queue_id.

I'm a bit in doubt about how I would re-arrange my code in this case?


CREATE PROCEDURE spGetQueue(
	@EngineID TINYINT,
	@OnlyCheckForChanges TINYINT
)
AS
SET XACT_ABORT ON
BEGIN TRANSACTION
SET NOCOUNT ON
 
DECLARE @QueueID INT
 
/* fetch first record to be processed, oldest first */
SELECT TOP 1 @QueueID = queue_id
FROM queue_list
WHERE engine_id IS NULL
ORDER BY queue_id ASC
 
IF @OnlyCheckForChanges = 0 --lock this queue_id for processing
BEGIN
	/* lock this queue_id, for processing only by this EngineID */
	UPDATE queue_list
	SET engine_id = EngineID
	WHERE queue_id = @QueueID
 
	/* select this queue_id and output for processing by this EngineID */
	SELECT file_name
	FROM queue_list
	WHERE queue_id = @QueueID
END
ELSE --just tell that there is queue_id available for processing, but do not lock this yet
BEGIN
	SELECT file_name
	FROM queue_list
	WHERE queue_id = @QueueID
END
 
COMMIT TRANSACTION
GO

Open in new window

0
 
LVL 60

Expert Comment

by:chapmandew
ID: 24356052
From looking at your proc code, the code submitted by Brandon is the way to go for sure.  By updating a single record and capturing it, you don't have to worry about 2 different SQL statements.
0
 

Author Comment

by:jawsdk
ID: 24362247
Brandon: I can see you are using "UPDATE top (1)" and I seem to read this is a SQL 2005 only feature. I'm using SQL Server 2000 (but somehow EE though it should be removed from their groups). Is there any workaround for SQL Server 2000 ?
0
 

Author Comment

by:jawsdk
ID: 24362412
I also noted right now, that "output inserted" is just as well an SQL Server 2005 function. It seems like I will need to think in another approach. What would your recommendation be for SQL Server 2000 ?

I found that SET ROWCOUNT 1 can be used instead of TOP(1), but now I don't know how to handle "output inserted" for SQL Server 2000 ?
0
 

Author Comment

by:jawsdk
ID: 24362761
Couldn't I use use the following select?

SELECT TOP 1 @QueueID = queue_id
FROM queue_list WITH(ROWLOCK,READPAST)
WHERE engine_id IS NULL
ORDER BY queue_id ASC

I n my head this would lock this 1 row, while reading (and until the transaction is committed). Readpast will allow concurrency when executing this stored procedure many times withnin the same second, because it will not wait for the ROWLOCK to be released, but simply just read the next record instead. Correct or am I missing something?
0
 

Author Comment

by:jawsdk
ID: 24374849
Hi guys
I posted a few ideas yesterday. I would appreciate help and this is quite urgent - I have therefore increased point to 200. I hope you can help me. Thank you very much.

The funny thing is that I don't understand why I'm able to get same QueueID anyway, on 2 servers? I tried the following using my original code:

-- number #1
begin transaction
execute spGetQueue 1
[No commit transaction because I would like to test the scenario where this stored procedure is execute at the exact same time. So I keep the connection open and make a cmiit AFTER I have executed #2 below.]

Then I opened up another QueryAnalyzer window and did the following:
-- number #2
execute spGetQueue 2

But I can see that currently #2 waits just fint until #1 had a commit, so in my test it should impossible actually to get the same QueueID. Then how can it be I'm actually getting same queueOID today, with my 4 servers executing this stored procedure every second?
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24375343
Well neither of those calls are valid because you aren't specifying a value for @OnlyCheckForChanges which has no default.

If @OnlyCheckForChanges is anything other than 0, then you will always get the same QueueID call after call.
0
 

Author Comment

by:jawsdk
ID: 24375871
Sorry I gave you an example with my original code. I did this with the updated code that has the @OnlyCheckForChanges = 0, which means it performs an update. I think the update makes it lock, so I cannot complete the #2 execution until #1 has been committed, correct?
0
 

Author Comment

by:jawsdk
ID: 24429322
Hi Guys
I never seemed to get an answer to how the "UPDATE top (1)" works for SQL Server 2000 as I'm using? Any ideas on a workaround on this?

Thank you very much.
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24431606
I'm pretty sure you can't "update top 1" in 2000.  You have to either set rowcount 1 before the update or use a sub-select with a top 1 to limit the results.


The sub-select would look like:
UPDATE top (1) queue_list
SET engine_id = @EngineID
WHERE QueueId = (select top 1 QueueId from Queue_List where engine_id is null)
order by queueid asc 

Open in new window

0
 

Author Comment

by:jawsdk
ID: 24432312
Thanks Brandon. But this doesn't give me the affected QueueID which I need in a SELECT right after. Any idea how to get the affected QueueID of the UPDATE using either "rowcount" or sub-select method in 2000 ?
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24432709
It should be sufficient to put the begin tran, select, update, commit in.  Did you try aneesh's code?
0
 

Author Comment

by:jawsdk
ID: 24432820
Ok, I will go for that way then. No I didn't want to use aneesh code because it locks the complete table. I tried "WITH" keyword and ROWLOCK, but coultn't get it to work - it gave me an error message about a special transaction isolation level which I needed, but couldn't also not get this to work.

I would love to use ROWLOCK, because I think that will solve my problem. Do you have any experience with this?
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24433050
You are going to lock, update commit.  So assuming your table is properly indexed, that won't take more than a few ms.
0
 

Author Comment

by:jawsdk
ID: 24433156
But why can't I use rowlock? This would be the ideal situation in a queue table. In a queue it's important my engines gets a record in the queue - but it does not matter which one, just the first one available. So in my head using WITH(ROWLOCK,READPAST)" would be the ideal trick. Have you tried this before?

When ROWLOCK exist, it would be creazy to lock the complete table using TABLOCKX, right?
0
 
LVL 39

Expert Comment

by:BrandonGalderisi
ID: 24433352
"error message about a special transaction isolation level which I needed"

Perhaps provide the error message.  
0
 

Author Closing Comment

by:jawsdk
ID: 31580139
You guys leaded me into the area of what to do. I got no perfect solution to my problem, so I have split the points.
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

by Mark Wills PIVOT is a great facility and solves many an EAV (Entity - Attribute - Value) type transformation where we need the information held as data within a column to become columns in their own right. Now, in some cases that is relatively…
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
In this brief tutorial Pawel from AdRem Software explains how you can quickly find out which services are running on your network, or what are the IP addresses of servers responsible for each service. Software used is freeware NetCrunch Tools (https…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question