Solved

SQL Server atomic operations - locking table

Posted on 2015-02-05
12
304 Views
Last Modified: 2015-02-20
Hi Experts,

I have a table called tasks that I'd like to grab about 5 tasks from (from N machines).  I want to basically do the following:

- select Y rows, where there's a column in there that gives me the job size as a number, and I want to stop when Y grows greater than say 100
- mark those rows as job started

This query / update is happening from multiple machines, so I need to lock the table and/or do a transaction.  What's the right approach?

Thanks a lot!
Mike
0
Comment
Question by:thready
  • 6
  • 3
  • 2
  • +1
12 Comments
 
LVL 24

Expert Comment

by:Phillip Burton
ID: 40591156
You could use WITH (TABLOCK) - that will lock the table until the end of the statement.

However, I don't know that you would need that.

If you have intermediate stages - say computer X needed to keep hold of these rows and do something else, then you can mark them as "Used by Computer X". That way, when computer Y comes, if you have a WHERE Status IS NULL, then it wouldn't use those rows.
0
 
LVL 1

Author Comment

by:thready
ID: 40591201
I just need to set one column to the machine handling that row and mark it to started.  I guess I don't need to worry about making it to started if I know there's a machine already handling it...

Thank you I'll read about tab lock...
0
 
LVL 24

Accepted Solution

by:
Phillip Burton earned 200 total points
ID: 40591236
If you just need to do that, then you don't need to worry about locking the table - it's overkill, but you can do it if you are concerned about conflicts.

If you update a row, then you get sufficient row locks and page locks automatically.
0
 
LVL 26

Expert Comment

by:Zberteoc
ID: 40591342
We have a similar table, called task, into we insert rows that will be dealt with from different application and by multiple users. Normally a task is available if a certain column, called status, is null and as soon as task are required by some operator status is changed to locked(could be started), a column called account will have the code for who locked(started) teh task along to a startetime value in a different column. After the task is finished the status is changed to finished and a end time column is updated. So we have info about the task status, account id, start time and end time. You cannot get a task that has status column that is not null. Definitely there is no need to lock the table as SQL engine does a pretty good job taking care of that already.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 40591661
In order to do this without deadlocking, you'll probably need to use the WITH (READPAST) hint when you get rows to be assigned to the current machine.  Any rows that were updated earlier as assigned to another task will simply be skipped over and other, available rows selected.
0
 
LVL 1

Author Comment

by:thready
ID: 40597113
Hi everyone, thanks for your responses.  I haven't had a chance to try these yet.  As soon as I start work on this I will return!

Thanks for your patience.
Mike
0
Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

 
LVL 1

Author Comment

by:thready
ID: 40621280
I'm not sure I entirely understand what I need to do here.

Zberteoc, from your answer, it seems I need to execute 2 statements?  One to update the row so that "I" can reserve that task for me- then query rows that have been freshly reserved to me?  And then update them yet again when I've started them to "started" so that I know not to start the same task over and over?

Scott- where does the deadlocking come in?

Is there an atomic select/update I can use?

Thanks a lot,
Mike
0
 
LVL 1

Author Comment

by:thready
ID: 40621308
Scott - I think I just clicked - you're referring to Zberteoc's answer- I try locking the rows with a statement like:

UPDATE task set owner='mike', status='reserved' where owner='null' limit 10

then
SELECT * from task where owner='mike' AND status='reserved'

then for each task:
UPDATE task set status='started' where taskid=1  (or whatever)

Doesn't seem like the best way.  Seems brute force.. But does this even work in all cases?

Thanks again,
Mike
0
 
LVL 26

Expert Comment

by:Zberteoc
ID: 40621354
This is the process how I envision it. A task in table will have few important columns:

TaskId, unique per task, identifies the task, PK.
TaskType, if you have different tasks types/groups
ReferenceId, this will be a reference to some source for which the task was created for. A task by itself cannot be if is not related to an activity, resource, entity, etc. Usually point to a different table PK
TaskStatus. this will mark the different phases or steps the task is flowing through
CreatedDate, moment when a task is created
StartTime. moment when the task is assigned to somebody/operator(also status will change)
EndTime, moment the status is finished by operator(status will change to "Finished" or some code
Resolution, this will either describe the result of the task or will point to some code of possible resolutions
OperatorId, this will be the id of the person that the task was assigned to

The process flow would be something like this:

1. Create a task into a task table. At this point the TaskStatus will be set to, let's say, "Created" or 1(which means unlocked), if you use codes, and the CreatedDate column will be updated with that moment in time. Also the ResourceId will be set up. This task is not assigned yet to anybody so it has no StartTime and no EndTime, they will be NULL, as well as with the OperatorId. A task can stay idle until is assigned to somebody

2. Assign the task to somebody. At this point you select the a task or a number of tasks from the table that have a status of Created or 1 only (!!!) and you update the OperatorId, StartTime and TaskStatus to Started, or 2(which means locked).

2. The operator works on the task and when he/she finishes it you will update the columns EndTime, TaskStatus to Completed or 3 and the Resolution.


Done.

So you will touch the task 3 times: at creation(INSERT),  when it is assigned(UPDATE) and when is finished(UPDATE). A task is unlocked as long as it has a status of Created or 1 and locked for any other value.
0
 
LVL 1

Author Comment

by:thready
ID: 40621375
I realize now that I was unclear about something - this all makes sense - my concern is with #2 - Assign the task to somebody.  We have potentially 100 machines trying to do this step (there is no centralized control of this step).  That's why I'm worried about the UPDATE being atomic.  And I guess if I do it this way I will touch the task 4 times (creation, reservation, started, finished) - 1 insert, 3 updates...

So the first update to reserve - If I do an UPDATE on this- can it happen that other machines also set this value?

i.e.,
Machine 1 --> UPDATE tasks set status='reserved' where status='queued' limit 10

Machine 2 --> does the same...

It would be better if machine 1 got to update its rows without deadlocking or colliding with machine 2's updates...  I'm really wondering if there are any issues with this seeing as though there is no centralized control.

Thanks again,
Mike
0
 
LVL 26

Assisted Solution

by:Zberteoc
Zberteoc earned 300 total points
ID: 40621401
It will never happen at the same time because SQL server LOCKS the rows on UPDATE oration, and by that mean internally the SQL engine effectively puts a lock on the row based on a transaction id which will be always different form process/machine to process/machine.

So if process/machine 1 gets a hold on the update the process/machine 2 will put to wait until machine 1 finishes. After that the task row will have a status of Started, which means is already assigned(locked).

If you want to be the one that controls it then you will have to create a loose end process where every process/machine asks for tasks by inserting a request in a special table and then you will have an automated process that will deal sequentially every request. However this sounds  more like an overkill which is not necessary.
0
 
LVL 1

Author Closing Comment

by:thready
ID: 40621411
Thanks everyone for your help!  Much appreciated.
0

Featured Post

How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

Join & Write a Comment

Suggested Solutions

Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
Creating and Managing Databases with phpMyAdmin in cPanel.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now