Improve performance with newid()

I have a SQL table which contains 20 million+ rows.  I need to randomly pull records from this table to update them.
I have the following query that I run for this:

	SELECT top 1 @userId = userId
	from dbo.users
	where location is null
	order by newid()

Open in new window


This query is in a SQL procedure that is called from c#.  When I run a signal thread, this query takes less than half a second to complete.  However, when I have 10 threads going, it takes 6 seconds for each thread.

When I remove “ORDER BY NEWID()” the duration indicates 0.  However, I end up with all 10 threads working on the same user.

I got these numbers by using SQL profiler.

Each thread gets its own SQL Connection.  I leave the connection open for each thread so a new one is not created each time.

Does anyone have any suggestions on a faster method to do this?
rye004Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

dsackerContract ERP Admin/ConsultantCommented:
Is there any relationship at all between what user you SELECT and the thread you're in? In other words, is this a completely random exercise for no real business reason at all? Or are you using multiple threads to gradually update all of the records?

I ask in hopes there might be some rule you could use that connects the dots between your thread and what user(s) you might restrict yourself to.

Otherwise, if you truly want to have 10 threads (or even more), you could create a not-so-temporary tracking table (perhaps call it tt_users), where each thread would INSERT the userid they just pulled, then after they updated the dbo.users table, delete the tt_users record for that userid.

That way, all threads could run this:
SELECT top 1 @userId = userId
	from dbo.users
	where location is null
        and userid NOT IN (SELECT userid FROM tt_users)

Open in new window

Thus, you can lose the NEWID(). This is somewhat of a classic approach to OLTP updates of a nature, perhaps not entirely like this, but where multiple threads want to safeguard against each other.

The tt_users might need to contain a datetime field, and if you have any unique connection information per thread, its value as well, simply for audit and potential cleanup, in case of a disconnect (which happens).
0
Mark ElySenior Coldfusion DeveloperCommented:
Alter your procedure and Add
WITH RECOMPILE
before the query
0
Duy PhamFreelance IT ConsultantCommented:
I would suggest you add an identity column to your [users] table (if it hasn't had one), create an index for that column. Then to randomly get a record, generate a random number and select an user with identity value closed to the random number.

Something like:
DECLARE @maxIdentity INT = (SELECT MAX(IdentityColumn) FROM [dbo].[users]) + 1
DECLARE @randomNumber = ABS(CHECKSUM(NEWID())) % @maxIdentity
SELECT TOP 1 @userId = UserId FROM [dbo].[users] WHERE IdentityColumn <= @randomNumber

Open in new window

0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Vitor MontalvãoMSSQL Senior EngineerCommented:
You're only showing us a small part of your code. What do you want to achieve? Maybe having all the picture you can provide you a better solution.
0
rye004Author Commented:
Forgive me for not responding back sooner. What I ended up doing is have each thread collect more than one record at a time. There is some overlapping, but it is  minimal. This also reduces MySQL calls.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
rye004Author Commented:
after trying everyone's suggestion, what I came up with worked the best.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.