Windows Sleep Microseconds for Low Latency Thread Communication

Posted on 2012-04-05
Medium Priority
Last Modified: 2013-08-07
Is there any way to suspend a thread for 1-2 microseconds? (Note: 1 microsecond = 1/1000 millisecond.) I know this question has been asked before, but so far I have not seen a good answer. Here is why I need this:

I have a server with several threads running to serve clients' requests. Requests are sent occasionally by another thread through shared memory. However, once a request is sent, it needs to be processed ASAP. So I need a fast way to notify one of the serving threads when a request is queued in shared memory.

A Windows event object was used, but it takes 6-7us from SetEvent() to the return of WaitForSingleObject() on my machine (I tried raising the process/thread priority, but it did not improve much). I also tried a busy loop in which the serving threads keep polling the memory, which lowers the latency to 1-2us. That is good enough, but it burns the CPU while requests are only sent about once per minute. If I could insert a micro/nanosecond sleep into the loop, I could at least free up the CPU while keeping the latency low.
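One common compromise between the two approaches described above is to spin only for a bounded time and then fall back to a blocking wait. The sketch below is a minimal illustration of the bounded spin, not the asker's code: `spin_for_request`, `cpu_relax`, and the batch size of 64 are hypothetical choices, and the flag stands in for whatever marker is written to shared memory. With requests arriving only about once per minute the budget would usually expire first, so this probably helps mainly for the follow-up requests in a burst.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
#include <immintrin.h>
static inline void cpu_relax() { _mm_pause(); }  // tells the core we are in a spin loop
#else
static inline void cpu_relax() {}                // no pause hint on this target; plain spin
#endif

// Hypothetical helper: poll `flag` for at most `budget`.
// Returns true if the flag was seen set (and claims it by clearing it),
// false if the budget ran out and the caller should block instead.
bool spin_for_request(std::atomic<bool>& flag, std::chrono::microseconds budget) {
    assert(budget.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + budget;
    for (;;) {
        // Poll in small batches so that reading the clock does not dominate the loop.
        for (int i = 0; i < 64; ++i) {
            // Cheap read first; the exchange lets exactly one waiting thread claim the request.
            if (flag.load(std::memory_order_acquire) &&
                flag.exchange(false, std::memory_order_acq_rel)) {
                return true;
            }
            cpu_relax();
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
    }
}
```

A serving thread would call this first and, when it returns false, go on to the existing WaitForSingleObject() call, accepting the slower wake-up only in that case.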

I would be glad if anyone could suggest another way to do the thread communication with latency lower than 2us. Thanks.
Question by:codeblue229
LVL 16

Expert Comment

ID: 37815902
Have you tried Sleep(0) ?

It doesn't sleep for any set amount of time. What it is supposed to do is give up the remainder of the thread's time slice so that other ready threads can use the CPU.

Author Comment

ID: 37816351
It is similar to the Windows API SwitchToThread(): it yields execution to another thread, but CPU usage would still be at its peak. Because the thread is not actually put to sleep, the CPU resumes executing it right away and never goes idle.
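The point about yielding can be checked with a small experiment. The sketch below is only an illustration under assumptions: it uses `std::this_thread::yield()` as a portable stand-in for Sleep(0) / SwitchToThread(), which may not behave identically on every platform, and `count_yields` is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical helper: yield repeatedly for `window` and count how many
// times the yield call returned within that window.
std::uint64_t count_yields(std::chrono::microseconds window) {
    assert(window.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + window;
    std::uint64_t n = 0;
    do {
        std::this_thread::yield();  // assumed stand-in for Sleep(0) / SwitchToThread()
        ++n;
    } while (std::chrono::steady_clock::now() < deadline);
    return n;
}
```

On a mostly idle machine, a call such as `count_yields(std::chrono::microseconds(2000))` would be expected to return a large count, which suggests that each yield comes straight back and the core stays busy for the whole window.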
LVL 22

Accepted Solution

ambience earned 2000 total points
ID: 37817506
High-resolution timing has always been a problem under Windows, and in general there is no guaranteed way to achieve a microsecond-precise sleep for durations < 1ms. Windows 7 has User Mode Scheduling (see http://msdn.microsoft.com/en-us/library/windows/desktop/dd627187%28v=vs.85%29.aspx). I'm honestly unsure whether that's relevant or whether it could achieve higher performance than the system scheduler, but apparently it's designed to serve that purpose.

But even that would not guarantee anything, because by default the thread quantum on Windows NT-based systems is about 100 milliseconds (I believe for servers). This means that a thread can “hog” the CPU for up to 100 milliseconds before another thread has a chance to be scheduled and actually execute.
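The lack of a precise sub-millisecond sleep is easy to observe by requesting a very short sleep and timing it. This is a minimal sketch using the portable `std::this_thread::sleep_for` rather than a Win32 call; `measure_sleep` is a hypothetical helper, and the overshoot it reports depends on the OS timer resolution and scheduler.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: request a sleep of `requested` and return how long
// the call actually took, measured with a monotonic clock.
std::chrono::nanoseconds measure_sleep(std::chrono::nanoseconds requested) {
    assert(requested.count() >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Calling `measure_sleep(std::chrono::microseconds(2))` a few times and comparing the results with the requested 2us gives a rough idea of how far the platform overshoots such a request.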

BTW, why is it absolutely necessary to achieve a switching time of <2us? How much time does the worker take to process a request and send back a response (if at all)? Do you think 6us of switching delay can be included in the time it takes to process requests?

If you are going to say no, the response must be sent within 2us (for example), then it's perhaps better not to have worker threads at all?


On a different note, if you are running on a multiprocessor machine, then perhaps you can assign different processor affinities to your IO and worker threads so that they never compete for the same processor. It would work because even if a thread's quantum expires, it would get assigned to the same processor again and won't have to wait. That would only peak one of the CPUs, and the other one would be IO bound. I haven't tried such an arrangement, but it might work.
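The affinity arrangement suggested above could look roughly like the following sketch. The helper names (`single_cpu_mask`, `worker_cpu_mask`, `pin_current_thread`) are hypothetical; only `SetThreadAffinityMask` and `GetCurrentThread` are real Win32 calls, and they are compiled only on Windows. It assumes at most 64 logical CPUs in a single processor group, and, like the suggestion itself, it is untested as a latency fix.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Bit mask selecting a single logical CPU (0-based index).
std::uint64_t single_cpu_mask(unsigned cpu) {
    assert(cpu < 64);
    return std::uint64_t{1} << cpu;
}

// Mask for the worker threads: every CPU in [0, cpu_count) except io_cpu.
std::uint64_t worker_cpu_mask(unsigned cpu_count, unsigned io_cpu) {
    assert(cpu_count >= 2 && cpu_count <= 64 && io_cpu < cpu_count);
    const std::uint64_t all = (cpu_count == 64)
                                  ? ~std::uint64_t{0}
                                  : (std::uint64_t{1} << cpu_count) - 1;
    return all & ~single_cpu_mask(io_cpu);
}

// Applies `mask` to the calling thread.
// Returns false on failure, and always false on non-Windows builds (no-op there).
bool pin_current_thread(std::uint64_t mask) {
#ifdef _WIN32
    // SetThreadAffinityMask returns the previous mask, or 0 on failure.
    return SetThreadAffinityMask(GetCurrentThread(),
                                 static_cast<DWORD_PTR>(mask)) != 0;
#else
    (void)mask;
    return false;
#endif
}
```

The IO thread would call `pin_current_thread(single_cpu_mask(0))` at startup, and each worker would call `pin_current_thread(worker_cpu_mask(n, 0))`, where `n` is the number of logical CPUs.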

Author Comment

ID: 37818226
I wasn't aware of when they introduced User Mode Scheduling, but this is surely worth trying, although it is only supported in 64-bit applications. Actually, I am not quite sure why it takes 6-7us to do the switching on my fairly idle machine: is it normal context-switching overhead, or is it the implementation of the Windows event-wait mechanism? Having a chance to manage the scheduling might help find out. Thanks!

The worker threads simply call a blocking API, which immediately sends a request to another server and then blocks until the server responds. What matters is the time at which the request reaches the server. There is a fixed number of API instances (each with a different user logon), which together make up the throttle rate for serving a burst of several requests. That's why it has to be done in worker threads.
LVL 22

Expert Comment

ID: 37818831
But still, is trying to optimize for 3-4us worth the effort? Does the rest of the code provide guaranteed, bounded response times? I mean, given that delays can occur anywhere and a thread can hog the CPU ahead of all other threads, is this delay the only major obstacle?
