

OpenGL active wait problem (100% CPU usage)

Posted on 2004-04-29
Medium Priority
Last Modified: 2013-12-06
I am seeing bizarre scheduling behaviour with OpenGL on different types of graphics hardware. There is apparently a big difference between NVIDIA and ATI OpenGL drivers
as far as blocking is concerned. On ATI cards, the glswapbuffers command appears to be a blocking command, which caused the original problem. OpenGL applications
using less than 5% CPU on a machine with an NVIDIA card were using 100% on a similar machine with an ATI card.

I introduced a deliberate wait in my render loop, measuring the render time and trying to sleep the process until just before the vertical retrace or, if that is disabled, until the next frame is due at the requested
frame rate. This works sometimes, but there are some quirks. For starters, the Windows thread-switching granularity and the precision of timers are very important,
but one can work around that, and so I did. The result was that the same reference app was now using about 20 to 25% CPU on systems with ATI cards.
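
A minimal sketch of the kind of frame limiter described above, in portable C++ rather than the Win32 calls the original app presumably used. The names `sleep_budget`, `run_frame`, and the `safety_margin` parameter are hypothetical; the margin stands in for whatever allowance is made for timer and scheduler granularity.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;
using Micros = std::chrono::microseconds;

// How long the render thread may sleep after finishing a frame.
//   frame_budget:  time between frames (about 16667 us for a 60 Hz target)
//   render_time:   measured time spent issuing this frame
//   safety_margin: assumed worst-case oversleep of the OS timer (caller's guess)
// Returns zero when there is not enough slack to sleep safely.
inline Micros sleep_budget(Micros frame_budget, Micros render_time,
                           Micros safety_margin) {
    assert(frame_budget.count() > 0);
    const Micros slack = frame_budget - render_time - safety_margin;
    return slack.count() > 0 ? slack : Micros{0};
}

// Runs one frame: calls render(), measures it, then sleeps for the slack.
// Returns the sleep duration that was requested.
template <typename RenderFn>
Micros run_frame(RenderFn&& render, Micros frame_budget, Micros safety_margin) {
    const auto start = Clock::now();
    render();  // placeholder for the real draw calls
    const auto render_time =
        std::chrono::duration_cast<Micros>(Clock::now() - start);
    const Micros nap = sleep_budget(frame_budget, render_time, safety_margin);
    if (nap.count() > 0) std::this_thread::sleep_for(nap);
    return nap;
}
```

The sleep only removes the slack before the swap call; if the driver still spins inside the swap itself, some CPU load remains, which would be consistent with the 20 to 25% figure reported here.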

But switching from the reference app to something more intensive demonstrated yet another problem. It appears that many OpenGL commands can block the CPU on
an ATI-equipped machine. On NVIDIA, the commands return immediately, whereas on ATI the CPU blocks on unpredictable OpenGL commands. NVIDIA has the NV_fence
extension that allows me to do fine-grained synchronisation, so I have no problems there, but there's no equivalent on ATI, so I'm all out of ideas as to how to get
reasonable CPU usage on ATI-equipped systems.
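
As an illustration of the fence-based approach mentioned above, here is a hedged sketch of a polling wait that gives up the CPU between tests. The helper name `wait_politely` is hypothetical, and the predicate is left generic so the block runs without OpenGL headers; with NV_fence it would presumably wrap `glTestFenceNV(fence)` after a `glSetFenceNV(fence, GL_ALL_COMPLETED_NV)`.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Polls is_done() until it returns true or the timeout expires, sleeping
// poll_interval between tests so other threads can run while the GPU works.
// In a real renderer is_done would wrap a fence query (an assumption here;
// this sketch does not call OpenGL itself).
// Returns true if the condition was met, false on timeout.
template <typename Pred>
bool wait_politely(Pred&& is_done,
                   std::chrono::microseconds poll_interval,
                   std::chrono::milliseconds timeout) {
    assert(poll_interval.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!is_done()) {
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(poll_interval);
    }
    return true;
}
```

The trade-off is latency: each poll can oversleep by the scheduler's granularity, so the interval has to be short relative to the frame time.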

I was also wondering how triple buffering fits into this picture.
Question by:enkimute
LVL 17

Expert Comment

ID: 11033544
I can't say I've ever seen what you describe.  Both companies use command buffering schemes to pass data off to the GPU, and swapbuffers might be a sync point but should never hold up the CPU.  However, there are other gl commands like flush that will hard sync, as will any sort of readPixels or equivalent from the backbuffer.

However, it is possible that if you have VSYNC turned on, swapbuffers becomes a sync-point for ATI (and for some reason not with NV).  I can check with the ATI engineering folks if this is a problem.  I assume you are using a modern (radeon) ATI card, and latest drivers?  I also assume you are running double-buffered applications.  Windowed or fullscreen might make a difference.  Any other 'special' extensions being used might make a difference.

You should never wait-state yourself.  The drivers should take care of VSYNC if you in fact do want it, and the swap should happen IN HARDWARE in fullscreen at least.  Possible that windowed mode apps work differently with VSYNC enabled.


Author Comment

ID: 11038714
I've been testing on the latest Radeon and NVIDIA cards, with the latest drivers.

I am using OpenGL 1.4 with many extensions, but the problems occur even in very simple setups. Drawing a few primitives and calling swapbuffers will yield 100% CPU
usage on ATI cards, 0% on NVIDIA. I could send you samples, both source and binaries, but it's really too trivial.

I've been searching around on Google; there's an article somewhere about the Radeon Linux drivers that says they do an active wait until vsync because of timing issues.
I couldn't find anything about the situation on Windows, though.

I'm running my apps full screen, but there's no way of telling that to OpenGL, so unless the driver concludes it from the DC size, I guess the situation is the same windowed
or full screen. Maybe ATI always has triple buffering enabled, causing my app to generate too many frames, eventually resulting in a blocking call (if the command buffers
are full).

This problem has been puzzling me for a couple of weeks now. I just want my simple OpenGL applications not to clog up my CPU.
LVL 17

Expert Comment

ID: 11041079
It's amazing that the most trivial of cases can show the worst performance.. ;)

Both vendors have some sort of frame-ahead buffering of the rendering queue.  That's so when they start going really fast, they can keep processing things optimally.  It may be that you are rendering something so small, so rapidly, that the ATI driver is doing a 'hard wait', while the NV driver is doing a 'soft wait' -- i.e., the ATI driver does a spinloop of sorts, while the NV driver still has control of the process tree but is giving up cycles to other threads.  Or you have a call that is instigating the issue.

I'd be more than happy to look at a small sample -- and if it's really doing something that seems incorrect, I can forward it to my friends at ATI and get a 'proper answer' for why you are getting this particular results.



Author Comment

ID: 11041609
Good, I'll send it to you personally. Please drop me a mail at coding_nospamn_@mutefantasies.com. That would
be without the _nospamn_, of course. That way I have your email, as I can't find it here anywhere.
LVL 17

Expert Comment

ID: 11051400
I'll try to remember to forward this to ATI at some point, but here's the basic answer.

The GPU isn't doing any work.  All you do every frame is clear the buffer, which is an async flush in the hardware.  The NV driver probably does some sort of sleep internally, sleeping your process, if you get too far ahead of them.  The ATI driver is apparently hard-waiting when you get too far ahead.

The moral of the story is that you are running at something like 100,000 FPS or something extremely high, and at that point the driver is actually getting bottlenecked on the pure speed of requests, and does a block wait at some point.  When you start to add in real rendering code (and yes, triple buffering yourself), you shouldn't see any such bottleneck -- you'll always be doing enough work either in GPU rendering, or app-side logic, that you'll be below say 300fps (or more realistically, on the average machine, below 60fps).

Try adding in some basic amount of rendering into the Draw function.  Grab another few NeHe samples, see what it looks like.

If with 'real world' rendering submittals you continue to see an issue, you've got my email: ping me, and I'll DEFINITELY make sure that ATI takes a look.  Include a DXDIAG dump, as they'll want to see that.

Hope that helps!


Author Comment

ID: 11051736
Of course it goes wrong when I do draw something too; the point is that it _even_ goes wrong when you do nothing.
Furthermore, this app only runs free if you have vsync disabled. If vsync is enabled, it should run at exactly
the refresh rate, which it does, BUT it uses 100% CPU while waiting for the sync. So you call glswapbuffers,
and it not only locks the rendering thread (I'm fine with that), but it also does not allow _ANY_ thread to execute,
as the driver is performing an active wait (resulting in 100% CPU usage, no matter how little or how much you are
drawing).
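
One way to check whether a particular call is spinning rather than sleeping is to compare the CPU time it consumes with the wall-clock time it takes. The sketch below is an assumption-laden illustration, not the test used in this thread: `cpu_to_wall_ratio` is a hypothetical helper, and it relies on `std::clock()` reporting process CPU time, which holds on POSIX systems but not with the Microsoft C runtime (where `clock()` returns elapsed wall time, so `GetThreadTimes` or `GetProcessTimes` would be needed instead).

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

// Ratio of process CPU time to wall-clock time spent inside fn().
// Near 1.0 suggests a busy wait; near 0.0 suggests the thread was blocked
// or asleep. Assumes std::clock() measures CPU time (true on POSIX).
template <typename Fn>
double cpu_to_wall_ratio(Fn&& fn) {
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();
    fn();  // e.g. the buffer swap call under suspicion
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    const double cpu_s =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    const double wall_s =
        std::chrono::duration<double>(wall_end - wall_start).count();
    assert(wall_s >= 0.0);
    return wall_s > 0.0 ? cpu_s / wall_s : 0.0;
}
```

Wrapping only the swap call this way would separate time burned inside the driver from time spent in the application's own drawing code.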

But I'll send you a somewhat more extensive example that loads some models and does some rendering ...

really want to get to the bottom of this ...

grtz ..

LVL 17

Expert Comment

ID: 11053120
okay.  I'll take a look and try pinging ATI.  It will be at 100% CPU if you are basically doing nothing but thousands of frames per second of data.  If you are running at a framerate LESS than the vsync, and have vsync enabled, and it waits in the driver, that's bad.  If you are running OVER the vsync rate, which means you are providing an updated frame before the current frame is potentially flipped, and you aren't triple buffered, it could stall awaiting the vsync (since it needs the backbuffer unlocked to render to it!).

Make sure to send me a DXDIAG too, just in case they want that.

LVL 17

Accepted Solution

davebytes earned 1500 total points
ID: 11069443
I did send a ping off to ATI -- haven't gotten a response back as yet.  With E3 this week, and the X800 just out, they could be backed up (and mine was caveated as a low-priority question!).

Will let you know if I hear anything.


Author Comment

ID: 11072688
Thanks a bunch. I've been kind of very busy the last couple of days, but I'll try to wrap up a nice little demo this weekend, so that we have something to test from.
So you can expect that in your mailbox one of these days.



Question has a verified solution.
