Solved

OpenGL active wait problem (100% CPU usage)

Posted on 2004-04-29
Last Modified: 2013-12-06
I am seeing bizarre scheduling behaviour with OpenGL on different types of graphics hardware. There is apparently a big difference between the NVIDIA and ATI OpenGL drivers
as far as blocking is concerned. On ATI cards, the SwapBuffers call appears to be a blocking call, which caused the original problem. OpenGL applications
using less than 5% CPU on a machine with an NVIDIA card were using 100% on a similar machine with an ATI card.

I introduced a wait of my own in my render loop, measuring the render time and trying to sleep the process until just before the vertical retrace or, if that is disabled, until the requested
frame time has elapsed. This works sometimes, though there are some quirks. For starters, the Windows thread-switching granularity and the precision of timers are very important,
but one can work around that and so I did. The result was that the same reference app was now using about 20 to 25% CPU on systems with ATI cards.
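
The sleep-before-retrace idea described above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: `sleep_budget`, `pace_frame`, and the `wake_margin` parameter (an assumed worst-case oversleep of the OS scheduler) are hypothetical names, and portable `std::chrono` calls stand in for whatever Windows timers were really used.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;
using Micros = std::chrono::microseconds;

// How long the render thread can safely sleep this frame.
//   frame_period: target time per frame (e.g. 16667 us for a 60 Hz refresh)
//   render_time:  measured time spent issuing this frame's commands
//   wake_margin:  assumed worst-case oversleep of the OS scheduler
Micros sleep_budget(Micros frame_period, Micros render_time, Micros wake_margin) {
    const Micros slack = frame_period - render_time - wake_margin;
    return slack > Micros::zero() ? slack : Micros::zero();
}

// Sleep for most of the remaining frame time, then yield until the deadline.
// Call this after the frame's commands have been issued.
void pace_frame(Clock::time_point frame_start, Micros frame_period, Micros wake_margin) {
    const auto deadline = frame_start + frame_period;
    const auto render_time =
        std::chrono::duration_cast<Micros>(Clock::now() - frame_start);
    std::this_thread::sleep_for(sleep_budget(frame_period, render_time, wake_margin));
    while (Clock::now() < deadline) {
        std::this_thread::yield();  // short tail wait that lets other threads run
    }
}
```

The margin trades CPU time for accuracy: a larger `wake_margin` means a longer yielding tail and more CPU use, while a smaller one risks sleeping past the deadline.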

But switching from the reference app to something more intensive demonstrated yet another problem. It appears that many OpenGL commands can block the CPU on
an ATI-equipped machine. On NVIDIA, the commands return immediately, whereas on ATI the CPU blocks on unpredictable OpenGL commands. NVIDIA has the NV_fence
extension that allows me to do fine-grained synchronisation, so I have no problems there .. but there's no equivalent on ATI, so I'm all out of ideas as to how to get
reasonable CPU usage on ATI-equipped systems.
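
For readers unfamiliar with fences, here is a minimal sketch of the polling pattern that a fence makes possible. It is not the poster's code. `wait_for_fence` and its `fence_done` callback are hypothetical; with GL_NV_fence the callback would presumably wrap `glTestFenceNV(fence)` for a fence previously placed in the command stream with `glSetFenceNV(fence, GL_ALL_COMPLETED_NV)`. The GL calls are left out so the example runs without a GL context.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Polls `fence_done` until it reports completion, sleeping between polls so
// the calling thread does not spin. Returns the number of polls performed.
// `fence_done` is a stand-in for a non-blocking fence query such as
// glTestFenceNV (assumption: the caller supplies that wrapper).
int wait_for_fence(const std::function<bool()>& fence_done,
                   std::chrono::milliseconds poll_interval) {
    int polls = 1;
    while (!fence_done()) {
        std::this_thread::sleep_for(poll_interval);
        ++polls;
    }
    return polls;
}
```

Because the query itself never blocks, the application decides how to wait, which is exactly the control that is missing when the driver blocks inside an arbitrary command.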

I was also wondering how triple buffering fits into this picture.
Question by:enkimute
LVL 17

Expert Comment

by:davebytes
I can't say I've ever seen what you describe.  Both companies use command buffering schemes to pass data off to the GPU, and swapbuffers might be a sync point but should never hold up the CPU.  However, there are other gl commands like flush that will hard sync, as will any sort of readPixels or equivalent from the backbuffer.

However, it is possible that if you have VSYNC turned on, swapbuffers becomes a sync-point for ATI (and for some reason not with NV).  I can check with the ATI engineering folks if this is a problem.  I assume you are using a modern (radeon) ATI card, and latest drivers?  I also assume you are running double-buffered applications.  Windowed or fullscreen might make a difference.  Any other 'special' extensions being used might make a difference.

You should never wait-state yourself.  The drivers should take care of VSYNC if you in fact do want it, and the swap should happen IN HARDWARE in fullscreen at least.  Possible that windowed mode apps work differently with VSYNC enabled.

d
 
LVL 2

Author Comment

by:enkimute
I've been testing on the latest Radeon and NVIDIA cards .. with the latest drivers.

I am using OpenGL 1.4 with many extensions, but the problems occur even in very simple setups. Drawing a few primitives and calling SwapBuffers yields 100% CPU
usage on ATI cards, 0% on NVIDIA .. I could send you samples, both source and binaries, but it's really too trivial ..

I've been poking around on Google; there's an article somewhere about the Radeon Linux drivers that says they do an active wait until vsync because of timing issues ..
I couldn't find anything about the situation on Windows though ..

I'm running my apps full screen, but there's no way of telling that to OpenGL, so unless the driver concludes it from the DC size, I guess the situation is the same windowed
or full screen ..  Maybe ATI always has triple buffering enabled, causing my app to generate too many frames, eventually resulting in a blocking call (if the command buffers
are full) ..

This problem has been puzzling me for a couple of weeks now; I just want my simple OpenGL applications not to clog up my CPU ..
 
LVL 17

Expert Comment

by:davebytes
It's amazing that the most trivial of cases can show the worst performance.. ;)

Both vendors have some sort of frame-ahead buffering of the rendering queue.  That's so when they start going really fast, they can keep processing things optimally.  It may be that you are rendering something so small, so rapidly, that the ATI driver is doing a 'hard wait', while the NV driver is doing a 'soft wait' -- i.e., the ATI driver does a spinloop of sorts, while the NV driver still has control of the process tree but is giving up cycles to other threads.  Or you have a call that is instigating the issue.
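
The 'hard wait' versus 'soft wait' distinction can be illustrated outside of any driver. The sketch below is an assumption about what such waits look like in general, not a description of either vendor's driver internals; `hard_wait_until` and `soft_wait_until` are hypothetical names.

```cpp
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// "Hard" wait: poll the clock in a tight loop. The thread stays runnable the
// whole time, so a CPU monitor reports one core as fully busy.
void hard_wait_until(Clock::time_point deadline) {
    while (Clock::now() < deadline) {
        // spin
    }
}

// "Soft" wait: block in the scheduler until the deadline, leaving the CPU
// free for other threads in the meantime.
void soft_wait_until(Clock::time_point deadline) {
    std::this_thread::sleep_until(deadline);
}
```

Both functions return at or after the deadline. The only difference is what the CPU does in the meantime, which matches the symptom in this thread: the same frame rate with very different CPU usage.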

I'd be more than happy to look at a small sample -- and if it's really doing something that seems incorrect, I can forward it to my friends at ATI and get a 'proper answer' for why you are getting these particular results.

d
 
LVL 2

Author Comment

by:enkimute
Good, I'll send it to you personally. Please drop me a mail at coding_nospamn_@mutefantasies.com. That would
be without the _nospamn_ of course .. That way I have your email, as I can't find it here anywhere ..
 
LVL 17

Expert Comment

by:davebytes
I'll try to remember to forward this to ATI at some point, but here's the basic answer.

The GPU isn't doing any work.  All you do every frame is clear the buffer, which is an async flush in the hardware.  The NV driver probably does some sort of sleep internally, sleeping your process, if you get too far ahead of them.  The ATI driver is apparently hard-waiting when you get too far ahead.

The moral of the story is that you are running like 100,000 FPS or something extremely high, and at that point the driver is actually getting bottlenecked on the pure speed of requests, and does a block wait at some point.  When you start to add in real rendering code (and yes, triple buffering yourself), you shouldn't see any such bottleneck -- you'll always be doing enough work either in GPU rendering, or app-side logic, that you'll be below say 300fps (or more realistic, on the average machine, below 60fps).

Try adding in some basic amount of rendering into the Draw function.  Grab another few NeHe samples, see what it looks like.

If with 'real world' rendering submittals you continue to see an issue, you've got my email: ping me, and I'll DEFINITELY make sure that ATI takes a look.  Include a DXDIAG dump, as they'll want to see that.

Hope that helps!

-d
www.chait.net
 
LVL 2

Author Comment

by:enkimute
Of course it goes wrong when I do draw something too; the point is that it _even_ goes wrong when you do nothing.
Furthermore, this app only runs free if you have vsync disabled. If vsync is enabled, it should run at exactly
the refresh rate, which it does, BUT .. it uses 100% CPU while waiting for the sync. So you call SwapBuffers,
and it not only locks the rendering thread (I'm fine with that), but it also does not allow _ANY_ other thread to execute,
as the driver is performing an active wait (resulting in 100% CPU usage .. no matter how little or how much you are
drawing ..).

But I'll send you a somewhat more extensive example that loads some models and does some rendering ...

really want to get to the bottom of this ...

grtz ..

 
LVL 17

Expert Comment

by:davebytes
okay.  I'll take a look and try pinging ATI.  It will be at 100% CPU if you are basically doing nothing but thousands of frames per second of data.  If you are running at a framerate LESS than the vsync, and have vsync enabled, and it waits in the driver, that's bad.  If you are running OVER the vsync rate, which means you are providing an updated frame before the current frame is potentially flipped, and you aren't triple buffered, it could stall awaiting the vsync (since it needs the backbuffer unlocked to render to it!).
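
To put rough numbers on the over-the-vsync-rate case, here is a small worked calculation. It assumes a simplified model that is not taken from either driver: a double-buffered app whose swap completes at the first vertical retrace after rendering finishes. The function names are hypothetical.

```cpp
#include <cmath>

// Time in milliseconds that a double-buffered, vsync'ed app waits in the swap
// each frame, assuming the swap completes at the first retrace after
// rendering ends (simplified model).
double vsync_wait_ms(double render_ms, double refresh_hz) {
    const double period_ms = 1000.0 / refresh_hz;
    const double periods_needed = std::ceil(render_ms / period_ms);
    const double frames = periods_needed < 1.0 ? 1.0 : periods_needed;
    return frames * period_ms - render_ms;
}

// Fraction of each displayed frame that is spent waiting rather than rendering.
double vsync_wait_fraction(double render_ms, double refresh_hz) {
    const double wait = vsync_wait_ms(render_ms, refresh_hz);
    return wait / (wait + render_ms);
}
```

Under this model, an app that renders in 2 ms on a 60 Hz display waits about 14.7 ms per frame, roughly 88% of the time. If that wait is a spin rather than a sleep, it shows up as near-100% CPU usage regardless of how little is drawn.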

Make sure to send me a DXDIAG too, just in case they want that.

-d
 
LVL 17

Accepted Solution

by:davebytes (earned 500 total points)
I did send a ping off to ATI -- haven't gotten a response back as yet.  With E3 this week, and the X800 just out, they could be backed up (and mine was caveated as a low-priority question!).

Will let you know if I hear anything.

d
 
LVL 2

Author Comment

by:enkimute
Thanks a bunch. I've been kind of very busy the last couple of days, but I'll try to wrap up a nice little demo this weekend, so that we have something to test from.
So you can expect that in your mailbox one of these days ..

enki