Solved

CreateDIBSection is very slow - how to change "Hardware Acceleration"

Posted on 2006-07-05
Last Modified: 2013-11-20
OK, I found that if I set "Hardware Acceleration" to "None", CreateDIBSection works much faster
(40ms vs 400ms).
But I found that it causes games to utilize much more CPU (in some cases).
Right now I change "Hardware Acceleration" through the registry (and I need to restart the computer).

Is there any API call with which I can set "Hardware Acceleration" to None before I call CreateDIBSection, and then (after the call) set it back to Full?
Question by:VapiSoft
33 Comments
 
LVL 22

Expert Comment

by:mahesh1402
ID: 17048397
>>Is there any API call that I can set the "Hardware Acceleration" to None

I think there's no specific API for this, and that the Display control panel applet just modifies the registry and/or SYSTEM.INI.  I'd also assume that exactly what it modifies varies between operating system versions.

>>Now I change the "Hardware Acceleration" from the registry

Whatever you changed in the registry manually, you can also change programmatically, for example with an MFC class such as CRegKey. But as you said, you need to restart the machine, which means that changing this key's value (manually or programmatically) does not instantly change the computer's hardware acceleration level, so I think it's not useful for your purpose. Another way would be a hack that adjusts the hardware acceleration slider programmatically, but I'm not sure about this either.

The closest thing I found uses DDK functions: EngQueryDeviceAttribute instructs the video driver to look up the registry key's value for the hardware acceleration level, and DrvNotify instructs the video driver to set the hardware acceleration level to that value. BUT I am not sure of this.
As given here http://www.osronline.com/DDKx/graphics/dpyddi_2sh3.htm <== EngQueryDeviceAttribute queries the current acceleration level and DrvNotify changes the acceleration level.

But it seems there is no direct API to implement this.

-MAHESH
 

Author Comment

by:VapiSoft
ID: 17048466
Hi MAHESH,

Sorry, but as I understand it, DrvNotify does not set QDA_ACCELERATION_LEVEL; it only notifies when there is a change.
I also did not understand the parameters of EngQueryDeviceAttribute.
It tells me to get hdev from DrvCompletePDEV, but there it is an IN parameter.
The only thing I have is an hDC. Is it the same as hdev?

In any case, I don't understand how other apps work with CreateDIBSection if it takes so much time?

 
LVL 22

Expert Comment

by:mahesh1402
ID: 17048480
>>CreateDIBSection is very slow

I suggest you look at the 'DrawDib' family of functions as an alternative. That's what the AVI playing engine uses.

Refer to:
http://windowssdk.msdn.microsoft.com/en-us/library/ms708083.aspx
http://windowssdk.msdn.microsoft.com/en-us/library/ms708163.aspx

In particular, the DrawDibDraw() function is faster.

-MAHESH
 
LVL 22

Expert Comment

by:mahesh1402
ID: 17048527
Also NOTE: suppose, for example, that your bitmap is 8-bpp and the screen is 24-bpp. For every pixel, GDI or the driver needs to do a table lookup, which is not as fast as a memory copy. So always try to give your bitmap the same format as the current display for better performance.

-MAHESH
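
To make the lookup-versus-copy point concrete, here is a minimal sketch in portable C++ (it is not GDI code). The function names `copySameFormat` and `expandPaletted` and the 256-entry, 32-bit palette layout are assumptions made up for this illustration; what a real driver does internally may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Same pixel format on both sides: the data moves with one block copy.
void copySameFormat(const std::vector<std::uint32_t>& src,
                    std::vector<std::uint32_t>& dst) {
    dst.resize(src.size());
    if (!src.empty())
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(std::uint32_t));
}

// 8-bpp source, true-color destination: every pixel goes through the palette.
void expandPaletted(const std::vector<std::uint8_t>& src,
                    const std::uint32_t (&palette)[256],
                    std::vector<std::uint32_t>& dst) {
    dst.resize(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = palette[src[i]];   // one table lookup per pixel
}
```

The second function touches the palette once per pixel, which is the extra work described above; the first can be handed to a single optimized copy.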


 

Author Comment

by:VapiSoft
ID: 17048592
The problem is that I need to get the DIB from the screen (I am using GetDIBits).
I tried to do it without CreateDIBSection, and then GetDIBits took all the time.
This is a catch-22.
 

Author Comment

by:VapiSoft
ID: 17048607
For two reasons (disk space and comparison) I need it in 8 bits, but when "Hardware Acceleration" is None it takes about 1/10 of the time (about 40ms for the full screen), which is almost bearable.
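
For a sense of the disk-space side of that trade-off, the sketch below computes the pixel-data size of an uncompressed DIB, whose scanlines are padded to a multiple of four bytes. The helper names `dibStride` and `dibPixelBytes` are hypothetical; the 1024 x 738 figures used in the note are the window size given later in this thread.

```cpp
#include <cstdint>

// Bytes per scanline of an uncompressed DIB: rows are padded to 4-byte boundaries.
constexpr std::uint32_t dibStride(std::uint32_t width, std::uint32_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31u) / 32u) * 4u;
}

// Bytes of pixel data only (headers and any color table are not included).
constexpr std::uint32_t dibPixelBytes(std::uint32_t width, std::uint32_t height,
                                      std::uint32_t bitsPerPixel) {
    return dibStride(width, bitsPerPixel) * height;
}
```

For a 1024 x 738 capture this gives 755,712 bytes at 8 bpp and 3,022,848 bytes at 32 bpp, a factor of four before any compression. An 8-bpp DIB additionally carries a color table of up to 256 four-byte entries.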
 
LVL 22

Expert Comment

by:mahesh1402
ID: 17048620
It seems no API is available to set Hardware Acceleration to None. As I said, you may look at alternatives such as the DrawDib family of functions above, or otherwise DirectX/DirectDraw.

-MAHESH
 
LVL 22

Expert Comment

by:mahesh1402
ID: 17056370
Have a look at the example code links I gave on your other question:
http://www.experts-exchange.com/Programming/Programming_Languages/MFC/Q_21910717.html

-MAHESH
 

Author Comment

by:VapiSoft
ID: 17056745
OK, I am looking at them.
I also found out that the CPU time is "wasted" in the BitBlt; see the following.

HDC memDC = CreateCompatibleDC(hdc);
HBITMAP hbmp = getDIBS(hdc, wr.cx, wr.cy);   // Here I do CreateDIBSection
DWORD t3 = GetTickCount();
HBITMAP oldBmp = (HBITMAP) SelectObject(memDC, hbmp);
DWORD t31 = GetTickCount();

// ==================
BitBlt(memDC, 0, 0, wr.cx, wr.cy, hdc, x_offset, y_offset, SRCCOPY);   // This takes 800 ms
// ==================
DWORD t32 = GetTickCount();
SelectObject(memDC, oldBmp);   // restore the bitmap that was selected before
DWORD t4 = GetTickCount();
 
LVL 22

Expert Comment

by:mahesh1402
ID: 17056790
If you really need it that much faster, use DirectX as suggested. You can get a pointer directly into video memory and manage the bits yourself.

-MAHESH
 
LVL 49

Expert Comment

by:DanRollins
ID: 17086569
Offhand, I don't know why the video acceleration setting should make a difference...

However, it seems to me that the problem boils down to the fact that at some point in the sequence, you (or some process) need to convert 24- or 32-bpp data to 8-bpp data.  That takes a lot of CPU... to build a palette that minimizes color artifacts and color loss.  It all takes place in one call, whether it is a BitBlt to an 8-bpp target bitmap or a GetDIBits call.  As the documentation for that function says...

    >> If the requested format for the DIB matches its internal format, the RGB values for
    >> the bitmap are copied. If the requested format doesn't match the internal format, a
    >> color table is synthesized.

A direct copy of the 32-bit data is certain to be 10 times faster than a GetDIBits call that converts from true color to 8-bpp palettized colors.

I'll bet that your best (fastest) bet would be to stay with 32-bit colors at every step of the way.  If you are doing something like transferring whole screens (as with PC-Anywhere) then you can identify the changed subset of the screen and then compress the data as the final step before transporting it.

-- Dan
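
The "identify the changed subset" step can be sketched as follows. This is a brute-force comparison in portable C++ under stated assumptions: both frames are same-sized 32-bpp buffers stored row by row, and the names `DirtyRect` and `findDirtyRect` are invented for this example. It is not how any particular remote-control product does it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct DirtyRect {
    int left, top, right, bottom;   // right and bottom are exclusive
    bool empty() const { return left >= right || top >= bottom; }
};

// Compare two same-sized 32-bpp frames and return the bounding box
// of every pixel that differs between them.
DirtyRect findDirtyRect(const std::vector<std::uint32_t>& prev,
                        const std::vector<std::uint32_t>& curr,
                        int width, int height) {
    DirtyRect r = { width, height, 0, 0 };   // starts out empty
    for (int y = 0; y < height; ++y) {
        const std::size_t row = static_cast<std::size_t>(y) * width;
        for (int x = 0; x < width; ++x) {
            if (prev[row + x] != curr[row + x]) {
                r.left   = std::min(r.left, x);
                r.top    = std::min(r.top, y);
                r.right  = std::max(r.right, x + 1);
                r.bottom = std::max(r.bottom, y + 1);
            }
        }
    }
    return r;
}
```

Only the pixels inside the returned rectangle would then need to be compressed and sent; an empty rectangle means the frame can be skipped entirely.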
 

Author Comment

by:VapiSoft
ID: 17088075
That is not the problem: if you look at the code, no color conversion happens when I do the BitBlt. Only after that, in GetDIBits, do I do the conversion.
 
LVL 49

Expert Comment

by:DanRollins
ID: 17092533
As I pointed out, BitBlt does the conversion "behind the scenes" so to speak.  It is a time-consuming process.

What options are you using in your calls to CreateDIBSection?
 

Author Comment

by:VapiSoft
ID: 17093267
Currently I am not using CreateDIBSection.

HDC hdc=GetWindowDC(hwnd);
HDC memDC =CreateCompatibleDC(hdc);
HBITMAP hbmp=CreateCompatibleBitmap(hdc, wr.cx, wr.cy);  // previously: getDIBS(hdc,wr.cx, wr.cy);
DWORD t3=GetTickCount();
HBITMAP oldBmp=(HBITMAP) SelectObject(memDC,hbmp);
DWORD t31=GetTickCount();
BitBlt(memDC,0,0, wr.cx, wr.cy, hdc, x_offset, y_offset, SRCCOPY); // <== This is where it takes 400ms
DWORD t32=GetTickCount();
DWORD t4=GetTickCount();

msg->hDIB=getPictureHandle(hbmp,memDC,wr.cy,wr.cx,msg->pixels); // <== Here I do the GetDIBits

DWORD t5=GetTickCount();


Before that I used the CreateDIBSection in getDIBS, but it did not change anything.

HBITMAP getDIBS(HDC hdc, int w2, int h2)
{
      BITMAPINFO bi;
      memset(&bi, 0, sizeof(bi));               // zero the header fields that are not set below

      bi.bmiHeader.biSize            = sizeof(BITMAPINFOHEADER);
      bi.bmiHeader.biWidth           = w2;
      bi.bmiHeader.biHeight          = h2;      // positive height = bottom-up DIB
      bi.bmiHeader.biPlanes          = 1;
      bi.bmiHeader.biBitCount        = 32;      // Bitmap.bmBitsPixel;
      bi.bmiHeader.biCompression     = BI_RGB;  // uncompressed (BI_RGB == 0)

      void *start;                              // receives the address of the pixel data
      return CreateDIBSection(hdc, &bi, DIB_RGB_COLORS, &start, 0, 0);
}


 
LVL 49

Expert Comment

by:DanRollins
ID: 17096152
Here is my test run:

void CD22Dlg::OnButton1()
{
      CRect wr(0,0, 1000,1000 );
      int x_offset= 0;
      int y_offset= 0;
      
      HDC     hdc=   ::GetWindowDC( 0 );   // DC for the whole screen
      HDC     memDC= CreateCompatibleDC(hdc);
      HBITMAP hbmp=   CreateCompatibleBitmap( hdc, wr.Width(), wr.Height() );
      HBITMAP oldBmp=(HBITMAP) SelectObject(memDC,hbmp);

  DWORD   t30=GetTickCount();
      BOOL    fOK= BitBlt(memDC,0,0, wr.Width(), wr.Height(), hdc, x_offset, y_offset, SRCCOPY); // the call reported as taking 400ms
  DWORD   t31= GetTickCount();

      DWORD nTicks30= t31-t30;

      SelectObject( memDC, oldBmp );  // GetDIBits expects hbmp not to be selected into a DC

      BITMAP rBmp;
      GetObject( hbmp, sizeof(BITMAP), &rBmp );

      BITMAPINFO bi;
      memset( &bi, 0, sizeof(BITMAPINFOHEADER) );  
      bi.bmiHeader.biSize= sizeof(BITMAPINFOHEADER); // 40
      bi.bmiHeader.biWidth= rBmp.bmWidth;            // 1000
      bi.bmiHeader.biHeight= rBmp.bmHeight;          // 1000
      bi.bmiHeader.biPlanes= rBmp.bmPlanes;          // 1
      bi.bmiHeader.biBitCount= rBmp.bmBitsPixel;     // 32

      BYTE* pBuf= new BYTE[ 1000*1000*4 ];
  DWORD t40=GetTickCount();
      int n= GetDIBits( memDC, hbmp, 0, 1000, pBuf, &bi, DIB_RGB_COLORS );
  DWORD t41=GetTickCount();

      DWORD nTicks40= t41-t40;

      BYTE* pBits= 0;

  DWORD t50=GetTickCount();
      HBITMAP hbm2= CreateDIBSection( hdc, &bi, DIB_RGB_COLORS, (void**)&pBits, 0,0 );
  DWORD t51=GetTickCount();

      DWORD nTicks50= t51-t50;

      // Release everything the test allocated
      delete[] pBuf;
      DeleteObject( hbm2 );
      DeleteObject( hbmp );
      DeleteDC( memDC );
      ::ReleaseDC( 0, hdc );
}

==-=-=-=-=-=-=-=-=-
Both nTicks30 (BitBlt to a memDC) and nTicks50 (CreateDIBSection) come out as 0 -- indicating less time than the GetTickCount timer resolution (I think that is about 15ms).

nTicks40 (GetDIBits) took 95ms.
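
Because a reading of 0 only says "less than one timer tick", one way to get a usable number for a fast call is to repeat it many times and divide. The sketch below does that with `std::chrono::steady_clock`, which is a substitution for GetTickCount chosen so the example stays portable; the helper name `averageMs` is hypothetical. On Windows, QueryPerformanceCounter is the native high-resolution counter.

```cpp
#include <chrono>

// Average wall-clock milliseconds per call of `work`, measured over `iterations` calls.
template <typename Work>
double averageMs(Work&& work, int iterations) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < iterations; ++i)
        work();
    const std::chrono::duration<double, std::milli> total = Clock::now() - start;
    return iterations > 0 ? total.count() / iterations : 0.0;
}
```

Wrapping the BitBlt or CreateDIBSection call in a lambda and running it, say, 50 times would separate "about 1ms" from "about 14ms", which a single GetTickCount difference cannot do.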

==-=-=-=-=-=-=-=-=-
You could get much larger values if the bitmaps are huge and/or if you are grabbing a rectangle that does not start on a multiple of four.  What are your values for wr.cx, wr.cy, x_offset and y_offset?

-- Dan
 
LVL 49

Expert Comment

by:DanRollins
ID: 17096179
The above times were with hardware acceleration set to FULL.

When I set the value down to "NONE", nTicks40 (GetDIBits) went DOWN to 15ms, and nTicks30 (BitBlt to a memDC) went UP to 15ms.

-- Dan
 

Author Comment

by:VapiSoft
ID: 17097787
Hi Dan,

As you can see, the first part is almost exactly like my code (except that I used GetWindowDC(hwnd) and you used 0).
Therefore, as I expected, it still has the exact same problem.
The BitBlt takes about 350-400ms.

I checked it on other computers and it is always the same (when Hardware Acceleration is set to Full).
 
LVL 49

Expert Comment

by:DanRollins
ID: 17103437
It's interesting that I actually saw the opposite effect.  Turning hardware acceleration to NONE INCREASED the time.  Such discrepancies are obviously related to the device driver and one would expect them to be machine-dependent (i.e., there may be nothing you can do about it other than trying very hard to optimize -- avoid accesses that are not needed).

I'll go ahead and ask these again...

   Are your  bitmaps huge?  
   Are you grabbing a rectangle that does not start on a multiple of four?
   What are your values for wr.cx, wr.cy, x_offset and y_offset?
 

Author Comment

by:VapiSoft
ID: 17103607
Hi Dan,

No, I am just trying to capture a window about the size of the desktop:
wr.cx=1024
wr.cy=738
x_offset=4
y_offset=4

You tell me that for you it works fast.
But for me it works very, very slowly. So now I am starting to think that maybe something is wrong with my libraries or the compiler settings.


 
LVL 49

Expert Comment

by:DanRollins
ID: 17103993
Most drivers are optimized for BitBlits that move the entire bitmap.

Just as a test, see if you notice any difference when x_offset and y_offset are 0.
 

Author Comment

by:VapiSoft
ID: 17104228
I tried it, it doesn't make any difference.
 
LVL 49

Expert Comment

by:DanRollins
ID: 17104614
It does make one wonder how programs like VNC and PC Anywhere do it so quickly.  My bet is that they integrate into the device drivers to know what parts of the screen are changing ... as they change.  Thus, rather than taking snapshot after snapshot and finding differences, they know that they can confine themselves to certain (often small) areas of the screen to get all the work done about 10 times per second.

I believe that the source code for various versions of VNC is freely available, for instance here:
   http://www.koders.com/info.aspx?c=ProjectInfo&pid=AC8QNT72FM4FVWVYFKGLQ6G1LG
... just in case you want to pursue that option; that is, if you want to see how these same problems have been solved by others in the past.

-- Dan
 
LVL 49

Expert Comment

by:DanRollins
ID: 17104673
In fact, after looking at the source, it is clear that they install a number of Windows Hooks and keep track of messages such as WM_PAINT to stay informed about which parts of the screen have changed.
   http://www.koders.com/c++/fid8BC8E6705291CC1F8A85F59A94C9AA0E32BA4B79.aspx

They also use a "smart" communications protocol that lets the client do a lot of the work -- based on short data packets that describe screen changes (rather than brute-force reproducing the entire screen).
 

Author Comment

by:VapiSoft
ID: 17106031
Hi Dan,

Knowing which part of the screen has changed is not my problem. I also install hooks (although I now understand that this is a problem in Vista). I will look at the source to see how they read the screen.
 

Author Comment

by:VapiSoft
ID: 17106254
I checked the code (vncDesktop.cpp); they do exactly what I do.
So again I am confused: why the hell is my code so slow???
 

Author Comment

by:VapiSoft
ID: 17106531
I wanted to see if they have the same problem, so I downloaded the Setup file.
It installs two exe files: WinVNC.exe, which is the server, and vncviewer.exe.
I did not understand how to install the client or, in general, how to work with this.
So I could not see whether they have the same problem (slowness).
Do you understand how to work with it?

 
LVL 49

Expert Comment

by:DanRollins
ID: 17115385
I haven't used VNC.  It seems like setup ought to be fairly straightforward and would be covered in the docs.  Here's a FAQ:
   http://faq.gotomyvnc.com/fom-serve/cache/1.html

This one will be of interest to you:
   Is VNC always this slow?
   http://faq.gotomyvnc.com/fom-serve/cache/58.html
   ... it is necessary that you completely disable "Hardware Acceleration" on the machines that run WinVNC (server). ...
 
LVL 49

Expert Comment

by:DanRollins
ID: 17115413
Also, the Google Groups search:
  http://groups.google.com/groups?lnk=hpsg&hl=en&q=VNC+%22Hardware+Acceleration%22
turns up this
   Hardware Acceleration disable in code?
   http://groups.google.com/group/microsoft.public.win32.programmer.gdi/browse_frm/thread/dcdad404e2283329
and a number of other threads that are relevant.  In one...

   The reason it speeds things up for VNC is that it goes through the
   normal GDI functions to blit stuff to the screen.  VNC hooks into this
   code to tell which portions of the screen have been updated.  When
   Windows uses acceleration to draw the desktop, it bypasses this library
   and writes directly to the video card.

   VNC can still detect these changes by polling the video card RAM for
   changes to the screen, but this is slow since it has to go across the
   PCI/AGP bus
 

Author Comment

by:VapiSoft
ID: 17115443
I did not understand the last part, but I think that you are wrong about why it is slow.
What I saw is that it uses BitBlt as I do, and this is very slow when Hardware Acceleration is on.
But I know that there are other programs, like PC-Anywhere, that work OK.
The question is how they overcome the problem.
 
LVL 49

Accepted Solution

by:DanRollins (earned 500 total points)
ID: 17115815
>> I think that you are wrong about why it is slow.

It is slow because your device driver's implementation of bitblt is slow.  I believe that PC Anywhere installs a special device driver that allows it to avoid most of the time-consuming access of video memory.

One way to test that:  
Create two memDCs and time how long it takes to blit between them and compare that to the time it takes to blit the same amount of data from the real screen to a memDC.
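
A rough sketch of that test follows. The names `BlitTimings` and `measureBlits` are made up for this example, the Win32 part only compiles on Windows (linking against user32 and gdi32), and error handling is left out, so treat it as a starting point under those assumptions rather than a finished benchmark.

```cpp
// Result of the comparison; both values are milliseconds per blit.
struct BlitTimings {
    double screenToMemMs;
    double memToMemMs;
    // How many times slower the screen read is (0 if the memory blit was too fast to measure).
    double slowdown() const {
        return memToMemMs > 0.0 ? screenToMemMs / memToMemMs : 0.0;
    }
};

#ifdef _WIN32
#include <windows.h>

// Times `repeats` blits from the screen to a memory DC, then the same number
// of blits of the same size between two memory DCs.
BlitTimings measureBlits(int w, int h, int repeats) {
    if (repeats < 1) repeats = 1;

    HDC screen = GetDC(NULL);
    HDC dcA = CreateCompatibleDC(screen);
    HDC dcB = CreateCompatibleDC(screen);
    HBITMAP bmA = CreateCompatibleBitmap(screen, w, h);
    HBITMAP bmB = CreateCompatibleBitmap(screen, w, h);
    HGDIOBJ oldA = SelectObject(dcA, bmA);
    HGDIOBJ oldB = SelectObject(dcB, bmB);

    LARGE_INTEGER freq, t0, t1, t2;
    QueryPerformanceFrequency(&freq);

    QueryPerformanceCounter(&t0);
    for (int i = 0; i < repeats; ++i)
        BitBlt(dcA, 0, 0, w, h, screen, 0, 0, SRCCOPY);   // screen -> memory DC
    GdiFlush();                                            // let batched GDI calls finish
    QueryPerformanceCounter(&t1);
    for (int i = 0; i < repeats; ++i)
        BitBlt(dcB, 0, 0, w, h, dcA, 0, 0, SRCCOPY);       // memory DC -> memory DC
    GdiFlush();
    QueryPerformanceCounter(&t2);

    // Restore and free all GDI objects.
    SelectObject(dcA, oldA);
    SelectObject(dcB, oldB);
    DeleteObject(bmA);
    DeleteObject(bmB);
    DeleteDC(dcA);
    DeleteDC(dcB);
    ReleaseDC(NULL, screen);

    const double ticksPerMs = static_cast<double>(freq.QuadPart) / 1000.0;
    BlitTimings out;
    out.screenToMemMs = (t1.QuadPart - t0.QuadPart) / ticksPerMs / repeats;
    out.memToMemMs    = (t2.QuadPart - t1.QuadPart) / ticksPerMs / repeats;
    return out;
}
#endif
```

A large `slowdown()` value would point at the read from the screen, rather than at BitBlt as such, as the expensive part.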
 

Author Comment

by:VapiSoft
ID: 17115840
I checked it, it takes 0ms to do it.
From what I understood replacing the Display device driver is very complicated and very risky.
 
LVL 49

Expert Comment

by:DanRollins
ID: 17119424
Yes.  Plus you would need to write a video device driver, which is complicated.