Solved

equivalent to SafeArrayAccessData in VB.NET?

Posted on 2011-09-21
Last Modified: 2012-05-12

Hi, I am trying to extract data from a proprietary mass spectrometer data file using a vendor DLL.  Following the vendor documentation, in both VB.NET and C++, our performance was exceedingly slow.  We reverse engineered an open source C++ reader of the same format that performs well, and found that it accesses the arrays slightly differently, including a call to the function SafeArrayAccessData, and it is about 1 million times faster.

The codebase that we are trying to access the data from is VB.NET, so we would like to apply an analogous change to our VB.NET code (first section below) to gain the same (mysterious to me) access and speed improvement in our reading of this data.  I know we can build a new C++ module (with some trouble), but the strongly preferred approach is to keep it in VB.NET.  Does anyone have any ideas/suggestions?

I am bracing myself because I imagine this could require marshalling, p/invokes, whatever -- those have always been black art to me.  But some of you out there are very good at it and might have some thoughts.

Thanks VERY much for any help.

(Note: each of the three snippets below does the exact same thing against the exact same DLL -- the only difference is the handling of arrays in C++)
*************************************************
Slow VB.NET code (takes 3 minutes for 4000 points)
*************************************************
(add DLL to references)

Imports DACSERVERLib

Dim sSpectrum As New DACSpectrum
sSpectrum.GetSpectrum(sRawFilePath, fnNum, proNum, scanCounter)

For n As Integer = 0 To sSpectrum.NumPeaks - 1
    mass = sSpectrum.Masses(n)
    intensity = sSpectrum.Intensities(n)
Next

*************************************************
Slow C++ code (same timeframe)
*************************************************

#import "C:\MassLynx\DACServer.DLL" no_namespace named_guids

void slow_read()
{
	IDACSpectrumPtr pSpectrum;
	pSpectrum = IDACSpectrumPtr(CLSID_DACSpectrum);

	CComBSTR bstrFileName("C:\\Users\\crice\\Desktop\\test_satin_001.raw");

	int nFunctionNumber = 1;
	int nProcessNumber = 0;
	int nScanNumber = 1;

	// populate the dacspectrum object
	int nRtn = pSpectrum->GetSpectrum(bstrFileName.Detach(), nFunctionNumber, nProcessNumber, nScanNumber);

	float mass;
	float intensity;
	long numpeaks = pSpectrum->NumPeaks;

	for (int nPeakCount=0; nPeakCount<numpeaks; nPeakCount++)
	{
		mass = ((float*)pSpectrum->Masses.parray->pvData)[nPeakCount];
		intensity = ((float*) pSpectrum->Intensities.parray->pvData)[nPeakCount];
	}
}


*************************************************
Fast C++ code (almost instantaneously goes through same data)
Identical, except for intermediate use of SafeArrayAccessData
To create special pointers
*************************************************

#import "C:\MassLynx\DACServer.DLL" no_namespace named_guids

void fast_read()
{
	IDACSpectrumPtr pSpectrum;
	pSpectrum = IDACSpectrumPtr(CLSID_DACSpectrum);

	CComBSTR bstrFileName("C:\\Users\\crice\\Desktop\\test_satin_001.raw");

	int nFunctionNumber = 1;
	int nProcessNumber = 0;
	int nScanNumber = 1;

	// populate the dacspectrum object
	int nRtn = pSpectrum->GetSpectrum(bstrFileName.Detach(), nFunctionNumber, nProcessNumber, nScanNumber);

	float mass;
	float intensity;
	long numpeaks = pSpectrum->NumPeaks;

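	VARIANT pfIntensities;
	VARIANT pfMasses;
	VariantInit(&pfIntensities);
	VariantInit(&pfMasses);
	// fetch each array once, outside the loop
	pSpectrum->get_Intensities(&pfIntensities);
	pSpectrum->get_Masses(&pfMasses);

	float HUGEP *intensityArrayPtr = NULL;
	float HUGEP *massArrayPtr = NULL;
	// lock safe arrays for access and get raw pointers to their data
	HRESULT hrIntensities = SafeArrayAccessData(pfIntensities.parray, (void HUGEP**)&intensityArrayPtr);
	HRESULT hrMasses = SafeArrayAccessData(pfMasses.parray, (void HUGEP**)&massArrayPtr);

	// only read through the pointers if both locks succeeded
	if (SUCCEEDED(hrIntensities) && SUCCEEDED(hrMasses)) {
		for (long c=0; c<numpeaks; c++) {
			mass = massArrayPtr[c];
			intensity = intensityArrayPtr[c];
		}
	}

	// unlock whatever was locked, then release the VARIANTs
	if (SUCCEEDED(hrIntensities)) SafeArrayUnaccessData(pfIntensities.parray);
	if (SUCCEEDED(hrMasses)) SafeArrayUnaccessData(pfMasses.parray);
	VariantClear(&pfIntensities);
	VariantClear(&pfMasses);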

}

*************************************************
Fast VB.NET code 
*************************************************

      ???????     
    ??:::::::??   
  ??:::::::::::?  
 ?:::::????:::::? 
 ?::::?    ?::::? 
 ?::::?     ?::::?
 ??????     ?::::?
           ?::::? 
          ?::::?  
         ?::::?   
        ?::::?    
       ?::::?     
       ?::::?     
       ??::??     
        ????      
                  
        ???       
       ??:??      
        ???

Question by:riceman0
12 Comments
 
LVL 40

Expert Comment

by:Jacques Bourgeois (James Burger)
There are 2 differences in the fast C++ code.

They changed the loop counter from an int to a long. On a 64-bit computer, that may help a little, but it would not explain "1 million times faster".

The second change is that they are also working with pointers. That is probably the big impact. Pointers enable you to speak directly with memory instead of going through intermediaries such as properties and collections.

Unfortunately, you need C++ to work directly with pointers. Because pointers are the biggest source of security problems in older applications, Microsoft made the good decision (as Java did long before them) to let go of pointers. Pointers are thus not available in managed code (.NET code).

One thing you might try in the VB code is a For Each loop instead of a loop with a counter. Depending on how DACSpectrum is implemented, For Each is sometimes faster. But you will never get the same speed improvement as they did in C++. No wonder C++ is still on the market: Java and C# use similar syntax, without the pointers, and are far easier to program, so otherwise everybody would have switched. For some very fast operations, you still need C++.
 

Author Comment

by:riceman0

Interesting.  In the slow C++ code it seemed to me to be working with pointers too:

pSpectrum->Masses.parray->pvData

The fast version just converts it to a "safe", "locked" pointer before using it, which in my naive model seemed to mean it didn't have to go "find" it every time.

And I thought in .NET you could work with pointers, it just involves some measure of "marshalling" etc.  I believe I have (with help from this site usually) managed to access unmanaged data by reference via COM servers, etc.

So I was hoping someone skilled in the black arts would have some trick to achieve the same effect as SafeArrayAccessData.  I hear you, that you're saying this is not possible.  I'll leave this open a bit to see if anyone knows a trick, or something to try.

I like the idea of For Each, but I don't see exactly how it's enumerable that way. I will play around.
 

Author Comment

by:riceman0

Actually, here's an interesting article that seems to use SafeArrayAccessData itself as an API call from .NET...

http://bugslasher.net/2011/02/12/how-to-create-and-use-a-custom-com-marshaler-in-net/

but I'm out of my depth with it.  It's not often I leave the .NET womb. Any expert opinions on the article vs. my problem?
 
LVL 40

Accepted Solution

by:
Jacques Bourgeois (James Burger) earned 250 total points
My C++ time is far away, at the beginning of the '90s, almost 20 years ago, so I might have lost the touch a bit.

But it looks to me that (float*) is a cast to a pointer. A cast is not exactly the same thing as a pointer (at least, it was not in Turbo C++). As I remember it, pSpectrum->Masses.parray->pvData was a pointer to a pointer to a pointer, not a direct route. And they cast on every iteration of the loop. In the fast one, they first get the pointer and then loop, reusing the same one.
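
To make the "get it once, then reuse it" point concrete, here is a minimal, self-contained C++ sketch. It does not use the vendor DLL or any COM call. FakeSpectrum, sum_slow and sum_fast are hypothetical names made up for illustration, and the sketch assumes (this is not confirmed for DACServer.DLL) that the array property hands back a fresh copy of the whole array every time it is called. Under that assumption, the slow loop pays for one full copy per element read, while the fast loop pays for one copy in total.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an object whose array property returns a
// fresh copy of the whole array on every call (an assumption, not the
// documented behavior of the vendor DLL).
class FakeSpectrum {
public:
    explicit FakeSpectrum(std::size_t n) : masses_(n, 1.0f) {}

    // Returned by value: every call copies all the elements.
    std::vector<float> Masses() const {
        ++copies_;
        return masses_;
    }

    std::size_t NumPeaks() const { return masses_.size(); }

    // Number of full-array copies made so far.
    std::size_t copies() const { return copies_; }

private:
    std::vector<float> masses_;
    mutable std::size_t copies_ = 0;
};

// Slow pattern: the property is called inside the loop, so the whole
// array is copied once per element read.
float sum_slow(const FakeSpectrum& s) {
    float total = 0.0f;
    for (std::size_t i = 0; i < s.NumPeaks(); ++i) {
        total += s.Masses()[i];
    }
    return total;
}

// Fast pattern: fetch the array once, take a raw pointer to its data,
// and index through that pointer inside the loop.
float sum_fast(const FakeSpectrum& s) {
    const std::vector<float> masses = s.Masses();
    const float* p = masses.data();
    float total = 0.0f;
    for (std::size_t i = 0; i < masses.size(); ++i) {
        total += p[i];
    }
    return total;
}
```

With 4000 peaks, sum_slow makes 4000 full copies and sum_fast makes one. If the real property behaves this way, the same hoisting (read the property once into a local array before the loop) may be worth trying from VB.NET before reaching for pointers at all.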

In .NET, the "marshalling" is there to let you work with COM stuff. And the "measure of marshalling" can be involved. For instance, dates and strings need to be converted because they are not in the same format in COM and .NET. And although it might let you "play" with pointers, you are doing it indirectly, so you are losing the performance that real pointers give you. Internally, the term "managed code" says it all: you cannot work with pointers. They are managed by the framework.
 
LVL 85

Expert Comment

by:Mike Tomlinson
I don't work this low level so I can't offer up any code...but C# does allow "unsafe" portions of code that do allow direct manipulation of pointers (not references).  If you can find C# code that does what you want then it can be placed into a DLL and used from a VB.Net application.
 

Author Comment

by:riceman0
IdleMind, my belief was that VB.NET supports unsafe portions of code as well -- are you saying that C# has more capability than VB in this regard?
 
LVL 85

Expert Comment

by:Mike Tomlinson
Yes...this is one of the few occasions where C# does actually have a capability that VB.Net does not.

To be clear, VB.Net does not support unsafe blocks.
 
LVL 40

Expert Comment

by:Jacques Bourgeois (James Burger)
I confirm Idle Mind's statement. But I would like to point out that the name says it all: UNSAFE.
 
LVL 40

Expert Comment

by:Jacques Bourgeois (James Burger)
And should I add that this makes VB a better language than C# ;-)
 

Author Comment

by:riceman0
Idlemind, do you have any ideas, keywords, or pointers as to how to duplicate the "SafeArrayAccessData" effect in C#?  It'd be much easier for me to incorporate a C# module into my project than a C++ module.

Thanks.
 
LVL 85

Assisted Solution

by:Mike Tomlinson
Mike Tomlinson earned 250 total points
I wish I had some code for you...sorry!  =\
 

Author Closing Comment

by:riceman0
I think this has run its course.  I'm going to open a new question, formulated differently.  Perhaps broken up into smaller questions.