Solved

CharSet=CharSet.Ansi SLOW during batch processing (with interop functionality) of 15,000 records

Posted on 2004-04-14
Last Modified: 2012-06-21
In my Windows app, I'm running a batch process consisting of a for loop that runs 15,000 times. Each iteration copies 3 fields of data from a row into a struct and passes the struct to an external method (using DllImport).
The main problem is that the StructLayout attribute of this struct (which is also a parameter to the method) is set to [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)], and with this setting the process takes over 25 minutes. It does, however, let the external function write correctly to the other fields of the struct.

If we change it to CharSet.Unicode, the process takes less than 1 minute, but it doesn't seem to write anything to the struct. After I run the process, the struct fields contain strange characters.
Is there some way I can keep using Unicode to keep the speed? I need to be able to read the data in the struct after the dll call.

Thank you.
0
Question by:MyersA
14 Comments
 
LVL 37

Expert Comment

by:gregoryyoung
ID: 10825436
How many characters are you talking about here? There is no reason why changing the character set should add 24 minutes. Is it possible that the DLL is actually hitting an error and returning early (thus skipping some of its processing), which is causing the speed gain?
0
 
LVL 2

Author Comment

by:MyersA
ID: 10825901
The DLL method returns an error code when it runs, and in both cases (ANSI or Unicode) it returns 0 (Success). The struct itself holds many characters (several fields of 51 chars, several other fields of 30 chars, etc.), so in total it should be about 750 chars.
Basically, the process is to fill three fields (x, y, z) of the struct with string data (a PO Box address, the city, and the state) and call the DLL method with the struct as the parameter. The DLL call reads the fields I filled previously, does a search, and modifies the struct with the information it found (additional information about that address). The batch consists of doing this 15,000 times.
I've been told that if the unmanaged call (the DLL was written in C) expects ANSI but receives Unicode, it'll mess up.
But the previous version of this application (which also used this DLL), written in MS C/C++, processed this whole batch in under 4 minutes. That tells me that this StructLayout is adding way too much overhead. Is there any other alternative?

Vaughn
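
The ANSI/Unicode mismatch described above can be made visible by comparing the unmanaged sizes of the same layout under both character sets. This is a minimal sketch, not the real ZIP4_PARM: AnsiRecord, UnicodeRecord and LayoutDemo are hypothetical names, and only two of the fields are reproduced. With CharSet.Unicode every ByValTStr character takes two bytes, so a C function that expects one-byte chars would find every field at the wrong offset.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical two-field records; the real ZIP4_PARM has 60+ fields.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public class AnsiRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 3)]
    public string istai;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public class UnicodeRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 3)]
    public string istai;
}

public static class LayoutDemo
{
    // Size in bytes of the unmanaged image the marshaller builds for each layout.
    public static int AnsiSize() { return Marshal.SizeOf(typeof(AnsiRecord)); }
    public static int UnicodeSize() { return Marshal.SizeOf(typeof(UnicodeRecord)); }

    public static void Main()
    {
        Console.WriteLine("ANSI layout:    {0} bytes", AnsiSize());
        Console.WriteLine("Unicode layout: {0} bytes", UnicodeSize());
    }
}
```

If the two sizes differ, the DLL and the marshaller disagree about where each field starts, which would be consistent with the unreadable values seen under CharSet.Unicode.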
0
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10826151
What platform is it running on? WinNT, Win2000, etc. use Unicode internally, so strings must be converted to ANSI and back if you bind to the ANSI function.
0
 
LVL 2

Author Comment

by:MyersA
ID: 10827730
My development machine is Windows XP. We'll be running it on all types of Windows platforms (95, 98, 2000, XP, etc.).
After I create the executable, does it matter what platform I run the application on? For example, if I use Ansi, can I still run it on a Win95 or Win98 machine?
Also, the previous version of this application (which also used this DLL), written in MS C/C++, processed this whole batch in under 4 minutes. Is it possible that changing the CharSet to ANSI can add over 30 minutes of overhead?

Vaughn
0
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10831005
You can use CharSet.Auto, which uses ANSI on 95/98/Me and Unicode on WinNT/2000/XP/2003.
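
A quick way to see which character width the platform default corresponds to on a given machine is to print Marshal.SystemDefaultCharSize. This is a small sketch; AutoCharSetDemo is a hypothetical name, and the assumption that this value matches what CharSet.Auto selects on the target machine should be checked there.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AutoCharSetDemo
{
    // Default character size reported by the runtime for this system:
    // 2 on Unicode systems, 1 on ANSI systems.
    public static int AutoCharSize()
    {
        return Marshal.SystemDefaultCharSize;
    }

    public static void Main()
    {
        Console.WriteLine("Default character size here: {0} byte(s)", AutoCharSize());
    }
}
```

A result of 2 would mean CharSet.Auto behaves like CharSet.Unicode on that machine, so it cannot help with a DLL that only understands one-byte characters.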
0
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10831015
Depending on how much data needs to be converted, it can be quite an overhead, but 30 minutes seems like a lot. Anyway, if it's not a lot of work, check whether there is a difference. Make sure to compare release builds without debug hooks! Debugging can slow an application down by a large factor.
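
One way to check how much of the time is conversion overhead is to time the marshalling step on its own, without calling the DLL. The sketch below is an assumption-laden stand-in: TimedRecord is a hypothetical three-field version of ZIP4_PARM, MarshalTiming is an illustrative name, and the sample values are made up. It copies the record to unmanaged memory and back in a loop, which roughly mirrors the work an [In,Out] call does around each DLL invocation.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical stand-in for ZIP4_PARM with only three fields.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public class TimedRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string ictyi;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 3)]
    public string istai;
}

public static class MarshalTiming
{
    // Marshals the record out and back `iterations` times.
    // Returns elapsed milliseconds; `result` is the last record read back.
    public static long RoundTrips(int iterations, out TimedRecord result)
    {
        TimedRecord rec = new TimedRecord();
        rec.iadl1 = "PO BOX 123";   // made-up sample data
        rec.ictyi = "SPRINGFIELD";
        rec.istai = "IL";

        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(TimedRecord)));
        try
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                Marshal.StructureToPtr(rec, buffer, false);   // managed -> unmanaged
                rec = (TimedRecord)Marshal.PtrToStructure(buffer, typeof(TimedRecord)); // and back
            }
            sw.Stop();
            result = rec;
            return sw.ElapsedMilliseconds;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    public static void Main()
    {
        TimedRecord last;
        long ms = RoundTrips(15000, out last);
        Console.WriteLine("15,000 round trips took {0} ms", ms);
    }
}
```

Run this from a release build outside the debugger. If the number is small compared with the total batch time, the slowdown is probably somewhere other than the character-set conversion.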
0
 
LVL 2

Author Comment

by:MyersA
ID: 10834682
I switched to CharSet.Auto, but it works like Unicode (very fast processing, but the values returned are unreadable). So I had to leave it at Ansi. Is that possible?

How can I compare the release builds without debug hooks? I've run the release build (by that, I mean the exe in the Release folder) on other PCs and it runs equally slowly.

thanks.
0
 
LVL 2

Author Comment

by:MyersA
ID: 10834950
I also noticed that towards the end of the 15,000 records (at 12,000+) it starts getting really slow. The CPU usage for the process is at 90%, and the memory usage is high too.

Vaughn
0
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10835452
Can you provide some code? It seems that there is more than one problem. What kind of component are you using (via DllImport)? Do you have its source code? If yes, can you translate it to C#? If not, you should maybe use C++, which allows you to use ANSI (in C# every string is Unicode), or use byte arrays instead of strings.
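
The byte-array alternative could look roughly like the sketch below. It is illustrative only: ByteRecord is a hypothetical two-field cut of ZIP4_PARM, ByteRecordDemo and ReadField are made-up names, and it assumes the C side stores NUL-terminated or NUL-padded one-byte characters in fixed-size arrays.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical two-field version of ZIP4_PARM using inline byte arrays
// instead of strings, so no character-set conversion is done by the marshaller.
[StructLayout(LayoutKind.Sequential)]
public class ByteRecord
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 51)]
    public byte[] iadl1 = new byte[51];
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] istai = new byte[3];
}

public static class ByteRecordDemo
{
    // Decodes a fixed-size field up to the first NUL byte (assumed terminator).
    public static string ReadField(byte[] field)
    {
        int end = Array.IndexOf(field, (byte)0);
        if (end < 0) end = field.Length;
        return Encoding.ASCII.GetString(field, 0, end);
    }

    // Unmanaged size of the record, one byte per array element.
    public static int NativeSize()
    {
        return Marshal.SizeOf(typeof(ByteRecord));
    }

    public static void Main()
    {
        ByteRecord rec = new ByteRecord();
        string address = "PO BOX 123";   // made-up sample data
        Encoding.ASCII.GetBytes(address, 0, address.Length, rec.iadl1, 0);
        Console.WriteLine("iadl1 = \"{0}\", native size = {1} bytes",
                          ReadField(rec.iadl1), NativeSize());
    }
}
```

Strings are then decoded only for the fields that are actually read after the call, rather than for every field on every call.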
0
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10835473
Is the high memory allocated by the C# code? Standard marshalling is BSTR -> the called function has to free the memory.
0
 
LVL 2

Author Comment

by:MyersA
ID: 10837148
I really wouldn't know if the high memory is allocated by the C# code...
This is the basic code. I know it doesn't follow the best coding practices (e.g. using a foreach and string casting instead of ToString()):

[DllImport("ZIP4_W32.DLL")]
public static extern int z4adrinq([In,Out][MarshalAs(UnmanagedType.LPStruct)] ZIP4_PARM zip4_parm);
private void btn_run_Click(object sender, System.EventArgs e)
{
.....
    for (int i=0;i<Datatable_AuditList.Rows.Count;i++)  //count = 15,000
    {
        Application.DoEvents();
        sFBU = Datatable_AuditList.Rows[i]["FBU"].ToString();
        sDEL = Datatable_AuditList.Rows[i]["DEL"].ToString();
        sCTY = Datatable_AuditList.Rows[i]["CTY"].ToString();
        sZ4 = Datatable_AuditList.Rows[i]["Z4"].ToString();
        ZIP4_PARM zip4Parm = new ZIP4_PARM();
        zip4Parm.iprurb = sFBU;
        zip4Parm.iadl1 = sDEL;
        zip4Parm.ictyi = sCTY;
        int iError = z4adrinq(zip4Parm);   //dll call - zip4Parm's 60+ fields are modified
        statusBar.Panels[0].Text = "Row Count = " + iRecCount.ToString() + "rows ";
        iRecCount++;
    }
}
--------------
This is the struct. It's a bit bigger but the members are all basically the same:
namespace ZipMaster
{
...
     [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
     public class ZIP4_PARM
     {
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=4  )]
          public string     rsvd0;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl1;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl2;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     ictyi;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=3  )]
          public string     istai;
          public string     county;
          public short   respn;
          public char     retcc;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=12 )]
          public string     adrkey;
          public char       auto_zone_ind;
          public struct footer
          {
               public char  a;
               public char  b;
               public char  c;
               public char  d;
               public char  e;
          }
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=6)]
          public string  rsvd3;
     }
}
0
 
LVL 10

Accepted Solution

by:
ptmcomp earned 125 total points
ID: 10839988
Move this line: "ZIP4_PARM zip4Parm = new ZIP4_PARM();" before the for loop and make sure that all fields are reset on each iteration. (I don't know whether it really changes anything.)

Every memory copy / reallocation should be avoided.

Maybe explicit ANSI conversion is faster:
   byte[] bytes = new byte[51];  // do not reallocate this; keep the array and reuse it
   ASCIIEncoding ascii = new ASCIIEncoding();
   int len = iadl1.Length;
   if (len > 51)
   {
        len = 51;
   }
   Array.Clear(bytes, 0, bytes.Length);  // zero the buffer before each reuse
   ascii.GetBytes(iadl1, 0, len, bytes, 0);

If this helps, then replace the strings in the struct with byte arrays and move the conversion code above into a function (e.g. ToASCII(string text, byte[] bytes, int maxLength)).
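
A helper with the suggested signature might be sketched as follows. The class name AsciiFields, the null handling and the returned byte count are assumptions added for illustration. Whether maxLength should leave room for a terminating NUL depends on what the DLL expects, which the thread does not state.

```csharp
using System;
using System.Text;

public static class AsciiFields
{
    // Copies at most maxLength characters of text into bytes as ASCII and
    // zero-fills the rest of the buffer. Returns the number of bytes written.
    // The caller keeps and reuses the same buffer, so nothing is reallocated.
    public static int ToASCII(string text, byte[] bytes, int maxLength)
    {
        Array.Clear(bytes, 0, bytes.Length);
        if (text == null)
        {
            return 0;
        }
        int len = Math.Min(text.Length, Math.Min(maxLength, bytes.Length));
        return Encoding.ASCII.GetBytes(text, 0, len, bytes, 0);
    }

    public static void Main()
    {
        byte[] buffer = new byte[51];                     // reused for every record
        int written = ToASCII("PO BOX 123", buffer, 50);  // 50 leaves one byte for a NUL (assumption)
        Console.WriteLine("wrote {0} bytes", written);
    }
}
```

Characters outside the ASCII range are replaced with '?' by Encoding.ASCII, so this only suits data that is known to be plain ASCII.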

Use a System.Windows.Forms.Timer to update the status bar panel. If you update it too often, it just consumes too much CPU time. I would also run the loop in a separate thread and then call Sleep(0) where you currently call DoEvents.
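
The worker-thread-plus-timer arrangement can be sketched in a console program as below. This swaps in System.Threading.Timer and Console.WriteLine so the example runs without a form; in the actual WinForms app the timer would be a System.Windows.Forms.Timer whose Tick handler sets statusBar.Panels[0].Text. BatchProgress and its members are hypothetical names, and the loop body is a placeholder for the real struct fill and DLL call.

```csharp
using System;
using System.Threading;

public static class BatchProgress
{
    private static int processed;   // written by the worker, read by the timer

    // Processes recordCount records on a worker thread while a timer reports
    // progress every reportEveryMs milliseconds. Returns the final count.
    public static int Run(int recordCount, int reportEveryMs)
    {
        processed = 0;

        Thread worker = new Thread(() =>
        {
            for (int i = 0; i < recordCount; i++)
            {
                // Placeholder: the real loop would fill the struct and call the DLL here.
                Interlocked.Increment(ref processed);
            }
        });

        // The status text is refreshed on the timer's schedule, not once per record.
        using (Timer timer = new Timer(
            state => Console.WriteLine("Row Count = {0} rows", Volatile.Read(ref processed)),
            null, 0, reportEveryMs))
        {
            worker.Start();
            worker.Join();
        }
        return processed;
    }

    public static void Main()
    {
        int total = Run(15000, 250);
        Console.WriteLine("Done: {0} rows", total);
    }
}
```

The point of the design is that the per-record loop does no UI work at all, so the cost of repainting the status bar no longer scales with the number of records.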
0