CharSet=CharSet.Ansi is SLOW during batch processing (with interop functionality) of 15,000 records

In my Windows app, I'm running a batch process made up of a for loop that runs 15,000 times. Each iteration copies 3 fields of data from a row to a struct and sends the struct to an external method (using DllImport).
The main problem is that the StructLayout attribute of this struct (which is also a parameter to the method) is set to [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)], and this makes the process take over 25 mins. On the other hand, with this setting the external function writes correctly to the other fields of the struct.

If we change it to CharSet.Unicode, the process takes less than 1 min., but it doesn't seem to write anything to the struct. After I run the process, I notice that the struct fields contain strange characters.
Is there some way I can keep using Unicode to keep the speed? I need to be able to read the data in the struct after the DLL call.

Thank you.
ptmcomp Commented:
Move this line: " ZIP4_PARM zip4Parm = new ZIP4_PARM(); " before the for-loop and make sure that all fields are reset. (I don't know if it really changes something)

Every memory copy / reallocation should be avoided.

Maybe explicit ANSI conversion is faster:
   byte[] bytes = new byte[51];  // do not reallocate this for every row; keep the array and reuse it!
   ASCIIEncoding ascii = new ASCIIEncoding();
   int len = iadl1.Length;
   if (len > 51)
        len = 51;
   ascii.GetBytes(iadl1, 0, len, bytes, 0);

If this helps, then replace the strings in the struct with byte arrays and move the conversion code above into a function (e.g. ToASCII(string text, byte[] bytes, int maxLength)).
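A minimal sketch of what the suggested ToASCII function could look like. The class name AnsiBuffer is hypothetical, and the code assumes the C side expects zero-terminated fields, so one byte of each buffer is kept free for the terminator. Characters outside ASCII are replaced by '?' by the encoder.

```csharp
using System;
using System.Text;

public static class AnsiBuffer
{
    // Hypothetical helper following the ToASCII(text, bytes, maxLength) idea.
    // Encodes text into a caller-owned buffer that is reused for every row.
    // Returns the number of bytes written, not counting the terminating zero.
    public static int ToASCII(string text, byte[] bytes, int maxLength)
    {
        if (bytes == null) throw new ArgumentNullException("bytes");
        if (maxLength < 1 || maxLength > bytes.Length)
            throw new ArgumentOutOfRangeException("maxLength");
        if (text == null) text = string.Empty;

        // Assumption: the field is zero-terminated, so keep one byte free.
        int len = Math.Min(text.Length, maxLength - 1);

        // Wipe whatever the previous row left behind in the reused buffer.
        Array.Clear(bytes, 0, bytes.Length);
        return Encoding.ASCII.GetBytes(text, 0, len, bytes, 0);
    }
}
```

The buffer would be allocated once before the loop (for example `byte[] iadl1 = new byte[51];`) and passed to `AnsiBuffer.ToASCII(sDEL, iadl1, 51)` for each row.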

Use a Windows.Forms.Timer to update the statusbar panel. If you're updating too often, it just consumes too much CPU time. I would also run the loop in a separate thread and then call Sleep(0) where you call DoEvents.
How many characters are you talking about here? There is no reason why changing the char set should add 24 minutes. Is it possible that the DLL is actually hitting an error and returning early (thus not doing some of its processing), and that this is what causes the speed gain?
MyersA (Author) Commented:
The DLL method returns an error code when it runs, and in both cases (ANSI or Unicode) it returns 0 (Success). The struct itself holds many characters (several fields of 51 chars, several other fields of 30 chars, etc.), so in total it should be about 750 chars.
Basically, the process is to fill three fields (x, y, z) of the struct with string data (a PO Box address, the city, and the state) and call the DLL method with the struct as a parameter. The DLL call modifies the struct by reading the fields I filled previously, doing a search, and writing the information it found (additional information about that address) back into the struct. The batch consists of doing this 15,000 times.
I've been told that if the unmanaged call (the DLL was made in C) expects ANSI but receives Unicode, it'll mess up.
But the previous version of this application (which also used this DLL), made in MS C/C++, processed this whole batch in under 4 mins. That tells me that this StructLayout is adding way too much overhead. Is there any other alternative?
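The remark about ANSI versus Unicode can be checked without calling the DLL at all. The sketch below uses two hypothetical one-field structs (not the real ZIP4_PARM) that differ only in CharSet and asks the marshaler for their unmanaged size. A ByValTStr field takes one byte per character under CharSet.Ansi and two under CharSet.Unicode, so the Unicode version has a different layout from what a C function built for char arrays expects.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical one-field structs for illustration only.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct AnsiField
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string iadl1;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct UnicodeField
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)]
    public string iadl1;
}

public static class LayoutCheck
{
    // Size in bytes of the unmanaged image the marshaler builds for each struct.
    public static int AnsiSize()
    {
        return Marshal.SizeOf(typeof(AnsiField));
    }

    public static int UnicodeSize()
    {
        return Marshal.SizeOf(typeof(UnicodeField));
    }
}
```

If the two sizes differ, the fast Unicode run is handing the DLL a block of memory laid out differently from the one it was compiled against, which would be consistent with the strange characters seen after the call.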

What platform is it running on? WinNT, Win2000, etc. run on Unicode and must convert to ANSI and back if you are binding the ANSI function.
MyersA (Author) Commented:
My development machine is Windows XP. We'll be running it on all types of Windows platforms (95, 98, 2000, XP, etc.).
After I create the executable, does it matter what platform I run the application on? For example, if I use Ansi, can I still run it on a Win95 or Win98 machine?
Also, the previous version of this application (which also used this DLL), made in MS C/C++, processed this whole batch in under 4 mins. Is it possible that changing the CharSet to ANSI can add over 30 mins. of overhead time?

You can use CharSet.Auto, which uses ANSI on 95/98/Me and Unicode on WinNT/2000/XP/2003.
Depending on how much data needs to be converted, it can be quite an overhead, but 30 mins. seems like a lot. Anyway, if it's not a lot of work, check if there is a difference. Make sure to compare release builds without debug hooks! Debugging can slow down an application by factors.
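One way to check whether string conversion alone could account for that much time is to time the marshaling step in isolation, without the DLL. The sketch below is only a rough micro-benchmark under stated assumptions: AnsiRecord, UnicodeRecord and MarshalTiming are hypothetical names, the structs carry just three 51-character fields, and the loop copies the struct to unmanaged memory and back, which approximates the conversion work done around each call.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical cut-down records, not the real ZIP4_PARM.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public struct AnsiRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string iprurb;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string ictyi;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
public struct UnicodeRecord
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string iprurb;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string ictyi;
}

public static class MarshalTiming
{
    // Copies the record to unmanaged memory and back 'iterations' times
    // and returns the elapsed time in milliseconds.
    public static long TimeRoundTrips(object record, int iterations)
    {
        Type type = record.GetType();
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(type));
        try
        {
            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                Marshal.StructureToPtr(record, buffer, false);
                record = Marshal.PtrToStructure(buffer, type);
            }
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Running `MarshalTiming.TimeRoundTrips` with 15,000 iterations for both record types in a release build, started outside the debugger, gives a baseline. If both finish quickly, the extra minutes are probably being spent somewhere other than the character conversion.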
MyersA (Author) Commented:
I switched to CharSet.Auto, but it works like Unicode (very fast processing, but the values returned are unreadable). So I had to leave it with Ansi. Is that possible?

How can I compare the Release builds without debug hooks? I've run the release build (by that, I mean the exe in the Release folder) on other PCs and it runs equally slowly.

MyersA (Author) Commented:
I also noticed that toward the end of the 15,000 records (at 12,000+) it starts getting really slow. The CPU usage for it is at 90%, and the memory usage is high too.

Can you provide some code? It seems that there is more than one problem. What kind of component are you using (through the DllImport)? Do you have the source code for it? If yes, can you translate it to C#? If not, you should maybe use C++, which allows you to use ANSI (in C# every string is Unicode), or use byte arrays instead of strings.
Is the high memory allocated by the C# code? Standard marshalling is BSTR, so the called function has to free the memory.
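A minimal sketch of the "byte arrays instead of strings" idea. ByteArrayParm is a hypothetical cut-down class with three fields whose names and sizes are borrowed from the thread; the real struct has many more members. With ByValArray fields the marshaler copies bytes as they are, so no per-string character conversion takes place, and the managed code decodes only the fields it actually needs after the call. The Read helper assumes the DLL writes zero-terminated ASCII text.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical cut-down version of the parameter block with byte-array fields.
[StructLayout(LayoutKind.Sequential)]
public class ByteArrayParm
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 51)]
    public byte[] iadl1 = new byte[51];

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 51)]
    public byte[] ictyi = new byte[51];

    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] istai = new byte[3];
}

public static class ByteArrayFields
{
    // Decodes a field up to the first zero byte (assumed terminator).
    public static string Read(byte[] field)
    {
        int end = Array.IndexOf(field, (byte)0);
        if (end < 0) end = field.Length;
        return Encoding.ASCII.GetString(field, 0, end);
    }

    // Unmanaged size of the block, to compare against the C header.
    public static int UnmanagedSize()
    {
        return Marshal.SizeOf(typeof(ByteArrayParm));
    }
}
```

Comparing `ByteArrayFields.UnmanagedSize()` with `sizeof` of the C struct is a cheap sanity check before calling the DLL with such a layout.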
MyersA (Author) Commented:
I really wouldn't know if the high memory is allocated by the C# code...
This is the basic code. I know it doesn't display the best code practices (eg. use a foreach and string casting instead of ToString() ) :

public static extern int z4adrinq([In,Out][MarshalAs(UnmanagedType.LPStruct)] ZIP4_PARM zip4_parm);
private void btn_run_Click(object sender, System.EventArgs e)
{
    for (int i=0;i<Datatable_AuditList.Rows.Count;i++)  //count = 15,000
    {
        sFBU = Datatable_AuditList.Rows[i]["FBU"].ToString();
        sDEL = Datatable_AuditList.Rows[i]["DEL"].ToString();
        sCTY = Datatable_AuditList.Rows[i]["CTY"].ToString();
        sZ4 = Datatable_AuditList.Rows[i]["Z4"].ToString();
        ZIP4_PARM zip4Parm = new ZIP4_PARM();
        zip4Parm.iprurb = sFBU;
        zip4Parm.iadl1 = sDEL;
        zip4Parm.ictyi = sCTY;
        int iError = z4adrinq(zip4Parm);   //dll call - zip4Parm's 60+ fields are modified
        statusBar.Panels[0].Text = "Row Count = " + iRecCount.ToString() + "rows ";
    }
}
This is the struct. It's a bit bigger but the members are all basically the same:
namespace ZipMaster
{
     [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
     public class ZIP4_PARM
     {
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=4  )]
          public string     rsvd0;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl1;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl2;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     ictyi;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=3  )]
          public string     istai;
          public string     county;
          public short   respn;
          public char     retcc;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=12 )]
          public string     adrkey;
          public char       auto_zone_ind;
          public struct footer
          {
               public char  a;
               public char  b;
               public char  c;
               public char  d;
               public char  e;
          }
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=6)]
          public string  rsvd3;
     }
}