Solved

CharSet=CharSet.Ansi SLOW during batch processing (with interop functionality) of 15,000 records

Posted on 2004-04-14
Last Modified: 2012-06-21
In my Windows app, I'm running a batch process made up of a FOR loop that runs 15,000 times, copies 3 fields of data from each row to a struct, and sends the struct to an external method (using DllImport).
The main problem is that the StructLayout attribute of this struct (which is also a parm to the method) is set to [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)] - this makes the process take over 25 mins. It does, however, let the external function write correctly to the other fields of the struct.

If we change it to CharSet.Unicode, the process takes less than 1 min., but it doesn't seem to write anything to the struct. After I run the process, I notice that the struct fields contain strange characters.
Is there some way I can keep using Unicode to keep the speed? I need to be able to read the data in the struct after the dll call.

Thank you.
Question by:MyersA
 
LVL 37

Expert Comment

by:gregoryyoung
ID: 10825436
How many characters are you talking about here? There is no reason why changing the char set should add 24 minutes. Is it possible that the DLL is actually hitting an error and returning early (thus skipping some of its processing), and that this is what causes the speed gain?
 
LVL 2

Author Comment

by:MyersA
ID: 10825901
The DLL method returns an error code when it runs, and in both cases (ANSI or Unicode) it returns 0 (Success). The struct itself holds many characters (several fields of 51 chars, several other fields of 30 chars, etc...), so in total it should be about 750 chars.
Basically, the process is to fill three fields (x, y, z) of the struct with string data (a PO Box address, the city, and the state) and call the dll method with the struct as parm. The dll call modifies the struct by reading the fields I filled previously, doing a search, and modifying the struct with the information it found (additional information about that address). The batch consists of doing this 15,000 times.
I've been told that if the unmanaged call (the dll was made in C) expects ANSI but receives Unicode, it'll mess up.
But the previous version of this application (which also used this DLL), made in MS C/C++, processed this whole batch in under 4 mins, which tells me that this StructLayout is adding way too much overhead. Is there any other alternative?

Vaughn
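
The "strange characters" and the suspiciously fast Unicode run are both consistent with a plain byte-layout mismatch. The sketch below is only an illustration under an assumption: that the C DLL treats each field as a zero-terminated char buffer (the thread does not show the C header). CharSetMismatchDemo and ReadAsCString are hypothetical names, not part of ZIP4_W32.DLL.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
static class CharSetMismatchDemo
{
    // Mimics how a C function reads a char* field: bytes up to the first zero.
    public static string ReadAsCString(byte[] buffer)
    {
        int end = Array.IndexOf(buffer, (byte)0);
        if (end < 0) end = buffer.Length;
        return Encoding.ASCII.GetString(buffer, 0, end);
    }

    // What the C side would see if the field were marshalled with CharSet.Ansi.
    public static string SeenByCWithAnsi(string text)
    {
        return ReadAsCString(Encoding.ASCII.GetBytes(text));
    }

    // What the C side would see if the field were marshalled as UTF-16
    // (CharSet.Unicode): every second byte of plain ASCII text is zero.
    public static string SeenByCWithUnicode(string text)
    {
        return ReadAsCString(Encoding.Unicode.GetBytes(text));
    }

    // The reverse direction: ANSI bytes written by the DLL, decoded as UTF-16.
    public static string AnsiBytesDecodedAsUnicode(string text)
    {
        return Encoding.Unicode.GetString(Encoding.ASCII.GetBytes(text));
    }
}
```

With the input "MIAMI", SeenByCWithAnsi returns "MIAMI" while SeenByCWithUnicode returns just "M", so under this assumption the DLL would be searching on one-character inputs (which could explain why it finishes so quickly). AnsiBytesDecodedAsUnicode shows the other half: two ANSI bytes get fused into one unrelated UTF-16 character.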
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10826151
What platform is it running on? WinNT, Win2000, etc. run on Unicode and must convert to ANSI and back if you are binding the ANSI function.

 
LVL 2

Author Comment

by:MyersA
ID: 10827730
My development machine is Windows XP. We'll be running it on all types of Windows platforms (95, 98, 2000, XP, etc...)
After I create the executable, does it matter what platform I run the application on? For example, if I use Ansi, can I still run it on a Win95/Win98 machine?
Also, the previous version of this application (which also used this DLL), made in MS C/C++, processed this whole batch in under 4 mins. Is it possible that changing the CharSet to ANSI can add over 30 mins. of overhead time?

Vaughn
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10831005
You can use CharSet.Auto, which uses ANSI on 95/98/Me and Unicode on WinNT/2000/XP/2003.
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10831015
Depending on how much data needs to be converted it can be quite an overhead - but 30 mins seems like a lot. Anyway, if it's not a lot of work, check whether there is a difference. Make sure to compare release builds without debug hooks! Debugging can slow down an application by large factors.
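
One way to check whether the ANSI conversion alone could account for the time is to measure the marshalling step in isolation, without the DLL. This is a rough sketch, not a benchmark of the real ZIP4_PARM: SampleParm is a hypothetical three-field stand-in, MarshalTiming is an invented name, and Stopwatch assumes .NET 2.0 or later.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical stand-in for ZIP4_PARM with only three ANSI string fields.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
class SampleParm
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string iadl1;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 51)] public string ictyi;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 3)]  public string istai;
}

static class MarshalTiming
{
    // Copies the object to unmanaged memory and back `count` times
    // (the same Unicode -> ANSI -> Unicode conversion an [In,Out] call needs)
    // and returns the elapsed time in milliseconds.
    public static long RoundTripMilliseconds(int count)
    {
        SampleParm parm = new SampleParm();
        parm.iadl1 = "PO BOX 123";
        parm.ictyi = "MIAMI";
        parm.istai = "FL";

        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(SampleParm)));
        try
        {
            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < count; i++)
            {
                Marshal.StructureToPtr(parm, buffer, false);  // managed -> ANSI bytes
                Marshal.PtrToStructure(buffer, parm);         // ANSI bytes -> managed
            }
            watch.Stop();
            return watch.ElapsedMilliseconds;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Calling MarshalTiming.RoundTripMilliseconds(15000) from a release build run outside the debugger gives a figure to set against the 25 minutes. If it comes out far smaller, the conversion by itself is probably not the bottleneck and the time is going somewhere else (the DLL's search, the UI updates, or memory growth).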
 
LVL 2

Author Comment

by:MyersA
ID: 10834682
I switched to CharSet.Auto but it works like Unicode (very fast processing but the values returned are unreadable). So I had to leave it with Ansi. Is that possible?

How can I compare the Release builds without debug hooks? I've run the release build (by that, I mean the exe in the Release folder) on other PCs and it runs equally slow.

thanks.
 
LVL 2

Author Comment

by:MyersA
ID: 10834950
I also noticed that towards the end of the 15,000 records (at 12,000+) it starts getting really slow. The CPU usage for it is at 90%, and the memory usage is high too.

Vaughn
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10835452
Can you provide some code? It seems that there is more than one problem. What kind of component are you using (via DllImport)? Do you have the source code for it? If yes, can you translate it to C#? If not, you should maybe use C++, which allows you to use ANSI (in C# every string is Unicode), or use byte arrays instead of strings.
 
LVL 10

Expert Comment

by:ptmcomp
ID: 10835473
Is the high memory allocated by the C# code? Standard marshalling is BSTR -> the called function has to free the memory.
 
LVL 2

Author Comment

by:MyersA
ID: 10837148
I really wouldn't know if the high memory is allocated by the C# code...
This is the basic code. I know it doesn't display the best code practices (eg. use a foreach and string casting instead of ToString() ) :

[DllImport("ZIP4_W32.DLL")]
public static extern int z4adrinq([In,Out][MarshalAs(UnmanagedType.LPStruct)] ZIP4_PARM zip4_parm);
private void btn_run_Click(object sender, System.EventArgs e)
{
.....
    for (int i=0;i<Datatable_AuditList.Rows.Count;i++)  //count = 15,000
    {
        Application.DoEvents();
        sFBU = Datatable_AuditList.Rows[i]["FBU"].ToString();
        sDEL = Datatable_AuditList.Rows[i]["DEL"].ToString();
        sCTY = Datatable_AuditList.Rows[i]["CTY"].ToString();
        sZ4 = Datatable_AuditList.Rows[i]["Z4"].ToString();
        ZIP4_PARM zip4Parm = new ZIP4_PARM();
        zip4Parm.iprurb = sFBU;
        zip4Parm.iadl1 = sDEL;
        zip4Parm.ictyi = sCTY;
        int iError = z4adrinq(zip4Parm);   //dll call - zip4Parm's 60+ fields are modified
        statusBar.Panels[0].Text = "Row Count = " + iRecCount.ToString() + "rows ";
        iRecCount++;
    }
}
--------------
This is the struct. It's a bit bigger but the members are all basically the same:
namespace ZipMaster
{
...
     [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Ansi)]
     public class ZIP4_PARM
     {
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=4  )]
          public string     rsvd0;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl1;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     iadl2;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=51 )]
          public string     ictyi;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=3  )]
          public string     istai;
          public string     county;
          public short   respn;
          public char     retcc;
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=12 )]
          public string     adrkey;
          public char       auto_zone_ind;
          public struct footer
          {
               public char  a;
               public char  b;
               public char  c;
               public char  d;
               public char  e;
          }
          [MarshalAs( UnmanagedType.ByValTStr, SizeConst=6)]
          public string  rsvd3;
     }
}
 
LVL 10

Accepted Solution

by:
ptmcomp earned 125 total points
ID: 10839988
Move this line: " ZIP4_PARM zip4Parm = new ZIP4_PARM(); " before the for-loop and make sure that all fields are reset. (I don't know if it really changes something)

Every memory copy / reallocation should be avoided.

Maybe explicit ANSI conversion is faster:
   byte[] bytes = new byte[51];  // do not reallocate this; keep the array and reuse it!
   ASCIIEncoding ascii = new ASCIIEncoding();  // needs "using System.Text;"
   int len = iadl1.Length;
   if (len > 51)
   {
        len = 51;
   }
   Array.Clear(bytes, 0, bytes.Length);  // zero the buffer before each reuse
   ascii.GetBytes(iadl1, 0, len, bytes, 0);

If this helps then replace the strings in the struct by byte arrays and move the converting code above in a function (e.g. ToASCII(string text, byte[] bytes, int maxLength))
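
A minimal sketch of what such a helper and a byte-array field might look like. The names AnsiFields, FromAscii, and Zip4ParmBytes are hypothetical, and the sketch assumes the DLL expects each field to be zero-terminated, so it keeps the last byte free (at most 50 characters in a 51-byte field); the thread does not confirm that.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical one-field version of the struct using a byte array
// instead of a ByValTStr string.
[StructLayout(LayoutKind.Sequential)]
class Zip4ParmBytes
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 51)]
    public byte[] iadl1 = new byte[51];
}

static class AnsiFields
{
    // Writes text into a reusable fixed-size field as ASCII, zero-padded,
    // truncating so that the final byte always stays zero.
    public static void ToAscii(string text, byte[] field)
    {
        Array.Clear(field, 0, field.Length);
        if (string.IsNullOrEmpty(text) || field.Length == 0) return;
        int len = Math.Min(text.Length, field.Length - 1);
        Encoding.ASCII.GetBytes(text, 0, len, field, 0);
    }

    // Reads a field back after the call, up to the first zero byte.
    public static string FromAscii(byte[] field)
    {
        int end = Array.IndexOf(field, (byte)0);
        if (end < 0) end = field.Length;
        return Encoding.ASCII.GetString(field, 0, end);
    }
}
```

The byte arrays are allocated once with the object and only cleared and refilled per row, which is the "avoid every reallocation" point above. Note that Encoding.ASCII replaces any non-ASCII character with '?', which may or may not match what the DLL expects for accented names.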

Use a Windows.Forms.Timer to update the statusbar panel. If you're updating too often it just consumes too much CPU time. I would also run the loop in a separate thread and then call Sleep(0) where you call DoEvents.
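
If a Timer feels like too much plumbing, a simpler stand-in (not the Timer approach itself) is to let the loop touch the status bar only when enough time has passed. This is a sketch under assumptions: ProgressThrottle is a hypothetical class, and Stopwatch requires .NET 2.0 or later.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: limits how often a loop performs an expensive update.
sealed class ProgressThrottle
{
    private readonly long intervalMs;
    private readonly Stopwatch clock = Stopwatch.StartNew();
    private long lastReportMs = -1;   // -1 means nothing has been reported yet

    public ProgressThrottle(long intervalMs)
    {
        this.intervalMs = intervalMs;
    }

    // Returns true on the first call and then at most once per interval.
    public bool ShouldReport()
    {
        long now = clock.ElapsedMilliseconds;
        if (lastReportMs >= 0 && now - lastReportMs < intervalMs)
        {
            return false;
        }
        lastReportMs = now;
        return true;
    }
}
```

In the loop from the question this would wrap both the status text assignment and Application.DoEvents() in an "if (throttle.ShouldReport())" check, with something like new ProgressThrottle(250) created before the loop, so the UI is repainted a few times per second instead of 15,000 times.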