Solved

REPLACE

Posted on 2004-09-16
Last Modified: 2010-04-15
Hi Experts,

I have data coming in, but it contains non-ASCII characters in some places. How do I get rid of them, or replace them with ASCII equivalents if that is possible?

Thank you.
Question by:fpoyavo
8 Comments
 
LVL 7

Expert Comment

by:CJCraft
ID: 12080581
The following code snippet should give you the idea.

The idea is to go through the data and check whether each byte's ASCII value is in the range of values you consider valid.
You may need to post more details before you get the answer you are after.

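// bData holds the raw input; dwBytesRead is the number of valid bytes in it.
byte[] bScrubbed = new byte[(int)dwBytesRead];
int nPos = 0;

// Keep only printable ASCII bytes (0x20 space through 0x7E tilde);
// adjust the range to whatever you consider valid.
for (int i = 0; i < (int)dwBytesRead; i++)
{
   if (bData[i] >= 0x20 && bData[i] <= 0x7E)
   {
      bScrubbed[nPos] = bData[i];
      nPos++;
   }
}
// Only the first nPos bytes of bScrubbed are meaningful.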
 
LVL 37

Expert Comment

by:gregoryyoung
ID: 12080840
You can do this using the ASCII encoding class. I can give you specific code to handle the case if you can tell me what encoding you are coming from (e.g. UTF-16).
 
LVL 1

Author Comment

by:fpoyavo
ID: 12083120
Gregory,

Yep. It's UTF-16.
LVL 5

Expert Comment

by:tzxie2000
ID: 12083241
Suppose the data you receive is correct but in UTF-16, and it is stored in s[]:

char[] s;

// some code to receive data in s

// Encoding.Unicode is .NET's name for UTF-16 (little-endian)
Encoder UTFEncoder = Encoding.Unicode.GetEncoder();

int byteCount = UTFEncoder.GetByteCount(s, 0, s.Length, true);
byte[] bytes = new byte[byteCount];
int bytesEncodedCount = UTFEncoder.GetBytes(s, 0, s.Length, bytes, 0, true);
// the characters are now encoded as UTF-16 bytes
// bytesEncodedCount is the number of bytes actually written
Console.WriteLine("{0} bytes used to encode characters.", bytesEncodedCount);
// show the encoded bytes
Console.Write("Encoded bytes: ");
foreach (byte b in bytes)
{
    Console.Write("[{0}]", b);
}





 
LVL 37

Expert Comment

by:gregoryyoung
ID: 12084127
Once you have them in a byte array, as shown by tzxie2000, you can then get them into an ASCII string by using the ASCII encoding object's GetString() method.
Note that null characters cause issues with strings in .NET (as with C-style strings), so they should be removed prior to this process.
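A minimal sketch of that byte-array route, for illustration only. It assumes the input really is UTF-16 little-endian and adds one step not spelled out above: Encoding.Convert re-encodes the UTF-16 bytes as ASCII before GetString() is called, so no stray null bytes end up in the string. The class name and sample text are made up for the example.

```csharp
using System;
using System.Text;

class AsciiConvertSketch
{
    static void Main()
    {
        // Sample input (assumption): contains one non-ASCII character, U+00E9.
        string input = "caf\u00e9 $5";

        // UTF-16 little-endian bytes, as produced by Encoding.Unicode.
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(input);

        // Re-encode the UTF-16 bytes as ASCII. With the default Encoding.ASCII,
        // characters that have no ASCII equivalent are replaced by '?'.
        byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, utf16Bytes);

        // Turn the ASCII bytes back into a string.
        string ascii = Encoding.ASCII.GetString(asciiBytes);
        Console.WriteLine(ascii);
    }
}
```

This keeps the string length unchanged and marks each unmappable character with '?', which may or may not be what you want; stripping them instead needs a separate filtering step.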
 
LVL 10

Accepted Solution

by:ptmcomp (earned 500 total points)
ID: 12091194
string result = Regex.Replace(input, @"[^\x20-\x7e]", "");  // removes all chars that are not in the range 0x20 - 0x7e
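The line above only removes characters. For the "replace with ASCII equivalent" half of the question, one possible approach (not from this thread) is to decompose accented letters with string.Normalize first, so the base letter survives the same regex. A rough sketch; ToPrintableAscii is a hypothetical helper name, and characters without a decomposition (such as the euro sign) are still simply dropped.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

class ScrubSketch
{
    // Hypothetical helper, named for this example only.
    static string ToPrintableAscii(string input)
    {
        // Split accented letters into base letter + combining mark
        // (for example U+00E9 becomes 'e' followed by U+0301).
        string decomposed = input.Normalize(NormalizationForm.FormD);

        // Remove everything outside 0x20 - 0x7e; this also removes the combining marks.
        return Regex.Replace(decomposed, @"[^\x20-\x7e]", "");
    }

    static void Main()
    {
        // Input: "café", a tab, a euro sign, then "5".
        Console.WriteLine(ToPrintableAscii("caf\u00e9\t\u20ac5")); // prints "cafe5"
    }
}
```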
 
LVL 1

Author Comment

by:fpoyavo
ID: 12094093
ptmcomp,

You are good. How can I use a regex to replace any special character with \\ followed by that character?
Example:  $    to   \\$

Thank you.
 
LVL 10

Expert Comment

by:ptmcomp
ID: 12095124
I'm not sure if I understand your question.
If you want to replace "$" with "\\$" and, let's say, "@" with "\\@", then you could use this:
string result = Regex.Replace(input, @"[$@]", @"\\$0");
Note: "$0" stands for the text matched by the whole expression.
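A small runnable sketch of that replacement, with a made-up input string. In a .NET replacement pattern the backslash has no special meaning, so the verbatim string @"\\$0" inserts two literal backslashes in front of each match.

```csharp
using System;
using System.Text.RegularExpressions;

class EscapeSketch
{
    static void Main()
    {
        // Sample input (assumption) containing both special characters.
        string input = "cost: $5 @home";

        // [$@] matches either '$' or '@'; $0 in the replacement is the matched text.
        string result = Regex.Replace(input, @"[$@]", @"\\$0");

        Console.WriteLine(result); // prints: cost: \\$5 \\@home
    }
}
```

To cover more characters, add them to the character class; a few (such as ']', '\\', '^' and '-') need escaping inside the brackets.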