• Status: Solved

REPLACE

Hi Experts,

I have data coming in ... but it has non-ASCII characters in some places. How do I get rid of those, or replace them with their ASCII equivalents if possible?

Thank you.
Asked by: fpoyavo
1 Solution
 
CJCraft commented:
The following code snippet should give you the idea.

The idea is to go through the data and check whether each byte's value is in the range of values you consider valid.
You may need to post more details before you get the answer you are after.

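// bData holds the bytes that were read; dwBytesRead is how many of them are valid
byte[] bScrubbed = new byte[(int)dwBytesRead];
int nPos = 0;

// Keep only printable ASCII (0x20 space through 0x7E tilde);
// adjust the bounds to whatever range you consider valid
for (int i = 0; i < (int)dwBytesRead; i++)
{
   if (bData[i] >= 0x20 && bData[i] <= 0x7E)
   {
      bScrubbed[nPos] = bData[i];
      nPos++;
   }
}
// nPos is now the number of bytes kept in bScrubbed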
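As a self-contained variant of the loop above, here is a minimal sketch. The class and method names (AsciiScrubber, Scrub) are made up for illustration, and the printable range 0x20 to 0x7E is an assumption about what counts as valid.

```csharp
using System;

public static class AsciiScrubber
{
    // Returns a new array containing only the printable ASCII bytes
    // (0x20..0x7E) of the input. The range is an assumption; widen it
    // if you also want to keep tabs or line breaks.
    public static byte[] Scrub(byte[] data)
    {
        byte[] scrubbed = new byte[data.Length];
        int pos = 0;

        foreach (byte b in data)
        {
            if (b >= 0x20 && b <= 0x7E)
            {
                scrubbed[pos++] = b;
            }
        }

        // Trim the unused tail so the result holds only the kept bytes.
        Array.Resize(ref scrubbed, pos);
        return scrubbed;
    }
}
```

Trimming with Array.Resize means the caller does not have to track a separate count of valid bytes.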
 
gregoryyoung commented:
You can do this using the ASCII encoding class. I can give you specific code to handle the case if you can tell me what encoding you are coming from (e.g. UTF-16).
 
fpoyavo (author) commented:
Gregory,

Yep. It's UTF-16.

 
tzxie2000 commented:
Suppose the data you receive is valid UTF-16 and is stored in s[].

// needs: using System; using System.Text;
char[] s;

// some code to receive data in s

// Encoding.Unicode is the .NET name for UTF-16 (little-endian)
Encoder UTFEncoder = Encoding.Unicode.GetEncoder();

int byteCount = UTFEncoder.GetByteCount(s, 0, s.Length, true);
Byte[] bytes = new Byte[byteCount];
int bytesEncodedCount = UTFEncoder.GetBytes(s, 0, s.Length, bytes, 0, true);
// bytes now holds the UTF-16 encoding of s;
// bytesEncodedCount is the number of bytes actually written
Console.WriteLine("{0} bytes used to encode characters.", bytesEncodedCount);
// show the encoded bytes
Console.Write("Encoded bytes: ");
foreach (Byte b in bytes) {
    Console.Write("[{0}]", b);
}
 
gregoryyoung commented:
Once you have them in a byte array as shown by tzxie2000, you can get them into an ASCII string by using the ASCII encoding object's .GetString() method.
Note that null characters cause issues with strings in .NET (as with C-style strings), so they should be removed prior to this process.
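One way to put that together is sketched below, assuming the input really is little-endian UTF-16. It swaps in Encoding.Convert to transcode the bytes before calling GetString, which is one possible reading of the suggestion rather than necessarily the exact code the commenter had in mind. The class and method names (Utf16ToAscii, ToAscii) are hypothetical.

```csharp
using System;
using System.Text;

public static class Utf16ToAscii
{
    // Transcodes UTF-16 (little-endian) bytes to an ASCII string.
    // With the default Encoding.ASCII settings, characters that have no
    // ASCII representation are replaced with '?'.
    public static string ToAscii(byte[] utf16Bytes)
    {
        byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, utf16Bytes);
        string text = Encoding.ASCII.GetString(asciiBytes);

        // Drop embedded null characters, as suggested above.
        return text.Replace("\0", string.Empty);
    }
}
```

Note that this substitutes '?' for unrepresentable characters instead of removing them; if you want them gone, filter the string afterwards.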
 
ptmcomp commented:
string result = Regex.Replace(input, @"[^\x20-\x7e]", "");  // removes all chars that are not in the range 0x20 - 0x7e
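The regex above removes characters outright. For the "replace with ASCII equivalent" part of the question, one common approach is to decompose accented letters with Unicode normalization, drop the combining marks, and only then strip whatever is still outside the printable range. This is a sketch of that technique, not something proposed in the thread; the names AsciiFolder and Fold are made up for illustration.

```csharp
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

public static class AsciiFolder
{
    // Folds accented letters to their base letter where Unicode defines a
    // canonical decomposition, then removes anything outside 0x20..0x7E.
    public static string Fold(string input)
    {
        // FormD splits e.g. "é" into "e" followed by a combining accent.
        string decomposed = input.Normalize(NormalizationForm.FormD);

        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Skip the combining marks produced by the decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Same pattern as above: drop everything that is still not printable ASCII.
        return Regex.Replace(sb.ToString(), @"[^\x20-\x7e]", "");
    }
}
```

Characters without a canonical decomposition (for example "ß" or "™") are simply removed by the final regex; mapping those would need an explicit lookup table of your own.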
 
fpoyavo (author) commented:
ptmcomp,

You are good. How about using regex to replace any special character with \\ followed by that character?
Example:  $    to   \\$

Thank you.
 
ptmcomp commented:
I'm not sure if I understand your question.
If you want to replace "$" by "\\$" and let's say "@" by "\\@" then you could use this:  
string result = Regex.Replace(input, @"[$@]", @"\\$0");
Note: "$0" stands for the string matched by the expression
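A small runnable sketch of that replacement follows. The wrapper names (SpecialCharEscaper, Escape) are hypothetical, and the character class [$@] is just the two example characters; extend it to whichever characters you treat as special.

```csharp
using System.Text.RegularExpressions;

public static class SpecialCharEscaper
{
    // Prefixes each '$' or '@' with two backslashes.
    // In a .NET replacement pattern the backslash is a literal character,
    // so @"\\$0" means: two backslashes, then the matched text ($0).
    public static string Escape(string input)
    {
        return Regex.Replace(input, @"[$@]", @"\\$0");
    }
}
```

If only a single backslash is wanted in the output, the replacement pattern would be @"\$0" instead.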
Question has a verified solution.