• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 15771
  • Last Modified:

Find and Replace (or Convert) Non-Ascii Characters In A String

Hi everyone.  I'm definately not a VB.NET or developer pro, but I wanted to know if anyone has a function that removes non-ascii characters from a string/CSV readline.  

I'm trying to find something that finds non-ascii characters and replace them with a space or attempt to convert them to ascii (I don't know if this is possible).
0
endrec
Asked:
endrec
  • 3
1 Solution
 
Fernando SotoRetiredCommented:
Hi endrec;

When you say non-ascii characters what do you mean?

For example the ASCII character set defines all characters between &H0 through &HFF, 0 - 255 decimal, but not all are printable. Which characters do you want to replace?

Fernando
0
 
Fernando SotoRetiredCommented:
Hi endrec;

This example code will remove all non printable characters from the input string and replace them with a space character.

Imports System.Text.RegularExpressions

    ' Class level variable
    Private re As New Regex("[\x00-\x1F\x7F-\xFF]+", RegexOptions.Compiled)


    Dim input As String ' String that will be striped of all non printable characters.
    input = re.Replace(input, " ")

Input string should now have only printable characters in it.

Fernando
0
 
endrecAuthor Commented:
How would I remove non-standard ASCII characters (e.g. any of those characters in the extended ascii set and any non-printable characters)?
0
 
Fernando SotoRetiredCommented:
Hi endrec;

The above sample code will do that already. The Regex string pattern, "[\x00-\x1F\x7F-\xFF]+", does the following.

Where:
    [ ... ]        Mark a character class and will match any single character in the class.
    +             Quantifier, matches 1 or more of the previous character
    \x00-\x1F A Hex range of valid characters in the class. This is all characters from the begining of the ASCII set to
                  the 31st character which are all control characters.
    \x7F-\xFF Range of characters which are the ASCII extended character set.

This statement :

    input = re.Replace(input, " ")

Will take the string input and replace any of the characters found in the Regex pattern and replace it with a space character.

Fernando
0

Featured Post

2018 Annual Membership Survey

Here at Experts Exchange, we strive to give members the best experience. Help us improve the site by taking this survey today! (Bonus: Be entered to win a great tech prize for participating!)

  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now