Solved

Find and Replace (or Convert) Non-Ascii Characters In A String

Posted on 2006-07-18
Last Modified: 2008-07-17
Hi everyone.  I'm definitely not a VB.NET or development pro, but I wanted to know if anyone has a function that removes non-ASCII characters from a string (for example, a line read from a CSV file with ReadLine).

I'm trying to find something that finds non-ASCII characters and replaces them with a space, or attempts to convert them to ASCII (I don't know if this is possible).
Question by:endrec

Expert Comment

by:Fernando Soto
ID: 17131074
Hi endrec;

When you say non-ascii characters what do you mean?

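Strictly speaking, the ASCII character set defines only the codes &H00 through &H7F (0 - 127 decimal); the values &H80 through &HFF (128 - 255 decimal) come from 8-bit "extended ASCII" code pages. Not all of these characters are printable. Which characters do you want to replace?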

Fernando

Expert Comment

by:Fernando Soto
ID: 17131543
Hi endrec;

This example code finds all non-printable characters in the input string and replaces them with a space character.

Imports System.Text.RegularExpressions

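    ' Class-level variable: matches runs of control characters (&H00-&H1F)
    ' and of characters in the range &H7F-&HFF.
    Private re As New Regex("[\x00-\x1F\x7F-\xFF]+", RegexOptions.Compiled)

    ' Inside a method:
    Dim input As String ' String that will be stripped of all non-printable characters.
    input = re.Replace(input, " ")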

The input string should now contain only printable ASCII characters, provided it held no characters above &HFF (the pattern does not cover those).
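For lines read from a CSV file, the same pattern can be wrapped in a small helper. The following is only a sketch: the module name AsciiCleaner and the function name CleanLine are made up for illustration, and the sample string is an assumption, not data from the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AsciiCleaner
    ' Same pattern as above: control characters (&H00-&H1F) and &H7F-&HFF.
    Private ReadOnly re As New Regex("[\x00-\x1F\x7F-\xFF]+", RegexOptions.Compiled)

    ' Hypothetical helper: returns the line with each matched run
    ' replaced by a single space.
    Public Function CleanLine(ByVal line As String) As String
        If line Is Nothing Then Return String.Empty
        Return re.Replace(line, " ")
    End Function

    Sub Main()
        ' Sample input containing a tab and an e-acute (&HE9).
        Dim sample As String = "abc" & vbTab & "def" & ChrW(&HE9) & "ghi"
        Console.WriteLine(CleanLine(sample)) ' abc def ghi
    End Sub
End Module
```

Calling CleanLine on each string returned by ReadLine keeps the Regex object in one place, so it is compiled only once.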

Fernando

Author Comment

by:endrec
ID: 17134214
How would I remove non-standard ASCII characters (e.g. any of the characters in the extended ASCII set, plus any non-printable characters)?

Accepted Solution

by:Fernando Soto
ID: 17138119
Hi endrec;

The above sample code will do that already. The Regex string pattern, "[\x00-\x1F\x7F-\xFF]+", does the following.

Where:
    [ ... ]      Marks a character class, which matches any single character in the class.
    +            Quantifier; matches one or more of the preceding element (here, the character class).
    \x00-\x1F    A hex range of characters in the class: everything from the beginning of the ASCII set
                 through code 31 (&H1F), all of which are control characters.
    \x7F-\xFF    A second hex range: the DEL control character (&H7F) followed by the extended
                 character set (&H80 through &HFF).
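To see which characters the class covers, a quick check along these lines can help. It is a sketch only: the module name PatternDemo and the sample code points are chosen for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PatternDemo
    Sub Main()
        Dim re As New Regex("[\x00-\x1F\x7F-\xFF]+")

        ' Sample code points: tab, "A", "~", DEL, e-acute, euro sign.
        Dim codes() As Integer = {&H9, &H41, &H7E, &H7F, &HE9, &H20AC}

        For Each code As Integer In codes
            ' Test each character on its own against the pattern.
            Dim matched As Boolean = re.IsMatch(ChrW(code).ToString())
            Console.WriteLine("U+{0:X4} matched: {1}", code, matched)
        Next
    End Sub
End Module
```

The tab, DEL and e-acute fall inside the class, while "A" and "~" do not. The euro sign (U+20AC) is not matched either, because the class stops at &HFF.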

This statement:

    input = re.Replace(input, " ")

takes the string input and replaces each run of one or more characters matching the Regex pattern with a single space character.
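.NET strings are Unicode, so a string can also hold characters above &HFF, which the pattern above leaves untouched. One possible variation, not part of the code above, is to negate the printable ASCII range instead, and to decompose accented letters first so that their base letter survives (the "attempt to convert" case from the question). This is a sketch under assumptions: the names AsciiApprox and ToPrintableAscii are hypothetical, and String.Normalize requires .NET 2.0 or later.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text
Imports System.Text.RegularExpressions

Module AsciiApprox
    ' Anything outside printable ASCII (&H20-&H7E), including characters above &HFF.
    Private ReadOnly nonPrintableAscii As New Regex("[^\x20-\x7E]+", RegexOptions.Compiled)

    ' Hypothetical helper: approximate accented letters, blank out the rest.
    Public Function ToPrintableAscii(ByVal input As String) As String
        ' Decompose accented letters into base letter + combining mark(s).
        Dim decomposed As String = input.Normalize(NormalizationForm.FormD)

        Dim sb As New StringBuilder(decomposed.Length)
        For Each c As Char In decomposed
            ' Drop the combining marks, keep everything else.
            If CharUnicodeInfo.GetUnicodeCategory(c) <> UnicodeCategory.NonSpacingMark Then
                sb.Append(c)
            End If
        Next

        ' Replace each remaining run of non-printable or non-ASCII characters with one space.
        Return nonPrintableAscii.Replace(sb.ToString(), " ")
    End Function

    Sub Main()
        ' "Caf" + e-acute + space + euro sign + "5"
        Console.WriteLine(ToPrintableAscii("Caf" & ChrW(&HE9) & " " & ChrW(&H20AC) & "5"))
    End Sub
End Module
```

With this approach the e-acute becomes a plain "e", while the euro sign, which has no ASCII base letter, is replaced by a space. Characters that do not decompose (for example "ß" or "ø") are also replaced by a space, so this is an approximation rather than a full transliteration.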

Fernando