
REGEX: Finding and replacing high-order bit characters

Which REGEX expression would find and replace high order bit characters with a text character value?

Thank you.
 
b0lsc0tt Commented:
What characters do you mean by "high order bit"?  What text characters would you replace them with?  What engine (i.e. language or technology) are you using for the regular expression?

bol
 
RichardKline (Author) Commented:
I'm trying to capture non-text characters which are created in Word documents and then pasted into a web page. For example, type the phrase "Example" in Word and note the funky character with which Word replaces your double quote.

Attempts to save that character and others like it, as pasted, into a SQL Server database produce an error. I have no reason to keep those types of characters and would prefer to substitute them with something in the range below ASCII 127.

In the example above, I would replace it with the double-quote text character. Other similar characters I would replace with the nearest text equivalent or with a simple space.

I'm using .NET Framework 1.1 and 2.0 (VB).

Thank you.
 
b0lsc0tt Commented:
Thanks for clarifying.  That really helped.  I am not a .NET expert so it may have some built-in function that will make this a little easier.  I doubt it though.  The easy part is getting rid of the invalid characters.

The hard part will be the characters you want to replace.  I am afraid you will need to identify each character to replace and then specify the replacement to use.  If you can provide a list (you already mentioned two characters, the open and close double quotes), then I can help build this.  However, this part won't need an expression.  Using VB.NET's Replace function (or its equivalent) will be best.
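A minimal sketch of that replacement list, assuming the pasted text arrives as a normal .NET (Unicode) string. The module and function names are hypothetical, and the characters listed are only the ones Word most commonly substitutes; extend the list to suit your data. It avoids generics so it should compile on both 1.1 and 2.0.

```vbnet
Imports System.Text

Module WordCharacterCleaner
    ' Hypothetical helper: maps common Word "smart" characters to plain ASCII.
    Function ReplaceWordCharacters(ByVal input As String) As String
        Dim sb As New StringBuilder(input)
        sb.Replace(ChrW(&H201C), """"c)                 ' left double quote
        sb.Replace(ChrW(&H201D), """"c)                 ' right double quote
        sb.Replace(ChrW(&H2018), "'"c)                  ' left single quote
        sb.Replace(ChrW(&H2019), "'"c)                  ' right single quote / apostrophe
        sb.Replace(ChrW(&H2013), "-"c)                  ' en dash
        sb.Replace(ChrW(&H2014), "-"c)                  ' em dash
        sb.Replace(ChrW(&H2026).ToString(), "...")      ' ellipsis becomes three periods
        sb.Replace(ChrW(&HA0), " "c)                    ' non-breaking space
        Return sb.ToString()
    End Function
End Module
```

Calling ReplaceWordCharacters on the pasted text before saving it leaves ordinary ASCII untouched and only swaps the listed characters.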

An expression will be helpful for those other invalid characters that you wish to just delete.  Although I am not a VB.NET expert, I know the expression below is good and you can probably use it like ...

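' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Dim ResultString As String
Try
      ' Delete everything outside tab (U+0009) through tilde (U+007E).
      ' \u takes four hex digits, so \u0127 would be U+0127 (decimal 295), not 127.
      ResultString = Regex.Replace(SubjectString, "[^\u0009-\u007E]", "")
Catch ex As ArgumentException
      'Syntax error in the regular expression
End Try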

You would do this AFTER you replaced the characters you wanted to change.  It will delete all of the other "high order" characters so you don't want to run this first.
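To illustrate why the order matters, here is a small sketch that does both steps in sequence. The names are hypothetical, only the two double quotes are substituted to keep it short, and the upper bound \u007E (tilde) is my assumption for "below ASCII 127".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CleanupOrder
    ' Hypothetical helper showing the order of the two steps.
    Function ToPlainAscii(ByVal input As String) As String
        ' Step 1: substitute the characters that have a sensible equivalent.
        Dim result As String = input.Replace(ChrW(&H201C), """"c).Replace(ChrW(&H201D), """"c)
        ' Step 2: delete whatever is still outside tab..tilde.
        Return Regex.Replace(result, "[^\u0009-\u007E]", "")
    End Function

    Sub Main()
        ' Curly quotes around the word, followed by a copyright sign.
        Dim pasted As String = ChrW(&H201C) & "Example" & ChrW(&H201D) & ChrW(&HA9)
        ' The quotes come out as straight quotes; the copyright sign is deleted.
        Console.WriteLine(ToPlainAscii(pasted))
    End Sub
End Module
```

If step 2 ran first, the curly quotes would be deleted before step 1 had a chance to turn them into straight quotes.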

Let me know how this helps or if you have a question.

bol
 
b0lsc0tt Commented:
As for replacing the open and close double quotes Word uses, you can use something like ...

str = Replace(Replace(str, ChrW(&H201C), Chr(34)), ChrW(&H201D), Chr(34))

The code above is based on my knowledge of vbscript so let me know if there is a problem using it in VB.NET and I can look into it.  Basically the open double quote is character 147 and the close double quote is character 148 in the Windows-1252 code page, which are U+201C and U+201D in the Unicode strings .NET uses.  The normal double quote is character 34.  Note that Chr and ChrW turn a character code into a character; Asc goes the other way, from a character to its code.

That code should at least give you an idea of how to do it and get you started. :)
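To build the list of characters that need a replacement, it can help to see which ones a pasted string actually contains. This is only a diagnostic sketch with hypothetical names; it prints the code point of every character above 127.

```vbnet
Imports System

Module CodePointLister
    ' Hypothetical diagnostic: print each character above 127 with its code point.
    Sub ListHighCharacters(ByVal input As String)
        For Each c As Char In input
            If AscW(c) > 127 Then
                ' AscW returns the Unicode code point; "X4" formats it as four hex digits.
                Console.WriteLine("U+" & AscW(c).ToString("X4") & "  " & c)
            End If
        Next
    End Sub
End Module
```

Each code point it reports can then be given its own Replace call, or left for the expression to delete.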

bol
 
RichardKline (Author) Commented:
Sir,
Your assistance is much appreciated!  This is exactly what I needed.
I've upped the point value to show my appreciation for your prompt and complete answer.
 
b0lsc0tt Commented:
You're welcome!  I'm glad I could help.  It was a very fun question.  Thanks for the grade and especially for the point increase.

Good luck with this and see you around. :)

bol