
RegEx to remove invalid XML characters

I am storing data collected on a website as XML in MS SQL 2005.

Some characters are not allowed, and I receive "XML parsing: line 3, character 27, illegal xml character".
I believe what happens is that people type up information in MS Word or a similar program that uses extended characters such as curly quotes. When I find a new illegal character I try to replace it, but handling each illegal character individually seems clunky. I'm assuming there has to be a RegEx solution here.

Can anyone give me a VB.NET example of a RegEx solution that will strip out all illegal XML characters as defined by MS SQL 2005?

1 Solution
Here is the exact link, which talks about how to parse and remove invalid characters in an XML file using C#. Maybe you can convert this into VB.NET using a conversion tool.

dbashley1 (Author) commented:

Parsing an error message isn't really the solution I'm looking for. The error message could change with the next version of .NET or in other scenarios that I'd be trying to anticipate.

I'd really like a RegEx solution if that is possible.

Balaji Ramesh has published code that does exactly this. You can get the regular expression that he used:
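The expression referenced in this thread is not reproduced here, so the following is only a minimal sketch of the general approach, not the cited code. It assumes "legal" means the `Char` production of the XML 1.0 specification (tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, and U+10000 to U+10FFFF); whether SQL Server 2005 applies exactly that set is an assumption. It is written in Python for brevity, and the name `strip_invalid_xml_chars` is hypothetical. A .NET `Regex` works on UTF-16 code units, so a VB.NET port would probably need to handle surrogate pairs differently.

```python
import re

# Negated character class: matches any character that is NOT allowed by the
# XML 1.0 "Char" production (assumed here to be the relevant definition).
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every character outside the XML 1.0 Char set removed."""
    return _INVALID_XML_CHARS.sub("", text)


if __name__ == "__main__":
    sample = "Smart \u201cquotes\u201d stay, control chars\x00\x0b go."
    print(strip_invalid_xml_chars(sample))
```

Note that curly quotes such as U+201C are inside the allowed ranges, so this pattern leaves them alone; it only removes characters like NUL or vertical tab. If curly quotes still trigger the error, an encoding mismatch between the page and the database may be worth checking.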