  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1089

Regular Expression to match Xml tags

I am writing a WebDAV query which needs to use a property whose name is in hex (i.e. begins with 0x).

XML element names cannot start with a digit, so this is not allowed.  Therefore I need to replace the 0x with x and vice versa as the information goes in and out of an XML document.

I am looking for a regular expression that will do the replacement of 0x to x.  However it must only match the elements in hex, not data.

So basically

<proptag:0x823D0003>5</proptag:0x823D0003>
would need to become
<proptag:x823D0003>5</proptag:x823D0003>

<mapirecurring:0x00008223 dt:dt="boolean">1</mapirecurring:0x00008223>
would need to become
<mapirecurring:x00008223 dt:dt="boolean">1</mapirecurring:x00008223>

Note that the namespace prefix can vary, the actual hex value can vary, and there could be additional attributes like in the above dt:dt="boolean"

A regular expression that could be used to do this replacement (ideally one that will work in ASP.NET with C#) would be highly appreciated.  I have linked to this question from the C# area as well.

Asked by: mrichmon

1 Solution
 
Harisha M G commented:
Hi,

Find: "(</?\w+:)0(x\d+)"
Replace with: $1$2

C#:

Regex.Replace(yourString, @"(</?\w+:)0(x\d+)", "$1$2",
    RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline);


Hope that helps


---
Harish

mrichmon (Author) commented:
Thanks for the comments, but it doesn't work at all.  There are several problems, one of which I could have told you without even testing (but I did test):

1) You only look for digits after the 0x, but it is a hex number which means that there could be 0-9 or A-F or a-f
2) Even assuming that my hex number was all digits it doesn't work.

I tested like this:
string test = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
Response.Write("Regex results: " + Regex.Replace("test", @"(</?\w+:)0(x\d+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

And the result was:

Regex results: 1

So basically it turned my whole string into the number 1.  Not good.

Also can you explain the $1$2 notation?  I am guessing the problem is there, but don't know what it is to test.

Harisha M G commented:
$1$2 is same as \1\2

If you put "test" inside quotes, what should it search? Pass the variable test, without quotes:

Response.Write("Regex results: " + Regex.Replace(test, @"(</?\w+:)0(x[\da-f]+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

Tested
<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
<mapirecurring:x00008223 dt:dt=\"boolean\">1</mapirecurring:x00008223>

Also note the changed regex.
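
The replacement above, together with the reverse direction the question asks for ("vice versa"), could be sketched roughly as follows. The class and method names are hypothetical, and the lookahead `(?=[\s/>])` is an addition that is not in the thread: it keeps the match from stopping in the middle of a longer element name. The reverse direction assumes that no genuine element name in the document starts with x followed only by hex digits, since such a name would also get a 0 prepended.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative, not from the thread.
public static class HexTagRewriter
{
    // "<prefix:0xHEX" or "</prefix:0xHEX", where the name ends at whitespace, "/" or ">".
    private static readonly Regex ToSafe = new Regex(
        @"(</?\w+:)0(x[\da-f]+)(?=[\s/>])", RegexOptions.IgnoreCase);

    // Same shape without the leading 0, used when reading the XML back out.
    private static readonly Regex FromSafe = new Regex(
        @"(</?\w+:)(x[\da-f]+)(?=[\s/>])", RegexOptions.IgnoreCase);

    // 0x823D0003 -> x823D0003 in element names only.
    public static string ToXmlSafe(string xml)
    {
        return ToSafe.Replace(xml, "$1$2");
    }

    // x823D0003 -> 0x823D0003; ${1} keeps the literal 0 from being read as part of the group number.
    public static string FromXmlSafe(string xml)
    {
        return FromSafe.Replace(xml, "${1}0$2");
    }

    public static void Main()
    {
        string original = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
        string safe = ToXmlSafe(original);
        Console.WriteLine(safe);
        Console.WriteLine(FromXmlSafe(safe) == original);
    }
}
```

RegexOptions.Multiline and RegexOptions.Singleline are left out here because they only change how ^, $ and . behave, and this pattern uses none of them.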
 
mrichmon (Author) commented:
If you put test in quotes it would search the string "test" - which is not what I want, but that was just a typing error when posting here to forum.

I double checked and my code is correct - no quotes around test.  test was the variable containing the string as I showed.

I still don't understand this: $1$2 is same as \1\2

Can you explain what it does?

However, I just tried your new expression and it still doesn't work - which doesn't surprise me, since you only accounted for the a-f - which I knew how to do, but not the fundamental problem.

<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
1

<mapirecurring:0x00008223 dt:dt=\"boolean\">Joe</mapirecurring:0x00008223>
Returns
Joe

So you are basically stripping out my entire xml tags and getting only the inner text.

Harisha M G commented:
$1 means the first captured group, and $2 means the second.

Try \1\2 instead of $1$2 and see whether that corrects the problem

Harisha M G commented:
A captured group is the part of the match that occurs inside a pair of parentheses.

So,

$1 = \1 = (</?\w+:)

And

$2 = \2 = (x[\da-f]+)
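
To see what each group holds, the captures can be inspected directly. This is a minimal sketch; GroupDemo and Capture are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original answer.
public static class GroupDemo
{
    // Returns { group 1, group 2 } for the first match, or an empty array if nothing matches.
    public static string[] Capture(string tag)
    {
        Match m = Regex.Match(tag, @"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return new string[0];
        }
        // Groups[0] is the whole match; numbered groups start at 1.
        return new[] { m.Groups[1].Value, m.Groups[2].Value };
    }

    public static void Main()
    {
        string[] parts = Capture("<proptag:0x823D0003>");
        Console.WriteLine("group 1: " + parts[0]);
        Console.WriteLine("group 2: " + parts[1]);
    }
}
```

The 0 sits between the two groups and belongs to neither, which is why writing back only $1$2 drops it.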

mrichmon (Author) commented:
When I try \1\2 I get a compilation error:  Compiler Error Message: CS1009: Unrecognized escape sequence

If I escape it as "\\1\\2" or even @"\1\2", I get:

\1\2 dt:dt="boolean">Joe\1\2>
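
That output is what .NET is expected to produce: in a .NET replacement string, groups are referenced as $1, $2 (or ${1}), while a backslash is just a literal character, so \1\2 is written out verbatim. The \1 form is backreference syntax for use inside the pattern itself. The CS1009 error is the C# compiler rejecting \1 as an escape in a regular string literal, before the regex engine is involved at all. A small sketch of the difference, with hypothetical names:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing the two replacement notations.
public static class ReplacementSyntaxDemo
{
    private const string Pattern = @"(</?\w+:)0(x[\da-f]+)";

    // $1$2 substitutes the captured groups.
    public static string WithDollar(string input)
    {
        return Regex.Replace(input, Pattern, "$1$2", RegexOptions.IgnoreCase);
    }

    // \1\2 is inserted as literal text in a .NET replacement string.
    public static string WithBackslash(string input)
    {
        return Regex.Replace(input, Pattern, @"\1\2", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "<mapirecurring:0x00008223 dt:dt=\"boolean\">Joe</mapirecurring:0x00008223>";
        Console.WriteLine(WithDollar(input));
        Console.WriteLine(WithBackslash(input));
    }
}
```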

Harisha M G commented:
http://www.fileformat.info/tool/regex.htm

Enter the appropriate values and see whether it works for your various inputs. I am not that good with ASP.NET (although I know C#).

mrichmon (Author) commented:
Okay figured out the problem.  It was returning the correct results, just not displaying them to the screen.  My fault.  So I think it should work.

Thanks.

Harisha M G commented:
Why "B"? Did it not solve your problem?

Anyways, glad to help

mrichmon (Author) commented:
Whoops, meant to hit A.

I will fix it.

Harisha M G commented:
Hey, didn't know you are a mod/page editor!!

Which TA?

mrichmon (Author) commented:
I'm the primary PE in a bunch of small ones... Microsoft Project, EAI, SAP, etc

Harisha M G commented:
Ah, I see :)

Thanks for the grade correction!