Solved

Regular Expression to match Xml tags

Posted on 2006-06-27
Last Modified: 2013-11-19
I am writing a WebDAV query that needs to use a property whose name is in hex (i.e. begins with 0x).

This is not allowed in XML element names, which cannot begin with a digit.  Therefore I need to replace the 0x with x, and vice versa, as the information goes in and out of an XML document.

I am looking for a regular expression that will do the replacement of 0x to x.  However it must only match the elements in hex, not data.

So basically

<proptag:0x823D0003>5</proptag:0x823D0003>
would need to become
<proptag:x823D0003>5</proptag:x823D0003>

<mapirecurring:0x00008223 dt:dt="boolean">1</mapirecurring:0x00008223>
would need to become
<mapirecurring:x00008223 dt:dt="boolean">1</mapirecurring:x00008223>

Note that the namespace prefix can vary, the actual hex value can vary, and there could be additional attributes like in the above dt:dt="boolean"

A regular expression that could be used to do this replacement (ideally one that works in ASP.NET with C#) would be highly appreciated.  I have linked to this question from the C# area as well.
Question by:mrichmon

Expert Comment

by:Harisha M G
ID: 16996764
Hi,

Find: "(</?\w+:)0(x\d+)"
Replace with: $1$2

C#:

Regex.Replace(yourString, @"(</?\w+:)0(x\d+)", "$1$2",
    RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline);


Hope that helps


---
Harish

Author Comment

by:mrichmon
ID: 17005745
Thanks for the comments, but it doesn't work at all.  There are several problems, one of which I could have told you without even testing (but I did test).

1) You only look for digits after the 0x, but it is a hex number, which means there could be 0-9, A-F, or a-f.
2) Even assuming that my hex number was all digits, it doesn't work.

I tested like this:
string test = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
Response.Write("Regex results: " + Regex.Replace("test", @"(</?\w+:)0(x\d+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

And the result was:

Regex results: 1

So basically it turned my whole string into the number 1.  Not good.

Also, can you explain the $1$2 notation?  I am guessing the problem is there, but I don't know what it is, so I can't test it.

Accepted Solution

by:
Harisha M G earned 500 total points
ID: 17005785
$1$2 is same as \1\2

If you put "test" inside quotes, what should it search?  Pass the variable itself:

Response.Write("Regex results: " + Regex.Replace(test, @"(</?\w+:)0(x[\da-f]+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

Tested
<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
<mapirecurring:x00008223 dt:dt=\"boolean\">1</mapirecurring:x00008223>

Also note the changed regex.
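
For readers who want both directions the question asks for, here is a minimal sketch built on the accepted pattern. The class and method names (HexTagNames, Encode, Decode) are hypothetical. The reverse pattern assumes that every such name is x followed by exactly 8 hex digits, as in the question's examples; without some restriction like that, it would also rewrite any ordinary element whose name happens to start with x and hex characters. Multiline and Singleline are left out because the pattern contains no ^, $, or dot for them to affect.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HexTagNames
{
    // "<prefix:" or "</prefix:", then 0x and hex digits.
    // Group 1 keeps the tag opening, group 2 keeps "x...", the 0 in between is dropped.
    private static readonly Regex ToXml =
        new Regex(@"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);

    // Reverse direction. Assumption: names are x plus exactly 8 hex digits,
    // followed by whitespace, "/" or ">".
    private static readonly Regex FromXml =
        new Regex(@"(</?\w+:)x([\da-f]{8})(?=[\s/>])", RegexOptions.IgnoreCase);

    public static string Encode(string xml)
    {
        return ToXml.Replace(xml, "$1$2");
    }

    public static string Decode(string xml)
    {
        // ${1} is braced so the literal 0 after it is not read as part of the group number.
        return FromXml.Replace(xml, "${1}0x$2");
    }

    public static void Main()
    {
        // The element text deliberately contains 0x1F to show that data is left alone.
        string input = "<mapirecurring:0x00008223 dt:dt=\"boolean\">0x1F</mapirecurring:0x00008223>";
        string encoded = Encode(input);
        Console.WriteLine(encoded);
        Console.WriteLine(Decode(encoded) == input);
    }
}
```

Because the pattern requires a leading < (or </) and a prefix ending in a colon, hex-looking values in element text or attributes are not touched. Note that \w does not match hyphens or dots, so prefixes containing those characters would need a wider character class.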

Author Comment

by:mrichmon
ID: 17010472
If you put test in quotes it would search the string "test", which is not what I want, but that was just a typing error when posting here to the forum.

I double checked and my code is correct - no quotes around test.  test was the variable containing the string as I showed.

I still don't understand this: $1$2 is same as \1\2

Can you explain what it does?

However, I just tried your new expression and it still doesn't work, which doesn't surprise me since you only accounted for the a-f (which I knew how to do) but not the fundamental problem.

<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
1

<mapirecurring:0x00008223 dt:dt=\"boolean\">Joe</mapirecurring:0x00008223>
Returns
Joe

So you are basically stripping out my entire xml tags and getting only the inner text.

Expert Comment

by:Harisha M G
ID: 17013703
$1 means the first captured group, and $2 means the second.

Try \1\2 instead of $1$2 and see whether that corrects the problem

Expert Comment

by:Harisha M G
ID: 17013722
A captured group is the part of the match that falls inside a pair of parentheses.

So,

$1 = \1 = (</?\w+:)

And

$2 = \2 = (x[\da-f]+)
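
To make the groups concrete, here is a small sketch that prints what each one captures for the first example tag in the question. The class and method names (GroupDemo, FirstTag) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns the first match of the accepted pattern in the given XML text.
    public static Match FirstTag(string xml)
    {
        return Regex.Match(xml, @"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Match m = FirstTag("<proptag:0x823D0003>5</proptag:0x823D0003>");
        Console.WriteLine(m.Value);            // whole match: <proptag:0x823D0003
        Console.WriteLine(m.Groups[1].Value);  // group 1:     <proptag:
        Console.WriteLine(m.Groups[2].Value);  // group 2:     x823D0003
    }
}
```

The replacement "$1$2" writes group 1 followed by group 2, so the 0 that sat between them is the only thing removed.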

Author Comment

by:mrichmon
ID: 17013805
When I try \1\2 I get a compilation error:  Compiler Error Message: CS1009: Unrecognized escape sequence

If I escape it as "\\1\\2" or even @"\1\2", I get:

\1\2 dt:dt="boolean">Joe\1\2>
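
That output is what .NET is expected to produce: in a .NET replacement string, groups are referenced as $1, $2 (or ${1}), while a backslash is not special, so \1\2 is inserted literally. The \1 form is a backreference inside the pattern itself, and it is the replacement syntax in some other regex flavors. A short sketch of the difference, with a made-up helper name (Rewrite) and a shortened sample tag:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplacementDemo
{
    // Applies the accepted pattern with whatever replacement string is supplied.
    public static string Rewrite(string xml, string replacement)
    {
        return Regex.Replace(xml, @"(</?\w+:)0(x[\da-f]+)", replacement, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "<a:0x1F>Joe</a:0x1F>";
        Console.WriteLine(Rewrite(input, "$1$2"));   // <a:x1F>Joe</a:x1F>
        Console.WriteLine(Rewrite(input, @"\1\2"));  // \1\2>Joe\1\2>
    }
}
```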

Expert Comment

by:Harisha M G
ID: 17014048
http://www.fileformat.info/tool/regex.htm

Put in the appropriate values and see whether it works for your various inputs.  I am not that good at ASP.NET (though I do know C#).

Author Comment

by:mrichmon
ID: 17014683
Okay figured out the problem.  It was returning the correct results, just not displaying them to the screen.  My fault.  So I think it should work.

Thanks.
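
The thread does not say why the result was not visible. One explanation consistent with the outputs reported earlier (only 1 or Joe showing) is that the string was written to the page unencoded, so the browser treated the tags as markup and rendered only the inner text. Under that assumption, HTML-encoding the result before writing it makes the tags visible. The sketch below uses WebUtility.HtmlEncode so that it runs as a console program; the class and method names (DisplayDemo, ForBrowser) are hypothetical.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class DisplayDemo
{
    // Runs the replacement, then encodes the result so a browser shows the tags as text.
    public static string ForBrowser(string xml)
    {
        string result = Regex.Replace(xml, @"(</?\w+:)0(x[\da-f]+)", "$1$2", RegexOptions.IgnoreCase);
        return WebUtility.HtmlEncode(result);  // "<" becomes "&lt;", ">" becomes "&gt;"
    }

    public static void Main()
    {
        string test = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
        // In an ASP.NET page the equivalent would be something like
        // Response.Write(Server.HtmlEncode(result)).
        Console.WriteLine(ForBrowser(test));
    }
}
```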

Expert Comment

by:Harisha M G
ID: 17014688
Why "B" ? Did it not solve your problem ?

Anyways, glad to help

Author Comment

by:mrichmon
ID: 17014700
Whoops. meant to hit A.

I will fix it.

Expert Comment

by:Harisha M G
ID: 17014729
Hey, didn't know you are a mod/page editor  !!

Which TA ?

Author Comment

by:mrichmon
ID: 17014781
I'm the primary PE in a bunch of small ones... Microsoft Project, EAI, SAP, etc

Expert Comment

by:Harisha M G
ID: 17014800
Ah, I see :)

Thanks for the grade correction !