Solved

Regular Expression to match Xml tags

Posted on 2006-06-27
Last Modified: 2013-11-19
I am writing a WebDAV query which needs to use a property whose name is in hex (i.e. it begins with 0x).

XML element names cannot begin with a digit, so this is not allowed.  Therefore I need to replace the 0x with x, and vice versa, as the information goes in and out of an XML document.

I am looking for a regular expression that will do the replacement of 0x to x.  However, it must only match the element names in hex, not the data.

So basically

<proptag:0x823D0003>5</proptag:0x823D0003>
would need to become
<proptag:x823D0003>5</proptag:x823D0003>

<mapirecurring:0x00008223 dt:dt="boolean">1</mapirecurring:0x00008223>
would need to become
<mapirecurring:x00008223 dt:dt="boolean">1</mapirecurring:x00008223>

Note that the namespace prefix can vary, the actual hex value can vary, and there could be additional attributes, like dt:dt="boolean" in the example above.

A regular expression that could be used to do this replacement (ideally one that will work in ASP.NET with C#) would be highly appreciated.  I have linked to this question from the C# area as well.
Question by:mrichmon
 
LVL 37

Expert Comment

by:Harisha M G
Hi,

Find: "(</?\w+:)0(x\d+)"
Replace with: $1$2

C#:

Regex.Replace(yourString, @"(</?\w+:)0(x\d+)", "$1$2",
    RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline);


Hope that helps


---
Harish
 
LVL 35

Author Comment

by:mrichmon
Thanks for the comments, but it doesn't work at all.  There are several problems, one of which I could have told you without even testing (but I did test):

1) You only look for digits after the 0x, but it is a hex number which means that there could be 0-9 or A-F or a-f
2) Even assuming that my hex number was all digits it doesn't work.

I tested like this:
string test = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
Response.Write("Regex results: " + Regex.Replace("test", @"(</?\w+:)0(x\d+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

And the result was:

Regex results: 1

So basically it turned my whole string into the number 1.  Not good.

Also, can you explain the $1$2 notation?  I am guessing the problem is there, but I don't know what it is, so I can't test it.
 
LVL 37

Accepted Solution

by: Harisha M G (earned 500 total points)
$1$2 is same as \1\2

If you put "test" inside quotes, what should it search?  Pass the variable test, without quotes:

Response.Write("Regex results: " + Regex.Replace(test, @"(</?\w+:)0(x[\da-f]+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

Tested
<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
<mapirecurring:x00008223 dt:dt=\"boolean\">1</mapirecurring:x00008223>

Also note the changed regex.
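
For reference, here is a minimal sketch of how the accepted pattern might be wrapped for use in both directions. The forward pattern is the one from this thread. The reverse pattern, the class name HexTagNames, and the method names StripZero and RestoreZero are assumptions added for illustration and were not posted by anyone above. The reverse replacement uses ${1}0$2 rather than $10$2 so the inserted literal 0 is not read as part of the group number.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative, not from the thread.
public static class HexTagNames
{
    // Pattern from the thread: group 1 is "<prefix:" or "</prefix:",
    // group 2 is "x" followed by hex digits. The literal 0 between them is dropped.
    private static readonly Regex ToXml = new Regex(
        @"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);

    // Assumed reverse pattern: the lookahead requires the hex run to be the
    // whole local name (followed by whitespace, "/" or ">").
    private static readonly Regex FromXml = new Regex(
        @"(</?\w+:)(x[\da-f]+)(?=[\s/>])", RegexOptions.IgnoreCase);

    // "<ns:0x1F>" becomes "<ns:x1F>"
    public static string StripZero(string xml)
    {
        return ToXml.Replace(xml, "$1$2");
    }

    // "<ns:x1F>" becomes "<ns:0x1F>"
    public static string RestoreZero(string xml)
    {
        return FromXml.Replace(xml, "${1}0$2");
    }
}
```

One caveat with the reverse direction: any prefixed element whose local name happens to be x plus hex digits (for example ns:xface) would also be rewritten, so this only fits documents where such names always came from a 0x property.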
 
LVL 35

Author Comment

by:mrichmon
If you put test in quotes it would search the string "test" - which is not what I want, but that was just a typing error when posting here to the forum.

I double checked and my code is correct - no quotes around test.  test was the variable containing the string as I showed.

I still don't understand this: $1$2 is same as \1\2

Can you explain what it does?

However, I just tried your new expression and it still doesn't work - which doesn't surprise me, since you only accounted for the a-f (which I knew how to do) but not the fundamental problem.

<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
1

<mapirecurring:0x00008223 dt:dt=\"boolean\">Joe</mapirecurring:0x00008223>
Returns
Joe

So you are basically stripping out my entire xml tags and getting only the inner text.
 
LVL 37

Expert Comment

by:Harisha M G
$1 means the first captured group, and $2 means the second.

Try \1\2 instead of $1$2 and see whether that corrects the problem
 
LVL 37

Expert Comment

by:Harisha M G
A captured group is the part of the match that falls inside a pair of parentheses.

So,

$1 = \1 = (</?\w+:)

And

$2 = \2 = (x[\da-f]+)
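
To see what each group actually captures, the groups can be inspected directly instead of being used in a replacement. The following is a small sketch; the class name GroupDemo and the method Capture are made up for illustration.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for inspecting captured groups.
public static class GroupDemo
{
    // Returns { whole match, group 1, group 2 } for the first match,
    // or an empty array when the pattern does not match at all.
    public static string[] Capture(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[0].Value, m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

With the pattern (</?\w+:)0(x[\da-f]+) and the input <proptag:0x823D0003>, group 1 is <proptag: and group 2 is x823D0003. With the earlier (</?\w+:)0(x\d+) pattern, group 2 stops at x823, and an input such as <proptag:0xA23D0003> does not match at all, because the first character after the x is not a decimal digit.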
 
LVL 35

Author Comment

by:mrichmon
When I try \1\2 I get a compilation error:  Compiler Error Message: CS1009: Unrecognized escape sequence

If I escape it as "\\1\\2" or even @"\1\2" I get:

\1\2 dt:dt="boolean">Joe\1\2>
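
The output above is consistent with how .NET treats replacement strings: groups are referenced with $1 or ${1}, while \1 is only a backreference inside the pattern itself and is copied literally when it appears in a replacement. A short sketch comparing the two (the class and method names are illustrative only):

```csharp
using System.Text.RegularExpressions;

// Hypothetical comparison of the two replacement notations in .NET.
public static class ReplacementSyntax
{
    private const string Pattern = @"(</?\w+:)0(x[\da-f]+)";

    // $1 and $2 are substituted with the captured groups.
    public static string WithDollar(string input)
    {
        return Regex.Replace(input, Pattern, "$1$2", RegexOptions.IgnoreCase);
    }

    // \1 and \2 have no special meaning in a .NET replacement string,
    // so the four characters \1\2 are inserted as-is.
    public static string WithBackslash(string input)
    {
        return Regex.Replace(input, Pattern, @"\1\2", RegexOptions.IgnoreCase);
    }
}
```

For the input <a:0x1F>Joe</a:0x1F>, WithDollar gives <a:x1F>Joe</a:x1F>, while WithBackslash gives \1\2>Joe\1\2>, the same shape as the result reported above.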
 
LVL 37

Expert Comment

by:Harisha M G
http://www.fileformat.info/tool/regex.htm

Put in appropriate values and see whether it works for your various inputs. I am not that good at ASP.NET (however, I know C#).
 
LVL 35

Author Comment

by:mrichmon
Okay figured out the problem.  It was returning the correct results, just not displaying them to the screen.  My fault.  So I think it should work.
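
The thread does not say exactly what the display problem was. One common cause of this symptom, offered here only as an assumption, is writing raw XML into an HTML page: the browser treats the unknown tags as markup and shows only the inner text (1 or Joe). Encoding the string before writing it makes the tags visible. A minimal sketch, with a made-up helper name:

```csharp
using System.Net;

// Hypothetical helper: makes an XML string safe to show inside an HTML page.
public static class DisplayHelper
{
    // Converts characters such as < > & and " into HTML entities so the
    // browser renders the tags as text instead of interpreting them.
    public static string ForHtml(string xml)
    {
        return WebUtility.HtmlEncode(xml);
    }
}
```

In an ASP.NET page the same effect can be had with Server.HtmlEncode, for example Response.Write(Server.HtmlEncode(result)), where result is whatever string Regex.Replace returned.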

Thanks.
 
LVL 37

Expert Comment

by:Harisha M G
Why "B" ? Did it not solve your problem ?

Anyways, glad to help
 
LVL 35

Author Comment

by:mrichmon
Whoops, meant to hit A.

I will fix it.
 
LVL 37

Expert Comment

by:Harisha M G
Hey, didn't know you are a mod/page editor  !!

Which TA ?
 
LVL 35

Author Comment

by:mrichmon
I'm the primary PE in a bunch of small ones... Microsoft Project, EAI, SAP, etc
 
LVL 37

Expert Comment

by:Harisha M G
Ah, I see :)

Thanks for the grade correction !