Solved

Regular Expression to match Xml tags

Posted on 2006-06-27
Last Modified: 2013-11-19
I am writing a WebDAV query which needs to use a property whose name is in hex (i.e. it begins with 0x).

This is not allowed in XML element names, which cannot begin with a digit. Therefore I need to replace the 0x with x, and vice versa, as the information goes in and out of an XML document.

I am looking for a regular expression that will do the replacement of 0x to x. However, it must only match the hex element names, not the data.

So basically

<proptag:0x823D0003>5</proptag:0x823D0003>
would need to become
<proptag:x823D0003>5</proptag:x823D0003>

<mapirecurring:0x00008223 dt:dt="boolean">1</mapirecurring:0x00008223>
would need to become
<mapirecurring:x00008223 dt:dt="boolean">1</mapirecurring:x00008223>

Note that the namespace prefix can vary, the actual hex value can vary, and there could be additional attributes, as with dt:dt="boolean" above.

A regular expression that could be used to do this replacement (ideally one that will work in ASP.NET with C#) would be highly appreciated. I have linked to this question from the C# area as well.
Question by:mrichmon
14 Comments
 
LVL 37

Expert Comment

by:Harisha M G
ID: 16996764
Hi,

Find: "(</?\w+:)0(x\d+)"
Replace with: $1$2

C#:

Regex.Replace(yourString, @"(</?\w+:)0(x\d+)", "$1$2",
    RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline);


Hope that helps


---
Harish
 
LVL 35

Author Comment

by:mrichmon
ID: 17005745
Thanks for the comments, but it doesn't work at all. There are several problems, one of which I could have told you without even testing (but I did test):

1) You only look for digits after the 0x, but it is a hex number, which means the characters could be 0-9, A-F, or a-f.
2) Even assuming that my hex number was all digits, it doesn't work.

I tested like this:
string test = "<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>";
Response.Write("Regex results: " + Regex.Replace("test", @"(</?\w+:)0(x\d+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

And the result was:

Regex results: 1

So basically it turned my whole string into the number 1.  Not good.

Also, can you explain the $1$2 notation? I am guessing the problem is there, but I don't know what it does, so I can't test it.
 
LVL 37

Accepted Solution

by:
Harisha M G earned 2000 total points
ID: 17005785
$1$2 is same as \1\2

If you put "test" inside quotes, what should it search? Pass the variable test without quotes, as in the first argument below:

Response.Write("Regex results: " + Regex.Replace(test, @"(</?\w+:)0(x[\da-f]+)", "$1$2", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Singleline));

Tested
<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
<mapirecurring:x00008223 dt:dt=\"boolean\">1</mapirecurring:x00008223>

Also note the changed regex.
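
For readers who also need the reverse direction mentioned in the question (x back to 0x), here is a minimal sketch built on the pattern from this answer. The class and method names (HexTagNames, Encode, Decode) are made up for illustration and do not come from the thread. The reverse pattern is an assumption: it treats any prefixed tag whose local name is x followed only by hex digits as an encoded name, so it would also rewrite a genuine tag such as <ns:xab> if one existed in the document.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; names are illustrative, not from the thread.
public static class HexTagNames
{
    // "<prefix:" or "</prefix:" followed by 0x and one or more hex digits.
    private static readonly Regex ToXmlSafe =
        new Regex(@"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);

    // Assumed reverse pattern: "<prefix:" or "</prefix:" followed by x and hex digits.
    private static readonly Regex FromXmlSafe =
        new Regex(@"(</?\w+:)(x[\da-f]+)", RegexOptions.IgnoreCase);

    // 0x823D0003 -> x823D0003 in element names only.
    public static string Encode(string xml)
    {
        return ToXmlSafe.Replace(xml, "$1$2");
    }

    // x823D0003 -> 0x823D0003. "${1}0" is used because "$10" would be
    // read as a reference to group 10 rather than group 1 followed by "0".
    public static string Decode(string xml)
    {
        return FromXmlSafe.Replace(xml, "${1}0$2");
    }
}
```

Calling HexTagNames.Encode on the mapirecurring example from the question leaves the dt:dt attribute and the inner text untouched, because the pattern only matches immediately after < or </.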
 
LVL 35

Author Comment

by:mrichmon
ID: 17010472
If you put test in quotes it would search the string "test", which is not what I want, but that was just a typing error when posting here to the forum.

I double checked and my code is correct - no quotes around test.  test was the variable containing the string as I showed.

I still don't understand this: $1$2 is same as \1\2

Can you explain what it does?

However, I just tried your new expression and it still doesn't work, which doesn't surprise me since you only accounted for the a-f (which I knew how to do) and not the fundamental problem.

<mapirecurring:0x00008223 dt:dt=\"boolean\">1</mapirecurring:0x00008223>
Returns
1

<mapirecurring:0x00008223 dt:dt=\"boolean\">Joe</mapirecurring:0x00008223>
Returns
Joe

So you are basically stripping out my entire xml tags and getting only the inner text.
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17013703
$1 means the first captured group, and $2 means the second.

Try \1\2 instead of $1$2 and see whether that corrects the problem
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17013722
A captured group is the part of the match that falls inside a pair of parentheses.

So,

$1 = \1 = (</?\w+:)

And

$2 = \2 = (x[\da-f]+)
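
To see what each group actually holds, a small sketch like the following can help. GroupDemo and Groups are hypothetical names chosen for this illustration. It returns the whole match (group 0) followed by the two captured groups for the first match in the input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the thread's code.
public static class GroupDemo
{
    // Returns { whole match, group 1, group 2 } for the first match.
    public static string[] Groups(string input)
    {
        Match m = Regex.Match(input, @"(</?\w+:)0(x[\da-f]+)", RegexOptions.IgnoreCase);
        return new[] { m.Groups[0].Value, m.Groups[1].Value, m.Groups[2].Value };
    }
}
```

For the proptag example from the question, group 1 is the tag opening up to the colon and group 2 is the x plus the hex digits. The 0 between them belongs to no group, which is why replacing the match with $1$2 drops it.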
 
LVL 35

Author Comment

by:mrichmon
ID: 17013805
When I try \1\2 I get a compilation error:  Compiler Error Message: CS1009: Unrecognized escape sequence

If I escape it as "\\1\\2" or even @"\1\2" I get:

\1\2 dt:dt="boolean">Joe\1\2>
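
The output above is what .NET is expected to produce: in a .NET replacement string only $ is special, so \1\2 is inserted literally, while \1 as a backreference is only meaningful inside the pattern itself. A minimal sketch comparing the two forms, with ReplacementSyntaxDemo and its method names invented for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the thread's code.
public static class ReplacementSyntaxDemo
{
    private const string Pattern = @"(</?\w+:)0(x[\da-f]+)";

    // "$1$2" substitutes the text captured by groups 1 and 2.
    public static string WithDollar(string s)
    {
        return Regex.Replace(s, Pattern, "$1$2", RegexOptions.IgnoreCase);
    }

    // @"\1\2" is copied into the result as the literal characters \1\2.
    public static string WithBackslash(string s)
    {
        return Regex.Replace(s, Pattern, @"\1\2", RegexOptions.IgnoreCase);
    }
}
```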
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17014048
http://www.fileformat.info/tool/regex.htm

Enter the pattern, replacement, and test strings there, and see whether it works for your various values. I am not that good at ASP.NET (although I know C#).
 
LVL 35

Author Comment

by:mrichmon
ID: 17014683
Okay figured out the problem.  It was returning the correct results, just not displaying them to the screen.  My fault.  So I think it should work.

Thanks.
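
The thread does not say exactly what the display problem was. One explanation consistent with the symptom (only the inner text, such as 1 or Joe, showing up) is that the browser treated the returned tags as markup and rendered only their contents. Under that assumption, HTML-encoding the string before writing it to the page makes the tags visible. DisplayDemo and ForBrowser are hypothetical names; WebUtility.HtmlEncode is the standard .NET method in System.Net.

```csharp
using System;
using System.Net;

// Hypothetical demo class; assumes the issue was unencoded output in a browser.
public static class DisplayDemo
{
    // Converts <, >, & and quotes to entities so the browser shows the tags as text.
    public static string ForBrowser(string xml)
    {
        return WebUtility.HtmlEncode(xml);
    }
}
```

In an ASP.NET page the encoded string could then be passed to Response.Write as in the test code earlier in the thread.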
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17014688
Why "B"? Did it not solve your problem?

Anyways, glad to help
 
LVL 35

Author Comment

by:mrichmon
ID: 17014700
Whoops, meant to hit A.

I will fix it.
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17014729
Hey, didn't know you are a mod/page editor!

Which TA ?
 
LVL 35

Author Comment

by:mrichmon
ID: 17014781
I'm the primary PE in a bunch of small ones... Microsoft Project, EAI, SAP, etc
 
LVL 37

Expert Comment

by:Harisha M G
ID: 17014800
Ah, I see :)

Thanks for the grade correction !