Solved

How can I determine whether a given string is a single HTML/XML token?

Posted on 2012-09-17
Last Modified: 2012-09-23
Hi Guys,

I am trying to confirm whether or not a string is in and of itself a single HTML or XML token. To match, it must start with '<', end with '>', and contain zero or more slash characters plus one or more additional characters. There must be no additional '<' or '>' characters within the string. Examples of matching strings:

<b>
<BR>
<html xmlns="http://www.w3.org/1999/xhtml" >

I have done this so far using the following regex:
@"^<\/*.>$"


Unfortunately, whilst this is generally OK, it does not correctly handle the edge cases where the string both starts and ends with a tag, e.g.:

<b></b>
<b>This text will be displayed in bold</b>

So, how can I cause the regex to match only a single tag, and return no match if there is more than one tag in the string?

I am not necessarily stuck on using regex if there is a better alternative suggestion...

Chris Bray
Question by:chrisbray
5 Comments
 
LVL 19

Expert Comment

by:Bardobrave
ID: 38405166
What about something like this?

@"^<\/*.>(<\/*.>)*$"

This way you add the possibility that there is an additional closing tag after your current configuration.
 
LVL 3

Author Comment

by:chrisbray
ID: 38406003
Hi Bardobrave,

That solves one of the edge cases, but not the other.  This still fails the test by returning a match when it shouldn't:

<b></b>

However, it does not provide a match for this one:

<BR><BR>

What makes it worse is that it breaks one of the working ones:

<BR>

This is a valid tag, but is reported as not matching when using your regex.

Chris Bray
 
LVL 3

Author Comment

by:chrisbray
ID: 38406080
I have found an issue in my starting regex, which should have a + to match one or more other characters. It was only matching a single character before, so <BR> and longer tags did not match. Here is the new regex, which works APART from the edge cases reported previously:

@"^<\/*.+>$"

To test for yourself if a proposed regex is working, these are the failing edge cases:

<b>test</b>
<b></b>
<BR><BR>

In each case the problem is that the string opens and closes with a tag.  If the string does not close with a tag it works fine.
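
A likely explanation for these edge cases is that `.` matches any character except a newline, including '<' and '>', so the greedy `.+` can run straight across the tag boundary in the middle of the string. The sketch below demonstrates this and, as an assumption that was not proposed in this thread, tries a negated character class `[^<>]+` in place of `.+`. The class name and the candidate pattern are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRegexDemo
{
    // Pattern from the comment above: '.' also matches '<' and '>'.
    public const string GreedyPattern = @"^<\/*.+>$";

    // Hypothetical alternative (not from the thread):
    // forbid '<' and '>' anywhere between the outer brackets.
    public const string CandidatePattern = @"^<[^<>]+>$";

    public static void Main()
    {
        string[] samples =
        {
            "<b>", "<BR>", "</b>", "<b>test</b>", "<b></b>", "<BR><BR>"
        };

        foreach (string s in samples)
        {
            bool greedy = Regex.IsMatch(s, GreedyPattern);
            bool candidate = Regex.IsMatch(s, CandidatePattern);
            Console.WriteLine($"{s,-12} greedy={greedy,-5} candidate={candidate}");
        }
    }
}
```

Slashes are still allowed because `[^<>]` excludes only the two bracket characters. One caveat: in .NET, `$` also matches just before a trailing newline, so `\z` may be a stricter choice for the end anchor.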

Chris Bray
 
LVL 3

Accepted Solution

by:chrisbray
ID: 38411466
In the end, I gave up on the regex and used string handling and a little LINQ to provide the answer:

return str.Length > 2 && str.StartsWith("<") && str.EndsWith(">")
    && str.Count(c => c == '<') == 1 && str.Count(c => c == '>') == 1;


This meets all the tests devised whilst being pretty quick in normal usage.  I hope that this is helpful to someone else faced with a similar issue.
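
For anyone wanting to try it, the one-line check above can be wrapped in a method and run against the cases discussed in this thread. This is a minimal sketch: the names `TagCheck` and `IsSingleTag` and the null guard are assumptions added for illustration, and `using System.Linq;` is needed for `Count` with a predicate.

```csharp
using System;
using System.Linq;

public static class TagCheck
{
    // True when str starts with '<', ends with '>', and contains exactly
    // one '<' and one '>' with at least one character between them.
    public static bool IsSingleTag(string str)
    {
        if (str == null) return false; // assumption: treat null as "not a tag"

        return str.Length > 2 && str.StartsWith("<") && str.EndsWith(">")
            && str.Count(c => c == '<') == 1 && str.Count(c => c == '>') == 1;
    }

    public static void Main()
    {
        string[] samples =
        {
            "<b>", "<BR>", "<html xmlns=\"http://www.w3.org/1999/xhtml\" >",
            "<b>test</b>", "<b></b>", "<BR><BR>"
        };

        // Print the result next to each sample string.
        foreach (string s in samples)
            Console.WriteLine($"{IsSingleTag(s)}\t{s}");
    }
}
```

The check only counts angle brackets; it does not verify that the text between them is a well-formed tag name or attribute list.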

Chris Bray
 
LVL 3

Author Closing Comment

by:chrisbray
ID: 38426044
No solution of any kind was forthcoming, and my experiments with regex were unsuccessful in eliminating the edge cases, so I created an answer of my own.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
The article shows the basic steps of integrating an HTML theme template into an ASP.NET MVC project
this video summaries big data hadoop online training demo (http://onlineitguru.com/big-data-hadoop-online-training-placement.html) , and covers basics in big data hadoop .
In a question here at Experts Exchange (https://www.experts-exchange.com/questions/29062564/Adobe-acrobat-reader-DC.html), a member asked how to create a signature in Adobe Acrobat Reader DC (the free Reader product, not the paid, full Acrobat produ…
Suggested Courses
Course of the Month21 days, 5 hours left to enroll

810 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question