Solved

Regex in C#

Posted on 2012-03-12
Last Modified: 2012-03-16
I am trying to remove certain HTML tags and keep the value inside them.

Example: <customerid><style face="normal" font="default">d33333</style></customerid>


There are some other tags I want to remove as well. Some tags have attributes:
<tag att="dfdff" att2="fdfdf">

Can you help?
Question by:dkim18
8 Comments
 
LVL 42

Expert Comment

by:sedgwick
ID: 37711619
Can you post the HTML?
Which data do you need exactly?
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 50 total points
ID: 37711960
You can only use a regex if your tags aren't nested.

E.g., in the following case, if we were only targeting div tags with the attribute att="dfdff", we would need to remove the 2nd </div> tag without removing the first one:

<div att="dfdff" att2="fdfdf"><div att="somethingelse">content</div>other content</div>

Working out that the 2nd </div> needs to be removed but not the first is a task for a parser, not a regex. If you are happy with the limitation that tags can't be nested, then I should be able to provide a regex. Let me know.
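For the non-nested case, a minimal sketch of such a regex in C# might look like the following. The class and method names (TagStripper, StripTags) are made up for illustration, and the pattern assumes that attribute values never contain a literal '>' character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Removes the opening and closing tags with the given names (with or
    // without attributes) and keeps whatever text was between them.
    // Assumption: attribute values never contain a literal '>'.
    public static string StripTags(string input, params string[] tagNames)
    {
        if (tagNames == null || tagNames.Length == 0)
        {
            return input; // nothing to remove
        }

        // Regex.Escape guards against tag names containing regex metacharacters.
        string alternation = string.Join("|", tagNames.Select(n => Regex.Escape(n)));

        // </?   optional slash, so opening and closing tags both match
        // \b    stops "div" from also matching "<divider>"
        // [^>]* skips over any attributes up to the end of the tag
        string pattern = @"</?(?:" + alternation + @")\b[^>]*>";

        return Regex.Replace(input, pattern, string.Empty, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string sample =
            "<customerid><style face=\"normal\" font=\"default\">d33333</style></customerid>";

        Console.WriteLine(StripTags(sample, "customerid", "style")); // prints d33333
    }
}
```

Note that this strips every tag with a listed name. It cannot remove only the div with att="dfdff" together with its own matching </div> in the nested example above, which is exactly the limitation described.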
 
LVL 29

Assisted Solution

by:anarki_jimbel
anarki_jimbel earned 50 total points
ID: 37712127
How large is your HTML text?
In other words, are you sure that regex is the right solution? Unfortunately regex is known to have pretty bad performance... (see, e.g., http://www.codinghorror.com/blog/2006/01/regex-performance.html).

 
LVL 75

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 50 total points
ID: 37712136
"Unfortunately regex is known to have pretty bad performance..."
That depends, I think, on how you structure the regex. The "catastrophic backtracking" referenced in the article would be an example of a poorly-designed regex.
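To illustrate the difference, here is a rough sketch contrasting a tag pattern with nested quantifiers against one built on a negated character class. Both patterns and the names (BacktrackingDemo, RiskyTag, LinearTag, TryIsMatch) are invented for this example, and the match timeout relies on the Regex constructor overload that takes a TimeSpan, which assumes .NET 4.5 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // Nested quantifiers: (\w+\s*)* can split one run of letters in
    // exponentially many ways, all of which get tried when no '>' follows.
    public const string RiskyTag = @"<(\w+\s*)*>";

    // Negated character class: each character can be consumed in only one
    // way, so a failed match is abandoned after a single pass.
    public const string LinearTag = @"<[^<>]*>";

    // Returns false if the match attempt ran out of time; otherwise returns
    // true and reports the result through 'matched'.
    public static bool TryIsMatch(string pattern, string input, TimeSpan timeout, out bool matched)
    {
        try
        {
            matched = new Regex(pattern, RegexOptions.None, timeout).IsMatch(input);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            matched = false;
            return false;
        }
    }

    public static void Main()
    {
        // An unterminated tag: no pattern can match, so the engine must fail.
        string input = "<" + new string('a', 40);
        TimeSpan limit = TimeSpan.FromMilliseconds(500);

        bool matched;
        bool finished = TryIsMatch(RiskyTag, input, limit, out matched);
        Console.WriteLine("risky:  finished=" + finished + ", matched=" + matched);

        finished = TryIsMatch(LinearTag, input, limit, out matched);
        Console.WriteLine("linear: finished=" + finished + ", matched=" + matched);
    }
}
```

On a backtracking engine the risky pattern is likely to hit the timeout on this input, while the linear one fails almost immediately. The exact behavior may vary between .NET versions.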
 
LVL 10

Accepted Solution

by:pfrancois
pfrancois earned 350 total points
ID: 37713959
The way to process files with this kind of structure is with the XPath libraries. HTML can be easily converted into XML files, and then parsed in the way you want.

See several examples of C# and .NET here: http://www.java2s.com/Tutorial/CSharp/0540__XML/0380__XmlPathNavigator.htm
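As a rough sketch of this approach applied to the snippet from the question, the following uses XDocument together with the XPath extension methods in System.Xml.XPath. The helper name (XPathExtractor.GetText) is hypothetical, and the code assumes the input is already well-formed XML; XDocument.Parse throws an XmlException otherwise.

```csharp
using System;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathExtractor
{
    // Returns the concatenated text of the first element matching the XPath
    // expression, or null if nothing matches.
    // Assumption: 'xml' is well-formed; XDocument.Parse throws otherwise.
    public static string GetText(string xml, string xpath)
    {
        XDocument doc = XDocument.Parse(xml);

        // XPathSelectElement is an extension method from System.Xml.XPath.
        XElement element = doc.XPathSelectElement(xpath);

        // XElement.Value drops all nested markup and keeps only the text.
        return element == null ? null : element.Value;
    }

    public static void Main()
    {
        string sample =
            "<customerid><style face=\"normal\" font=\"default\">d33333</style></customerid>";

        Console.WriteLine(GetText(sample, "//customerid")); // prints d33333
    }
}
```

Because the parser tracks which closing tag belongs to which opening tag, an attribute filter such as //div[@att='dfdff'] also works on nested markup, where a plain regex does not.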
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37715591
"HTML can be easily converted into XML files, and then parsed in the way you want."
...provided the HTML is actually valid XML (structurally).
 
LVL 10

Expert Comment

by:pfrancois
ID: 37715679
@kaufman: If the HTML is valid XML (XHTML), you don't need to convert it. My statement is that valid HTML can be converted into XHTML, which is valid XML.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37715766
And I agree, but your last post doesn't say "valid" HTML  = )
