mmarksbury

Link Conversion in C# (500 Pts)

I have thousands of HTML files that are going to be tied into a new system.  The new system uses IFrames to display the content in these HTML files.  The problem is that most of these files contain hundreds of links each, and the TARGET attribute varies from "_SELF" to "_TOP", etc.

The HTML is read into a string using C# and then output between a header and footer inside the IFrames.  Ideally, I would like to take that string, use some sort of regex to detect those tags, correct them to have a TARGET="_PARENT" attribute, and then output the result to the IFrame.

All I need from you experts is a way to detect whether a tag is a link, detect whether it already has a TARGET attribute, and make sure the TARGET attribute is set to "_PARENT" before moving on to the next link in the string.

Looking forward to your answers!

Thanks in advance

Fernando Soto

Hi mmarksbury;

I am no expert in HTML so I want to understand the request first.
1 - Detect if a tag is a link.
2 - Check to see if it has a TARGET
3 - If it has a TARGET make sure it is set to "_PARENT"

Questions
1 - Are links HREF only?
2 - Do any tags other than links have a TARGET attribute?
3 - Could I search for the attribute TARGET and check to see if it is set to "_PARENT" without looking for links?
mmarksbury

ASKER

You have the understanding correct.

Most likely, TARGET attributes will only apply to links, so it should be fine to only look for the TARGET attribute.
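
The simplification agreed on here (rewrite every TARGET attribute without first locating the links) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the accepted solution: the class name TargetRewriter and the method ForceTarget are hypothetical, and the pattern assumes attribute values are quoted or contain no whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TargetRewriter
{
    // Matches target="x", target='x', or target=x in any letter case.
    // The lookbehind skips names such as data-target. Note that the pattern
    // does not check that it is inside a tag, so matching text in the page
    // body would be rewritten as well.
    private static readonly Regex TargetAttr = new Regex(
        @"(?<![\w-])target\s*=\s*(?:""[^""]*""|'[^']*'|[^\s>]+)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: sets every TARGET attribute value to newTarget.
    // A MatchEvaluator is used so that newTarget is inserted literally.
    public static string ForceTarget(string html, string newTarget)
    {
        return TargetAttr.Replace(html, m => "target=\"" + newTarget + "\"");
    }
}
```

Calling TargetRewriter.ForceTarget(html, "_parent") on the string read from each file would normalize existing attributes, but it would leave links that have no TARGET attribute untouched.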
ASKER CERTIFIED SOLUTION
Fernando Soto

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Works great, except that if a link does not have a TARGET attribute, this code does not add one.  Any suggestions on how to do this?  I suspect you will have to match a link tag, then check for the attribute and add it or change it depending on what is needed.
With your help, I made it the rest of the way.  The following is the code I used.

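using System.Text.RegularExpressions;

string stringToMatch = "Some Html, <a href=\"somepage.htm\" target=\"self\"><br /><b>Test</b></a>";

// Match each opening <a ...> tag; group 1 captures its attributes.
Regex LinksRegex = new Regex(@"<a\s+([^>]*)>", RegexOptions.IgnoreCase);
// Match an existing double-quoted TARGET attribute inside a tag.
Regex TargetRegex = new Regex(@"\btarget\s*=\s*""[^""]*""", RegexOptions.IgnoreCase);

string NewString = LinksRegex.Replace(stringToMatch, M =>
{
     string attribs = M.Groups[1].Value;
     // Overwrite an existing TARGET, or append one when the link has none ("_top" here; substitute "_parent" if needed).
     attribs = TargetRegex.IsMatch(attribs)
          ? TargetRegex.Replace(attribs, "target=\"_top\"")
          : attribs.TrimEnd() + " target=\"_top\"";
     return "<a " + attribs + ">";
});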
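
For anyone reusing this, the match-a-link-then-add-or-change idea could be wrapped in a helper along these lines. It is a sketch under stated assumptions rather than a definitive implementation: the names LinkTargetFixer and SetLinkTargets are hypothetical, it also accepts single-quoted and unquoted attribute values, and it assumes no attribute value contains a ">" character. A real HTML parser would be more robust than regular expressions for malformed markup.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkTargetFixer
{
    // Opening <a ...> tags only (not <abbr>, <area>, ...). Group "open" keeps
    // the original "<a" or "<A"; group "attrs" holds everything up to ">".
    private static readonly Regex AnchorTag = new Regex(
        @"(?<open><a)(?=[\s>])(?<attrs>[^>]*)>", RegexOptions.IgnoreCase);

    // A TARGET attribute with a double-quoted, single-quoted, or unquoted value.
    private static readonly Regex TargetAttr = new Regex(
        @"(?<![\w-])target\s*=\s*(?:""[^""]*""|'[^']*'|[^\s>]+)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: ensures every link carries target="newTarget".
    public static string SetLinkTargets(string html, string newTarget)
    {
        string replacement = "target=\"" + newTarget + "\"";
        return AnchorTag.Replace(html, tag =>
        {
            string attrs = tag.Groups["attrs"].Value;
            attrs = TargetAttr.IsMatch(attrs)
                ? TargetAttr.Replace(attrs, m => replacement)  // change existing
                : attrs.TrimEnd() + " " + replacement;          // add missing
            return tag.Groups["open"].Value + attrs + ">";
        });
    }
}
```

With the sample string from the code above, LinkTargetFixer.SetLinkTargets(stringToMatch, "_parent") would change target="self" to target="_parent", and a link with no TARGET attribute would have one appended after its last attribute.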

Thanks.  Points awarded.