Solved

Change Tags inside HTML document programmatically

Posted on 2010-11-22
Last Modified: 2013-12-17
Hi,

I need a way to change some tags in HTML documents programmatically.
E.g.:
Load the document
Scan for <a> tags
- If an anchor is found, read its href and get the URL, e.g. http://urltomyside.com/
- Change http://urltomyside.com/ to http://urltoanyotherside.com
Save the document.

Is there a solution for this?

Thanks

Andre
Question by:andre72
3 Comments
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34188543
Where are you doing the changing? Within code (i.e. during runtime) or within the editor (i.e. during design time)?
 

Author Comment

by:andre72
ID: 34188670
I need to change it at runtime, so the whole HTML content will be inside a string or a stream.
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 500 total points
ID: 34190162
I would think there would be some fancy class to work with a DOM document, but I don't know what it is off the top of my head, so I must fall back on regular expressions. Here is a pattern that should do the job, and an explanation of what it means:
source = System.Text.RegularExpressions.Regex.Replace(
    source,
    @"(<a\s+[^>]*href=[""']?)http://urltomyside\.com/",
    "$1http://urltoanyotherside.com/",   // trailing slash kept so any path after the host stays intact
    System.Text.RegularExpressions.RegexOptions.IgnoreCase);


// ( ... )                  -  capturing parentheses
// <a                       -  find a literal "<a"
// \s+                      -  find one or more ( + ) whitespace ( \s ) characters
// [^>]*                    -  find zero or more ( * ) of any character NOT ( [^ ...] ) a closing bracket ( > )
// href=                    -  find a literal "href="
// [""']?                   -  find zero or one ( ? ) of either a double- or single-quote ( ["'] ); there are two double-quotes because it has to be escaped for C#
// http://urltomyside\.com/ -  find the URL, including its trailing slash; note that the dot ( . ) has to be escaped ( \. ) in the pattern because it is a special character in regex


//  In the replacement, you write the new URL as normal
//  (i.e. no regex escaping needed); however, we include
//  $1 at the beginning of it so that the text we captured
//  with the parentheses described above is re-inserted before
//  the replacement URL. If we don't include the $1, then
//  everything from "<a" up to the old URL is erased.
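
Since the HTML arrives as a string or a stream, the same pattern can be wrapped in a small helper. The following is only a sketch under assumptions: the class name `AnchorUrlRewriter`, the method name `Rewrite`, and the sample HTML are made up for illustration and do not come from the thread. It uses `Regex.Escape` so the old URL does not have to be escaped by hand, and a match evaluator so that a `$` in the new URL is not read as a substitution token. Like the pattern above, it is a regex approach and not a real HTML parser.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class AnchorUrlRewriter
{
    // Replaces oldPrefix with newPrefix where it starts the href value of an <a ...> tag.
    public static string Rewrite(string html, string oldPrefix, string newPrefix)
    {
        // Same prefix pattern as above; Regex.Escape handles "." and other metacharacters.
        string pattern = @"(<a\s+[^>]*href=[""']?)" + Regex.Escape(oldPrefix);

        // Group 1 is the "<a ... href=" part that must be preserved in front of the new URL.
        return Regex.Replace(
            html,
            pattern,
            m => m.Groups[1].Value + newPrefix,
            RegexOptions.IgnoreCase);
    }

    // Stream-style variant: reads all text, rewrites it, and writes the result.
    public static void Rewrite(TextReader input, TextWriter output,
                               string oldPrefix, string newPrefix)
    {
        output.Write(Rewrite(input.ReadToEnd(), oldPrefix, newPrefix));
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample input (assumed): one URL in plain text, one inside an anchor.
        string html =
            "<p>http://urltomyside.com/ in plain text</p>" +
            "<a class=\"nav\" href=\"http://urltomyside.com/page.html\">link</a>";

        // Only the URL inside the anchor's href is rewritten.
        Console.WriteLine(AnchorUrlRewriter.Rewrite(
            html, "http://urltomyside.com/", "http://urltoanyotherside.com/"));
    }
}
```

Passing both prefixes with a trailing slash keeps whatever path follows the host unchanged. For content in a stream, wrapping it in a `StreamReader` and `StreamWriter` and calling the second overload should work the same way.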
