Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Change Tags inside HTML document programmatically

Posted on 2010-11-22
3
Medium Priority
?
534 Views
Last Modified: 2013-12-17
Hi,

I need a way for to change some tags of HTML documents.
Eg.:
Load document
scan for <a>
- if found anchor read href and get URL, http://urltomyside.com/
- change http://urltomyside.com/ to http://urltoanyotherside.com
save document.

Any solution for?

Thanks

Andre
0
Comment
Question by:andre72
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
3 Comments
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34188543
Where are you doing the changing? Within code (i.e. during runtime) or within the editor (i.e. during design time)?
0
 

Author Comment

by:andre72
ID: 34188670
I need to change it at runtime so the whole HTML content will be inside a string or stream ...
0
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 2000 total points
ID: 34190162
I would think there would be some fancy class to work with a DOM document, but I don't know what it is off the top of my head, so I must fall back on regular expressions. Here is a pattern that should do the job, and an explanation of what it means:
source = System.Text.RegularExpressions.Regex.Replace(source, @"(<a\s+[^>]*href=[""']?)http://urltomyside\.com/", "$1http://urltoanyotherside.com", System.Text.RegularExpressions.RegexOptions.IgnoreCase);


// ( ... )                  -  capturing parentheses
// <a                       -  find a literal "<a"
// \s+                      -  find one or more ( + ) whitespace ( \s ) characters
// [^>]*                    -  find zero or more ( * ) of any character NOT ( [^ ...] ) a closing bracket ( > )
// href=                    -  find a literal "href="
// [""']?                   -  find zero or one ( ? ) of either a double- or single-quote ( ["'] ); there are two double-quotes because it has to be escaped for C#
// http://urltomyside\.com  - find the url; Note, the dot ( . ) has to be escaped ( \. ) for the pattern because it is a special character in regex


//  In the replace, you put the replacement URL as normal
//  (i.e. no special characters); however, we inclue
//  $1 at the beginning of it so that the text we captured
//  with the parentheses described above is inserted with
//  the replacement URL. If we don't include the $1, then
//  you will erase the "<a>" up to where the old URL is found.

Open in new window

0

Featured Post

Python: Series & Data Frames With Pandas

Learn the basics of Python’s pandas library of series & data frames and how we can use these tools for data manipulation.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When it comes to write a Context Sensitive Help (an online help that is obtained from a specific point in state of software to provide help with that state) ,  first we need to make the file that contains all topics, which are given exclusive IDs. …
This article discusses how to create an extensible mechanism for linked drop downs.
In this tutorial viewers will learn how to embed Flash content in a webpage using HTML5. Ensure your DOCTYPE declaration is set to HTML5: "<!DOCTYPE html>": Use the <object> tag to embed Flash content.: To specify that the object is Flash content, d…
The viewer will learn the basics of jQuery including how to code hide show and toggles. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery…

715 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question