?
Solved

Change Tags inside HTML document programmatically

Posted on 2010-11-22
3
Medium Priority
?
536 Views
Last Modified: 2013-12-17
Hi,

I need a way for to change some tags of HTML documents.
Eg.:
Load document
scan for <a>
- if found anchor read href and get URL, http://urltomyside.com/
- change http://urltomyside.com/ to http://urltoanyotherside.com
save document.

Any solution for?

Thanks

Andre
0
Comment
Question by:andre72
  • 2
3 Comments
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 34188543
Where are you doing the changing? Within code (i.e. during runtime) or within the editor (i.e. during design time)?
0
 

Author Comment

by:andre72
ID: 34188670
I need to change it at runtime so the whole HTML content will be inside a string or stream ...
0
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 2000 total points
ID: 34190162
I would think there would be some fancy class to work with a DOM document, but I don't know what it is off the top of my head, so I must fall back on regular expressions. Here is a pattern that should do the job, and an explanation of what it means:
source = System.Text.RegularExpressions.Regex.Replace(source, @"(<a\s+[^>]*href=[""']?)http://urltomyside\.com/", "$1http://urltoanyotherside.com", System.Text.RegularExpressions.RegexOptions.IgnoreCase);


// ( ... )                  -  capturing parentheses
// <a                       -  find a literal "<a"
// \s+                      -  find one or more ( + ) whitespace ( \s ) characters
// [^>]*                    -  find zero or more ( * ) of any character NOT ( [^ ...] ) a closing bracket ( > )
// href=                    -  find a literal "href="
// [""']?                   -  find zero or one ( ? ) of either a double- or single-quote ( ["'] ); there are two double-quotes because it has to be escaped for C#
// http://urltomyside\.com  - find the url; Note, the dot ( . ) has to be escaped ( \. ) for the pattern because it is a special character in regex


//  In the replace, you put the replacement URL as normal
//  (i.e. no special characters); however, we inclue
//  $1 at the beginning of it so that the text we captured
//  with the parentheses described above is inserted with
//  the replacement URL. If we don't include the $1, then
//  you will erase the "<a>" up to where the old URL is found.

Open in new window

0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Finding original email is quite difficult due to their duplicates. From this article, you will come to know why multiple duplicates of same emails appear and how to delete duplicate emails from Outlook securely and instantly while vital emails remai…
Simulator games are perfect for generating sample realistic data streams, especially for learning data analysis. It is even useful for demoing offerings such as Azure stream analytics, PowerBI etc.
The viewer will learn the basics of jQuery including how to code hide show and toggles. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery…
This video shows how to quickly and easily deploy an email signature for all users in Office 365 and prevent it from being added to replies and forwards. (the resulting signature is applied on the server level in Exchange Online) The email signat…
Suggested Courses

864 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question