Solved

HTML processing

Posted on 2002-04-18
4
138 Views
Last Modified: 2010-03-05
Hi,
I would like to be able to do the following:
Download websites, remove all external links from each site (including banner links), and then browse the site offline in a "viewer" browser. The "viewer" can be any browser; it doesn't have to be a specially developed one. I would also like any Flash and Java components to stay intact after the links are removed.
So I would basically need a Perl script that runs through the HTML pages, finds the external links, and removes them.
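To make "external link" concrete, here is a minimal sketch of the check such a script would need. It is written in Python rather than Perl purely for illustration, and the same logic should port directly. The function name `is_external` and the example hosts are hypothetical. It assumes "external" means an http(s) link whose host differs from the host of the page it appears on; `mailto:`, `javascript:` and similar links are left alone, which may or may not be what you want.

```python
from urllib.parse import urljoin, urlparse


def is_external(href, page_url):
    """Return True if href resolves to a different host than page_url.

    Assumption: only http/https links count as candidates; anything else
    (mailto:, javascript:, ...) is treated as not external in this sketch.
    """
    # Resolve relative links ("about.html", "/img/a.gif") against the page URL.
    target = urlparse(urljoin(page_url, href))
    if target.scheme not in ("http", "https"):
        return False
    return target.netloc.lower() != urlparse(page_url).netloc.lower()
```

Note that this simple host comparison treats `www.example.com` and `example.com` as different sites, so a real script would probably need a list of hosts that count as "the same site".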
Question by:psimation
4 Comments
 
LVL 3

Expert Comment

by:DABOMB
ID: 6957361
So you want the links removed, but Flash and Java to stay. How about the images? If you get rid of the banners, it will also kill the images 99% of the time.

--Dabomb
0
 
LVL 17

Author Comment

by:psimation
ID: 6957553
Banners aren't really a problem. If you just get rid of the links, the banner's image should still be intact? It just won't link anywhere, right?
0
 
LVL 3

Accepted Solution

by:
DABOMB earned 50 total points
ID: 6958121
The banner is still loaded by an <IMG SRC> tag, Flash by <EMBED SRC>, and links are <A HREF>; the link just wraps around the banner image.
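As a rough sketch of what that means in practice: a script only has to drop the `<A HREF>` start tag and its matching `</A>`, and pass everything else (including `<IMG SRC>` and `<EMBED SRC>`) through untouched. The example below uses Python's standard `html.parser` instead of Perl, purely as an illustration. The class and function names (`ExternalLinkStripper`, `strip_external_links`) and the example hosts are made up, and "external" is assumed to mean an http(s) link to a different host than the page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ExternalLinkStripper(HTMLParser):
    """Re-emit HTML, omitting <a> tags that point to another host."""

    def __init__(self, page_url):
        # convert_charrefs=False so entities such as &amp; pass through as-is.
        super().__init__(convert_charrefs=False)
        self.page_url = page_url
        self.host = urlparse(page_url).netloc.lower()
        self.out = []
        self.dropped = []  # one flag per open <a>: True if its tags were removed

    def _external(self, href):
        target = urlparse(urljoin(self.page_url, href))
        return (target.scheme in ("http", "https")
                and target.netloc.lower() != self.host)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            drop = bool(href) and self._external(href)
            self.dropped.append(drop)
            if drop:
                return  # skip the <a ...> tag, keep whatever it wraps
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.dropped and self.dropped.pop():
            return  # matching </a> of a removed link
        self.out.append(f"</{tag}>")  # note: end tags come out lowercased

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_external_links(html, page_url):
    parser = ExternalLinkStripper(page_url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, calling `strip_external_links('<A HREF="http://ads.example.net/click"><IMG SRC="banner.gif"></A>', "http://www.example.com/index.html")` should leave only the `<IMG SRC="banner.gif">` tag. This does not touch links created by JavaScript or links embedded inside Flash or Java applets, which would need separate handling.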
0
 
LVL 17

Author Comment

by:psimation
ID: 7213675
OK, I'm going to accept DABOMB's suggestion, but for the record and for the PAQs, it did not solve my problem; I'm giving up on this.
0

Question has a verified solution.
