HTML processing

Hi,
I would like to be able to do the following:
Download websites, remove all external links from them (including banner links), and then browse the result offline in a "viewer" browser. The "viewer" can be any browser; it doesn't have to be specially developed. I would also like any Flash and Java components to stay intact after the links are removed.
So I basically need a Perl script that runs through the HTML pages, looks for the external links, and removes them.
psimation Asked:
 
DABOMB Commented:
The banner is still loaded by an <IMG SRC> tag, Flash by <EMBED SRC>, and links are <A HREF>. On a banner, the link is just an <A HREF> wrapped around the image.
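
To make the tag breakdown above concrete, here is a minimal sketch that lists the URLs carried by those three tags and flags which ones point off-site. It is written in Python rather than the Perl the asker mentioned, uses only the standard library, and the names `URL_ATTRS`, `is_external`, `UrlLister`, `list_urls` and the host `www.example.com` are made up for illustration. It treats any URL with a host different from the site's own as external, which is an assumption, not a rule from this thread.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tag -> attribute holding its URL. Covers only the tags named in the
# comment above; a real page may also use <object>, <applet>, <frame>, etc.
URL_ATTRS = {"a": "href", "img": "src", "embed": "src"}


def is_external(url, site_host):
    """True when the URL names a host other than site_host (assumed rule)."""
    host = urlparse(url).hostname  # None for relative URLs like "page2.html"
    return host is not None and host != site_host.lower()


class UrlLister(HTMLParser):
    """Collects (tag, url, is_external) for every tag listed in URL_ATTRS."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attr = URL_ATTRS.get(tag)
        if attr is None:
            return
        url = dict(attrs).get(attr)
        if url:
            self.found.append((tag, url, is_external(url, self.site_host)))


def list_urls(html, site_host):
    parser = UrlLister(site_host)
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    page = '<A HREF="http://ads.example.net/click"><IMG SRC="banner.gif"></A>'
    for tag, url, external in list_urls(page, "www.example.com"):
        print(tag, url, "external" if external else "local")
```

In a banner like the one in the demo, only the <A HREF> would be flagged as external; the <IMG SRC> is a relative path and counts as local.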
 
DABOMB Commented:
So you want the links removed but Flash and Java to stay. How about the images? If you get rid of the banners, it will also kill images 99% of the time.

--Dabomb
 
psimation Author Commented:
Banners aren't really a problem. If you just get rid of the links, the banner's image should still be intact, right? It just won't link anywhere.
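
The idea above (drop the link, keep the image) can be sketched roughly as follows. This is Python, not the Perl originally asked for, and it is only one possible approach under stated assumptions: `ExternalLinkStripper`, `strip_external_links` and the host `www.example.com` are hypothetical names, and "external" is taken to mean any <A HREF> whose host differs from the site's own. Only the <a> start and end tags are dropped; everything between them, and every other tag such as <img>, <embed>, <object> or <applet>, is written back out as it was.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalLinkStripper(HTMLParser):
    """Re-emits the HTML it is fed, minus <a> tags that point off-site."""

    def __init__(self, site_host):
        # convert_charrefs=False so entities like &amp; pass through untouched.
        super().__init__(convert_charrefs=False)
        self.site_host = site_host.lower()
        self.out = []
        self.dropped = []  # one flag per open <a>: True if it was dropped

    def _external(self, href):
        host = urlparse(href or "").hostname  # None for relative URLs
        return host is not None and host != self.site_host

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            drop = self._external(dict(attrs).get("href"))
            self.dropped.append(drop)
            if drop:
                return  # skip the opening tag, keep what it wraps
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.dropped and self.dropped.pop():
            return  # closing tag of a dropped link
        self.out.append(f"</{tag}>")  # note: end tags come out lower-cased

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_external_links(html, site_host):
    parser = ExternalLinkStripper(site_host)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<A HREF="http://ads.example.net/click"><IMG SRC="banner.gif"></A>'
    print(strip_external_links(page, "www.example.com"))
```

Whether the banner image still displays offline depends on where its <IMG SRC> points: a locally saved copy will show, while an image served from the ad network's host will not load without a connection.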
 
psimation Author Commented:
OK, I'm going to accept DABOMB's suggestion, but for the record and for the PAQ, it did not solve my problem; I'm giving up on this.
Question has a verified solution.
