Solved

HTML processing

Posted on 2002-04-18
4
137 Views
Last Modified: 2010-03-05
Hi,
I would like to be able to do the following:
Download websites and then remove all external links from site ( including banner links ) and then have a "viewer" browser to browse the website offline. The "viewer" can be any browser, doesn't have to be a specially developed browser. I would also like to have any flash and java components stay in tact after links are removed.
So I would basically need some perl script to run through the html pages and look for the external links and remove them.
0
Comment
Question by:psimation
  • 2
  • 2
4 Comments
 
LVL 3

Expert Comment

by:DABOMB
ID: 6957361
so you want links removed, but flash and java to stay, how about the images? if you get rid of the banners it will kill images also 99% of the time.

--Dabomb
0
 
LVL 17

Author Comment

by:psimation
ID: 6957553
Banners aren't really a problem, if you just get rid of the links, the banner's image should still be intact?? It just won't link right?
0
 
LVL 3

Accepted Solution

by:
DABOMB earned 50 total points
ID: 6958121
the banner is still called by an <IMG SRC> tag, flash is <EMB SRC> links are <A HREF> the links are just underlying on the banner.
0
 
LVL 17

Author Comment

by:psimation
ID: 7213675
OK, I'm going to accept DABOMB's suggestion, but for the record and for PAQ's , it did not solve my problem; I'm giving up on this.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Port 80 requests 16 100
Add additional column to .csv using Perl. 8 155
Parse csv file and generate graphs in HTML in bash 8 240
read an xml file in perl 2 50
I have been pestered over the years to produce and distribute regular data extracts, and often the request have explicitly requested the data be emailed as an Excel attachement; specifically Excel, as it appears: CSV files confuse (no Red or Green h…
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

821 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question