Parse & Download HTML pages
Posted on 2002-06-19
I want to write a Perl script that does two things:
1. Recursively parse a set of HTML pages in different directories and extract all <A HREF=.....> </A> tags into separate text files while maintaining the directory structure. The text files must reside in the same directories as the HTML pages. (Note that the links all point to HTML pages.)
2. Use the tags extracted into the text files to download the linked HTML pages and save them to the respective directories.
I have managed to extract the tags, but only into a single file; the directory structure needs to be maintained. I would enter the home directory, and the script should then do the extraction and the download recursively.
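One possible shape for the two steps, as a minimal sketch rather than a finished script. Everything named here is an assumption made for illustration: the sub names, the "<page>.links.txt" naming (one text file per page, next to it), and the choice to match tags with a regular expression, which misses unusual markup (HTML::LinkExtor would be more robust). The download step only handles absolute http:// links, since relative links would need a base URL, and it assumes the LWP::Simple module is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(dirname);

# Return every complete <A HREF=...>...</A> element found in an HTML string.
# Regex matching is a simplification and will miss unusual markup.
sub extract_anchor_tags {
    my ($html) = @_;
    my @tags = $html =~ m{(<a\b[^>]*\bhref\s*=[^>]*>.*?</a>)}gis;
    return @tags;
}

# Step 1 for one page: write its anchor tags, one per line, to
# "<page>.links.txt" in the same directory (hypothetical naming).
sub write_links_for_page {
    my ($page) = @_;
    open my $in, '<', $page or die "Cannot read $page: $!";
    my $html = do { local $/; <$in> };    # slurp the whole file
    close $in;

    my $out_file = "$page.links.txt";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    for my $tag (extract_anchor_tags($html)) {
        $tag =~ s/\s+/ /g;                # collapse newlines: one tag per line
        print {$out} "$tag\n";
    }
    close $out;
    return $out_file;
}

# Step 1 for a whole tree: process every .html/.htm file below $home.
sub extract_tree {
    my ($home) = @_;
    my @written;
    find({
        no_chdir => 1,                    # $_ holds the full path
        wanted   => sub {
            return unless -f $_ && /\.html?$/i;
            push @written, write_links_for_page($_);
        },
    }, $home);
    return @written;
}

# Step 2 for one list file: fetch each absolute http:// link and save it
# in the directory that holds the list. Returns the number of pages saved.
sub download_from_list {
    my ($list_file) = @_;
    require LWP::Simple;                  # loaded only when downloading
    my $dir   = dirname($list_file);
    my $saved = 0;
    open my $in, '<', $list_file or die "Cannot read $list_file: $!";
    while (my $line = <$in>) {
        next unless $line =~ m{href\s*=\s*["']?(http://[^"'\s>]+)}i;
        my $url = $1;
        (my $name = $url) =~ s{.*/}{};    # last path segment as file name
        $name = 'index.html' if $name eq '';
        $saved++ if LWP::Simple::getstore($url, "$dir/$name") == 200;
    }
    close $in;
    return $saved;
}

# Usage: perl script.pl /path/to/home
if (@ARGV) {
    my ($home) = @ARGV;
    for my $list (extract_tree($home)) {
        my $count = download_from_list($list);
        print "$list: $count page(s) downloaded\n";
    }
}
```

Because each list file is written beside the page it came from, the directory structure is preserved without any extra bookkeeping, and the download step can derive its target directory from the list file's own path.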
I would appreciate your help.