Solved

Retrieve data from external websites using Perl or an application

Posted on 2004-08-09
Last Modified: 2013-12-25
I need to get data from a very long list of sites (over 8000) and write the resulting HTML from each site to a separate text file. The sites have similar URLs and content.

I am looking for the best way to do this, whether with Perl or with a third-party application. Any help would be greatly appreciated.
Question by:Igiwwa
5 Comments
 
LVL 48

Accepted Solution

by:
Tintin earned 84 total points
ID: 11757539
Let's make the following assumptions:

1.  The list of sites (URLs) is in a plain text file, one URL per line.
2.  The output text files will have sequential names (as you haven't specified a format).

Then:

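#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;    # exports getstore() and is_success()

my $list      = '/path/to/list/of/sites.txt';
my $outputdir = '/path/to/outputdir';

open my $fh, '<', $list or die "Cannot open $list: $!\n";

my $count = 0;
while (my $site = <$fh>) {
  chomp $site;
  next unless length $site;    # skip blank lines
  # Sequential names (assumption 2) avoid clashes between similar URLs
  my $file = sprintf '%s/site%05d.txt', $outputdir, ++$count;
  # getstore() returns the HTTP status code of the request
  my $status = getstore($site, $file);
  warn "Failed to fetch $site (HTTP $status)\n" unless is_success($status);
}

close $fh;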


 
 
LVL 4

Assisted Solution

by:alikoank
alikoank earned 83 total points
ID: 11760255
There are already several applications that do this.
Take a look at XMLTV:

http://membled.com/work/apps/xmltv/

or Plucker:

http://www.plkr.org/
 
LVL 51

Assisted Solution

by:ahoffmann
ahoffmann earned 83 total points
ID: 11771778
Assuming your URLs are in a file, one per line:

wget -i file-withURLs
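
With -i, wget picks each output file name from the URL itself, so pages from similar URLs can be hard to tell apart afterwards. If sequentially numbered files are wanted instead, a wrapper along these lines might work. This is only a sketch: urls.txt, pages, output_name and fetch_all are placeholder names for illustration, not anything from the question, and the timeout and retry values are arbitrary.

```shell
#!/bin/sh
# Sketch: fetch every URL in a list and save each page under a numbered name.
# All names here (urls.txt, pages, output_name, fetch_all) are hypothetical.

# Build the output path for the Nth URL, e.g. pages/page_00042.html
output_name() {
    printf '%s/page_%05d.html' "$1" "$2"
}

# fetch_all LIST OUTDIR: download each non-blank line of LIST into OUTDIR
fetch_all() {
    list=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    n=0
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue          # skip blank lines
        n=$((n + 1))
        out=$(output_name "$outdir" "$n")
        # -q: quiet, -O: write the page to the given file
        if ! wget -q --timeout=30 --tries=2 -O "$out" "$url"; then
            echo "failed: $url" >&2        # report the failure and carry on
            rm -f "$out"                   # drop any partial or empty file
        fi
    done < "$list"
}

# Usage: fetch_all urls.txt pages
```

The counter only advances on non-blank lines, so file N corresponds to the Nth URL in the list, and failures are reported on stderr without stopping the run.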
