Solved

Retrieve data from external websites using Perl or an application

Posted on 2004-08-09
Last Modified: 2013-12-25
I need to get data from a very long list of sites (over 8000) and write the resulting HTML code from each site to separate text files.  They have similar URLs and content.

I am looking for the best way to do this, be it with Perl or with a third-party application.  Any help would be greatly appreciated.
Question by:Igiwwa

Accepted Solution

by:
Tintin earned 84 total points
ID: 11757539
Let's make the following assumptions:

1.  The list of sites (URLs) is in a plain text file, one URL per line.
2.  The output text files will have sequential names (as you haven't specified a format).

Then:

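#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

my $list      = '/path/to/list/of/sites.txt';
my $outputdir = '/path/to/outputdir';

open my $fh, '<', $list or die "Cannot open $list: $!\n";

my $count = 0;
while (my $site = <$fh>) {
  chomp $site;
  next unless length $site;    # skip blank lines
  # Sequential names (00001.txt, 00002.txt, ...), as per assumption 2
  my $file = sprintf '%s/%05d.txt', $outputdir, ++$count;
  # getstore returns the HTTP status code of the request
  my $status = getstore($site, $file);
  warn "Failed to fetch $site (HTTP status $status)\n" unless is_success($status);
}

close $fh;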


 

Assisted Solution

by:alikoank
alikoank earned 83 total points
ID: 11760255
There are already several applications that do this.
Take a look at XMLTV

http://membled.com/work/apps/xmltv/

or Plucker

http://www.plkr.org/

Assisted Solution

by:ahoffmann
ahoffmann earned 83 total points
ID: 11771778
Assuming your URLs are in a file, one per line:

wget -i file-withURLs
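
With wget -i, each saved file is named after the last part of its URL, so a list of similar URLs may produce clashing names. If predictable names matter, one option is a small wrapper loop that gives every URL its own numbered text file. This is only a minimal sketch: the function names fetch_one and fetch_list, the file name urls.txt, the directory name pages, and the retry and timeout values are all assumptions for illustration, not part of the question.

```shell
#!/bin/sh
# Sketch: fetch every URL in a list file into numbered text files.
# All names here (fetch_one, fetch_list, urls.txt, pages) are illustrative.

# Download one URL ($1) to one file ($2).
# -q silences progress output, -O names the output file,
# -t limits retries and -T sets the timeout in seconds (values are guesses).
fetch_one() {
    wget -q -t 2 -T 30 -O "$2" "$1" < /dev/null
}

# Read URLs (one per line) from file $1 and save them under directory $2
# as 00001.txt, 00002.txt, ... Failed URLs are reported on stderr.
fetch_list() {
    list=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    n=0
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue      # skip blank lines
        n=$((n + 1))
        out=$(printf '%s/%05d.txt' "$outdir" "$n")
        if ! fetch_one "$url" "$out"; then
            echo "failed: $url" >&2
            rm -f "$out"               # drop the empty or partial file
        fi
    done < "$list"
}

# Run only when called with both arguments, e.g.: sh fetch.sh urls.txt pages
if [ "$#" -eq 2 ]; then
    fetch_list "$1" "$2"
fi
```

Saved as, say, fetch.sh, it would be run as: sh fetch.sh urls.txt pages. The numbering follows the order of the non-blank lines in the list, so a failed URL leaves a gap in the sequence rather than shifting the later file names.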