jblandro

asked on

Frontpage - "collect" headlines from newspapers

Hi!
I just wonder how to collect the latest updated headlines (the 5 most recent) from some of the newspapers in the country. I use FrontPage 98. Do I have to add a script, or make a database?
Mark Franz

Check out this link, http://www.mgfic.com/getfile.pl

I use a small Perl script that grabs headlines from iSyndicate.com. If all you want to do is insert a few lines of code for headlines, check out www.iSyndicate.com
jblandro

ASKER

I've checked it out. It looks good, but I can't get headlines from my own country, Norway. Can anyone help me with that?
Does Norway have headlines?  ;-)  (just kidding...)

Check with the AP wire for your country, or a national newspaper, for headlines. If they post news, I can swipe it. CNN, BBC, ITN and The Norway Post all post headlines - what do you want to display?

Here are some links;

http://www.dagbladet.no/
http://www.dagogtid.no/

And this one looks like a good one to get links from;

http://www.p4.no/

Let me know if you need help.

Mark
Well, yes - we get a few headlines in Norway (every 5 weeks or so... :) ), and yes, I need help. How do I "collect" them? (You use the word swipe - is that the same thing?) Here are some other URLs I want to collect from:

http://www.vg.no
http://www.nettavisen.no
http://www.ntb.no

Thanks for helping...
Jan-Borge
Can you run Perl scripts?  I am putting together something for you now, give me an hour or so and I will post a URL.

Mark
Wow... 3 posts from a single submit...

Anyway, check out this URL, http://www.mgfic.com/getNorUrl.pl

Let me know if it's what you want.
Yes.... almost :)
But... I want just the headlines, not the text or the pictures. Something like what you see at the bottom of the page you got the headlines from (http://www.vg.no).

For example: I want to make a table with 4-5 columns and add headlines from 4-5 of the Norwegian newspapers. (I think we have that many newspapers in this country :) )

PS! I hope you don't ruin your social or family life over this...

Thanks for responding so far..

Jan-Borge
Hi again.

I just contacted one of the newspapers. Here is what I "want":
http://www.nettavisen.no/include/nyheter/siste25.txt

Here are the 25 latest headlines from this site. But in order to make use of this page, I need a script that collects it every now and then.

But I also want headlines from other papers... arrrg... I think this is a little difficult, but I'm still smiling.

Jan-Borge
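The periodic-collection idea above can be sketched without any scraping at all, since siste25.txt is already plain text. This is only a sketch: it assumes the feed has one headline per line (the real file's format may differ), and the `headlines_to_html` helper name is made up for illustration. Fetching would be done separately (e.g. with Win32::Internet's FetchURL, as in the script later in this thread).

```perl
#!/usr/bin/perl
# Sketch: turn a plain-text headline feed (assumed one headline per
# line, like nettavisen's siste25.txt) into a small HTML list of the
# first N entries. Adjust the parsing if the real feed differs.
use strict;

sub headlines_to_html {
    my ($text, $count) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;   # drop blank lines
    @lines = @lines[0 .. $count - 1] if @lines > $count;
    return "<ul>\n" . join("", map { "<li>$_</li>\n" } @lines) . "</ul>\n";
}

# In practice $feed would come from fetching the siste25.txt URL first.
my $feed = "Headline one\nHeadline two\nHeadline three\n";
print headlines_to_html($feed, 5);
```

Run on a schedule, the output could be saved to a file and pulled into a FrontPage page the same way temp2.txt is used below.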
Check it again... ;-)

http://www.mgfic.com/getNorUrl.pl

We can do this for any site, or a group of sites. I even save the data to a .txt file for later purging: http://www.mgfic.com/temp2.txt  This file can be included with <!--#include file="temp2.txt"-->, and the script can be run in batch mode every 4 hours if you want...

Boy I love Perl!....
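The include Mark mentions would look like this inside a page the web server parses for server-side includes (note the standard SSI directive needs a `#`; the page name here is just an example):

```html
<!-- In a page the server parses for SSI (often a .shtml file): -->
<html>
<body>
<h3>Siste nyheter</h3>
<!--#include file="temp2.txt" -->
</body>
</html>
```

The "batch mode every 4 hours" part would be handled outside the script, e.g. with the NT `at` command or another task scheduler running `perl getNorUrl.pl` on a timer.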
YES!
This is what I was looking for. But... :) when I click on the hyperlinks in getNorUrl.pl I don't get the page - I guess that's a minor issue, though.

How can I now implement this Perl script, and how can I learn more about Perl?

Jan-Borge
ASKER CERTIFIED SOLUTION
Mark Franz
Thank you very much.
Jan-Borge
Yeah, that's a minor issue - you just need to insert the path to the page in the script.

Here is the script;

# GetNorUrl.pl
#
# Fetches a page from a specified server, cuts it down to the headline
# section, saves the result to temp2.txt, and echoes it to the browser.

use Win32::Internet;

print "Content-type: text/html\n\n";

# Fetch the URL from the designated source
$INET = new Win32::Internet();
$htmlfile = $INET->FetchURL("http://www.vg.no/");

# Cut away everything before and after the headline table.  These
# patterns are tied to vg.no's current markup, so they will need
# updating whenever the site changes its layout.
$htmlfile =~ s/<!-- START ANNONSER TOPP.*De siste 20 nyhetene\"><BR>//s;
$htmlfile =~ s/<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0">.*<TABLE CELLPADDING="6" BORDER="0"><TR>//s;
$htmlfile =~ s/<BR CLEAR=\"all\">.*<\/HTML>/<\/body><\/HTML>/s;

# Strip the table-cell wrapper in front of each headline
$htmlfile =~ s/<TD ALIGN="left" VALIGN="top" WIDTH="25%">//g;

# Save the resulting string for later <!--#include--> use
open (TEMP, ">temp2.txt") || die "Can't open temp2.txt: $!";
print TEMP $htmlfile;
close TEMP;

print $htmlfile;

As for learning more about Perl, check out http://www.activestate.com, http://www.cpan.org, or http://www.perl.org  There are hundreds of sites devoted to Perl, and a couple of great books I use are Learning Perl for Win32 and Programming Perl, both from http://www.ora.com  A great online bookstore for these is http://www.bookpool.com - fantastic prices on all tech books.

Enjoy,

Mark