jblandro

asked on

Frontpage - "collect" headlines from newspapers

Hi!
I just wonder how to collect the latest updated headlines (the 5 most recent) from some of the newspapers in the country. I use FrontPage 98. Do I have to add a script, or make a database?
Mark Franz

Check out this link, http://www.mgfic.com/getfile.pl

I use a small Perl script that grabs headlines from iSyndicate.com. If all you want to do is insert a few lines of code for headlines, check out www.iSyndicate.com
jblandro

ASKER

I've checked it out. It looks good, but I can't get headlines from my own country, Norway. Can anyone help me with that?
Does Norway have headlines?  ;-)  (just kidding...)

Check with the AP wire for your country, or a national newspaper, for headlines. If they post news, I can swipe it. CNN, BBC, ITN and The Norway Post all post headlines - what do you want to display?

Here are some links;

http://www.dagbladet.no/
http://www.dagogtid.no/

And this one looks like a good one to get links from;

http://www.p4.no/

Let me know if you need help.

Mark
Well, yes - we get a few headlines in Norway (every 5 weeks or so... :) ), and yes, I need help. How do I "collect" them? (You use the word swipe - is that the same thing?) Here are some other URLs I want to collect from:

http://www.vg.no
http://www.nettavisen.no
http://www.ntb.no

Thanks for helping...
Jan-Borge
Can you run Perl scripts?  I am putting together something for you now, give me an hour or so and I will post a URL.

Mark
Wow... 3 posts from a single submit...

Anyway, check out this URL, http://www.mgfic.com/getNorUrl.pl

Let me know if it's what you want.
Yes.... almost :)
But... I want just the headlines, not the text or the pictures. Something like what you see at the bottom of the page you got the headlines from (http://www.vg.no).

For example: I want to make a table with 4-5 columns and add headlines from 4-5 of the Norwegian newspapers. (I think we have that many newspapers in this country :) )

PS! I hope you don't ruin your social or family life over this...

Thanks for responding so far..

Jan-Borge
Hi again.

I just contacted one of the newspapers. Here is what I "want":
http://www.nettavisen.no/include/nyheter/siste25.txt

Here are the 25 latest headlines from this site. But in order to make use of this page, I need a script that collects it every now and then.

But I also want headlines from other papers... arrrg... I think this is a little difficult, but I'm still smiling.

Jan-Borge
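The periodic-collection idea above can be sketched without any scraping at all, since siste25.txt is already plain text. This is only a sketch: it assumes the feed has one headline per line (the real file's format may differ), and the `headlines_to_html` helper name is made up for illustration. Fetching would be done separately (e.g. with Win32::Internet's FetchURL, as in the script later in this thread).

```perl
#!/usr/bin/perl
# Sketch: turn a plain-text headline feed (assumed one headline per
# line, like nettavisen's siste25.txt) into a small HTML list of the
# first N entries. Adjust the parsing if the real feed differs.
use strict;

sub headlines_to_html {
    my ($text, $count) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;   # drop blank lines
    @lines = @lines[0 .. $count - 1] if @lines > $count;
    return "<ul>\n" . join("", map { "<li>$_</li>\n" } @lines) . "</ul>\n";
}

# In practice $feed would come from fetching the siste25.txt URL first.
my $feed = "Headline one\nHeadline two\nHeadline three\n";
print headlines_to_html($feed, 5);
```

Run on a schedule, the output could be saved to a file and pulled into a FrontPage page the same way temp2.txt is used below.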
Check it again... ;-)

http://www.mgfic.com/getNorUrl.pl

We can do this for any site, or a group of sites. I even save the data to a .txt file for later purging: http://www.mgfic.com/temp2.txt  This file can be included with <!--#include file="temp2.txt"-->, and the script can be run in batch mode every 4 hours if you want...

Boy I love Perl!....
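The include Mark mentions would look like this inside a page the web server parses for server-side includes (note the standard SSI directive needs a `#`; the page name here is just an example):

```html
<!-- In a page the server parses for SSI (often a .shtml file): -->
<html>
<body>
<h3>Siste nyheter</h3>
<!--#include file="temp2.txt" -->
</body>
</html>
```

The "batch mode every 4 hours" part would be handled outside the script, e.g. with the NT `at` command or another task scheduler running `perl getNorUrl.pl` on a timer.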
YES!
This is what I was looking for. But... :) when I click on the hyperlinks in getNorUrl.pl I don't get the page - I guess that's a minor issue, though.

How can I now implement this Perl script, and how can I learn more about Perl?

Jan-Borge
ASKER CERTIFIED SOLUTION
Mark Franz
Thank you very much.
Jan-Borge
Yeah, that's a minor issue - you just need to insert the path to the page in the script.

Here is the script;

# GetNorUrl.pl
#
# Fetches a page from a specified server, cuts it down to the headline
# section, saves the result to temp2.txt, and echoes it to the browser.

use Win32::Internet;

print "Content-type: text/html\n\n";

# Fetch the URL from the designated source
$INET = new Win32::Internet();
$htmlfile = $INET->FetchURL("http://www.vg.no/");

# Cut away everything before and after the headline table.  These
# patterns are tied to vg.no's current markup, so they will need
# updating whenever the site changes its layout.
$htmlfile =~ s/<!-- START ANNONSER TOPP.*De siste 20 nyhetene\"><BR>//s;
$htmlfile =~ s/<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0">.*<TABLE CELLPADDING="6" BORDER="0"><TR>//s;
$htmlfile =~ s/<BR CLEAR=\"all\">.*<\/HTML>/<\/body><\/HTML>/s;

# Strip the table-cell wrapper in front of each headline
$htmlfile =~ s/<TD ALIGN="left" VALIGN="top" WIDTH="25%">//g;

# Save the resulting string for later <!--#include--> use
open (TEMP, ">temp2.txt") || die "Can't open temp2.txt: $!";
print TEMP $htmlfile;
close TEMP;

print $htmlfile;

As for learning more about Perl, check out http://www.activestate.com, http://www.cpan.org, or http://www.perl.org  There are hundreds of sites devoted to Perl, and a couple of great books I use are Learning Perl for Win32 and Programming Perl, both from http://www.ora.com  A great online bookstore for these is http://www.bookpool.com - fantastic prices on all tech books.

Enjoy,

Mark