Solved

Help: Making an automatic link maker

Posted on 2004-03-28
Last Modified: 2010-04-17
OK, here's what I'm trying to do. I have thousands of .htm web pages in different directories and I have to link to them on main pages. I'm trying to make a Perl script to streamline my work.

Example: the http://www.ticketstogo.com/venues/ directory has 3000+ pages and I have to manually add links to
http://www.ticketstogo.com/venues/states/index.htm one by one, under the correct state.
On each page in the venue section I have the state that the page/venue belongs to, for example <!-- Texas -->, on the first line of every page. So I need the script to take that data and make links on http://www.ticketstogo.com/venues/states/ under the right state.

And for the concerts section I would like it to scan the http://www.ticketstogo.com/concerts/ directory. At the top of every page, on the first line, I have the first letter of the artist name, for example <!-- F --->. The script should take that and insert a link to that page under the right letter on the page http://www.ticketstogo.com/concerts/

There should also be an admin panel where I can add new directories/URLs, pick by state or by first letter, and then give the location of the main page where the links will be added.

So what I need is a Perl program that can scan the directory and make links to the pages on a main page by seeing <!-- STATE NAME --> or <!--- LETTER --> and inserting the link, one after another, until it's done, saving me hours in the process.
Example of a script that scans for a <!-- bla bla bla --> and inserts a link: http://www.ccds.addr.com/wrc/links/addalink.htm
My script ended in ruins. I read half of Perl: A Beginner's Guide and thought I was a Perl god... I feel painfully crushed by my attempt, j/k.

I know there are better scripting languages out there for handling thousands of .htm files, but right now I need this to work until our ASP.NET venture is ready, which could be months, so please just help with this issue.
Question by:Caiapfas
 
LVL 2

Author Comment

by:Caiapfas
ID: 10704792
In effect, the only thing this Perl script would do is scan a directory, then scan a page (the main page) for matching <!-- blablabla, blabla --> tags, and then insert a link to each page under the right <!-- -->.

Example: scanning the concerts/ folder finds Greenday. greenday.htm has <!-- G, Greenday -->, so after it's done scanning the entire directory and making a flat-file database, it scans the main links page, finds the <!-- G --> marker, and inserts a link to Greenday (http://www.mysite.com/concerts/greenday.htm) there. It gets the title for the link from the second part of the hidden comment at the top of the page, <!-- G, Greenday -->.
 
LVL 11

Expert Comment

by:lbertacco
ID: 10712681
This is a small Perl script to do it:

#!/usr/bin/perl

$dir="/win/tmp";
$wdir="ven";
$indextpl = "/win/indextpl.htm";
$indexhtm = "/win/index.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
  $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
open(STDIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(STDOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<>) {
  if(/^\<\!-- (.*) --\>$/) {
    print map {"<A href=\"$ven{$1}{$_}\">$_</A><BR>\n"} sort keys %{$ven{$1}};
  } else {
    print;
  }
}

$dir is the directory with the html pages (e.g. c:\\concert)
$wdir is the path to be inserted in the URLs (e.g. /whatever/concerts)
$indextpl is the index template file (e.g. C:\\test\\index.tpl)
$indexhtm is the final index file (e.g. c:\\test\\index.htm)
The script has been tested under Unix. For Win32 you might need to change the slashes in paths from / to \\, and also the perl path in the first line.
 
LVL 2

Author Comment

by:Caiapfas
ID: 10717880
lbertacco,

Does it only match lines like this example?
<!-- anything, right here -->

Also, is this the flat-file database? $indextpl is the index template file (e.g. C:\\test\\index.tpl)

And can it be a txt file?
 
LVL 2

Author Comment

by:Caiapfas
ID: 10717901
And if it does match <!-- whatever, to whatever -->, I can use it for many different applications, right?

Thank you,
Caiapfas
 
LVL 11

Expert Comment

by:lbertacco
ID: 10721065
It expects the first line of each htm file to be like
<!-- index key, full name -->

Then it inserts all the "full names" associated with a given "index key" into the template wherever it finds a line exactly like
<!-- index key -->
in the template.
The matching of the index key is case sensitive (this can be changed), and there must be nothing else on the line, not even extra blanks (this can be improved).

It doesn't store any flat-file index; it just keeps it in RAM. All files are expected to be text files.
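
One possible way to relax the matching is sketched below. It is only an illustration, not part of the script above: parse_marker and parse_template_key are hypothetical helper names, and the pattern assumes that extra blanks, a trailing CR/LF, and closers written as "--->" should all be tolerated. Keys are lower-cased so that lookups become case insensitive.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): parse a first-line
# marker such as "<!-- G, Greenday -->" into (index key, full name).
# Tolerates extra blanks, a trailing CR/LF, and "--->" style closers.
sub parse_marker {
    my ($line) = @_;
    return unless defined $line;
    if ($line =~ /^\s*<!-+\s*([^,]+?)\s*,\s*(.+?)\s*-+>\s*$/) {
        return (lc $1, $2);    # lower-cased key, name as written
    }
    return;
}

# Hypothetical helper: parse a template marker such as "<!-- G -->"
# and return the lower-cased index key, or undef if the line is not a marker.
sub parse_template_key {
    my ($line) = @_;
    return unless defined $line;
    return lc $1 if $line =~ /^\s*<!-+\s*([^,]+?)\s*-+>\s*$/;
    return;
}

# Example: a sloppy marker with extra blanks and a Windows line ending.
my ($key, $name) = parse_marker("<!--  G ,  Greenday  --->\r\n");
print "$key => $name\n";
```

As with the original pattern, any ordinary one-line HTML comment in the template would also be treated as an index key, so the template should keep other comments off lines of their own.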
 
LVL 2

Author Comment

by:Caiapfas
ID: 10726177
Please explain: all files are expected to be text files?

Also, I tried making it case insensitive but failed. Any help?

 
LVL 11

Expert Comment

by:lbertacco
ID: 10726432
Text, that is, files you can open in e.g. Notepad. This includes html files of course.
To make it case insensitive, change the line
$ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
to
$ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;

and line
print map {"<A href=\"$ven{$1}{$_}\">$_</A><BR>\n"} sort keys %{$ven{$1}};
to
print map {"<A href=\"$ven{lc($1)}{$_}\">$_</A><BR>\n"} sort keys %{$ven{lc($1)}};

 
LVL 2

Author Comment

by:Caiapfas
ID: 10726597
I'm about to crank it up, i have my fingers crossed...
 
LVL 2

Author Comment

by:Caiapfas
ID: 10727105
lbertacco,

I'm getting a 500 error. I'm running this on a remote server.

$dir="/usr7/home/ttgo/public_html/venues";
$wdir="http://www.ticketstogo.com/veunes/";
$indextpl = "/usr7/home/ttgo/public_html/venues/states/index22.htm";
$indexhtm = "/usr7/home/ttgo/public_html/venues/states//index22.htm";


????
 
LVL 2

Author Comment

by:Caiapfas
ID: 10727132
Also, I don't understand:

# $indextpl is the index template file (e.g. C:\\test\\index.tpl) <<<< what template
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)

I chmod'ed the script to 777,
the directory to 777,
and the index file to 777.
 
LVL 2

Author Comment

by:Caiapfas
ID: 10727205
It's erasing the page...


the index file
http://www.ticketstogo.com/venues/states/index22.htm


the script
http://www.ticketstogo.com/cgi-bin/erase.adder.pl

also, what does the script do with the ones it doesn't find a home for?
 
LVL 2

Author Comment

by:Caiapfas
ID: 10727669
Here is the new template. I put half of the <!-- --> markers on the page and the other half in the html.


http://www.ttgo.addr.com/venues/states/tempindex.htm
 
LVL 11

Accepted Solution

by:
lbertacco earned 500 total points
ID: 10729891
Caiapfas,
Well, you should have tested the script locally before running it on real pages, right? Also, you didn't mention that you want it to run as a CGI. If you run it locally on your PC you can verify that it works and get meaningful error descriptions in case of failures, instead of just a 500 error page (which says nothing).

The script does this:
1) looks at all the files in directory $dir
2) for each file with an .htm extension, reads the first line
3) if the line is like <!-- indexkey, venuename -->, stores this info in a multidimensional hash
then
4) reads an index template file $indextpl (an html file that is supposed to have lines like <!-- indexkey --> here and there), line by line
5) if the line doesn't match the pattern <!-- indexkey -->, just writes the line unmodified to file $indexhtm
6) otherwise removes the line <!-- indexkey --> and writes to $indexhtm a list of links to the venues associated with that indexkey
7) repeats with the next line until end of file

The script just ignores the venues for which it doesn't find a home.
$indextpl is the original index page with the <!-- indexkey --> lines. The script reads this file.
$indexhtm is the new index page with the <!-- indexkey --> lines removed and replaced with the links. The script writes this file.
Of course if you run the script without a good $indextpl, it will just erase $indexhtm.
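
The configuration posted earlier in the thread points $indextpl and $indexhtm at the same file (the two paths differ only by a doubled slash), and opening the output for writing truncates it before the template has been read. A guard along the following lines could be run before the two open calls. This is a sketch only: check_paths is a hypothetical helper, the paths in the example are placeholders, and comparing canonicalised path strings will not catch symlinks or hard links to the same file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not in the original script): refuse to continue,
# before anything is truncated, if the template and the output resolve to
# the same path or if the template is missing.
sub check_paths {
    my ($indextpl, $indexhtm) = @_;
    # canonpath collapses doubled slashes, so "a//b" and "a/b" compare equal.
    my $tpl = File::Spec->canonpath(File::Spec->rel2abs($indextpl));
    my $out = File::Spec->canonpath(File::Spec->rel2abs($indexhtm));
    die "Template and output are the same file: $tpl\n" if $tpl eq $out;
    die "Template $indextpl does not exist\n" unless -e $indextpl;
    return 1;
}

# Example with placeholder paths that differ only by a doubled slash.
my $ok = eval { check_paths("/some/dir/index22.htm", "/some/dir//index22.htm") };
print "refused: $@" unless $ok;
```

Keeping the template under its own name (for example index.tpl next to index.htm) avoids the problem altogether.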
 
LVL 2

Author Comment

by:Caiapfas
ID: 10732825
holy crap o roo, lol..
took a 2 - 4 day job and made it 1min....
thank you..
 
LVL 2

Author Comment

by:Caiapfas
ID: 10733981

I'm now trying to make the script display how many links it made for each category.
Example: it found and linked 300 <!-- whatever, Texas -->

After it's done, it should make a text file that says:

Index/Linked :

300 : Texas
200 : whatever
1000 : whatever also

I've posted that as a follow-up question here:
http://www.experts-exchange.com/Programming/Q_20940315.html
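
A minimal sketch of such a report, assuming a hash shaped like %ven in the script above (index key => { full name => URL }). link_counts is a hypothetical helper and the sample data is made up for illustration. Note that it counts every page collected under a key, whether or not the template had a matching marker for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): given a reference to a
# hash shaped like %ven, return report lines "COUNT : KEY", largest first.
sub link_counts {
    my ($ven) = @_;
    my %count;
    # keys in scalar context yields the number of entries under each key.
    $count{$_} = keys %{ $ven->{$_} } for keys %$ven;
    return map  { "$count{$_} : $_" }
           sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;
}

# Sample data for illustration only.
my %ven = (
    Texas => { 'Venue A' => 'ven/a.htm', 'Venue B' => 'ven/b.htm' },
    Ohio  => { 'Venue C' => 'ven/c.htm' },
);

print "Index/Linked :\n\n";
print "$_\n" for link_counts(\%ven);
```

In the original script STDOUT is reopened onto $indexhtm, so there the report lines would need to be printed to a separate filehandle opened on the text file rather than to STDOUT.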