We help IT Professionals succeed at work.

Check out our new AWS podcast with Certified Expert, Phil Phillips! Listen to "How to Execute a Seamless AWS Migration" on EE or on your favorite podcast platform. Listen Now

x

help : Making a automatic link maker

Caiapfas
Caiapfas asked
on
Medium Priority
375 Views
Last Modified: 2010-04-17
Ok, heres what I'm trying to do, I have thousands of htm, webpages in diffrent directories and I have to link to them on main pages...I'm trying to make a PERL script to steam line my work..

Example http://www.ticketstogo.com/venues/  directory has 3000+ pages and I have to manually add links to
http://www.ticketstogo.com/venues/states/index.htm    <<< one by one , under the correct state.
One each page in the venue section I have the state that that page/venue belongs in example : <!-- Texas --> on the first line of every page. so I need the script to take that data and make links on the http://www.ticketstogo.com/venues/states/ under the right state.

And for the concerts section I would like it to scan the http://www.ticketstogo.com/concerts/ directory and at the top of everypage on the first line I have the first letter of the artist name ..example : <!-- F --->, then take that and input a link to that page under the right letter on page : http://www.ticketstogo.com/concerts/

And there should be a Admin panel where I can add new directories/urls and pick, by state, by first letter. then the location of the links/main page , where the links will be added.

So what I need is a PERL program that can scan the directory and make links to the the pages on a main page, by seeing <!-- "STATE NAME--> or <!--- "LETTER"--> and inserting the link, one after another, on and on, untill its done, saving me hours in the progress...
example of a script that scans and see a <!-- bla bla bla --> and inserts a link..http://www.ccds.addr.com/wrc/links/addalink.htm
My script ended in ruins, I read half of my Perl : A Beginners Guide and I think I'm a perl god...I feel painfully crushed by my attempt..j/k

I know there are better scripting laguages out there, meaning having thousands of htm files, but right now i need this to work , untill our asp.net venture is rdy..which could be months, so please just help with this issues ..
Comment
Watch Question

Author

Commented:
in effect the only thing this PERL script would do is scan a directory, and then a page(main page) for matching <!-- blablabla, blabla --> tags then insert a link to the page , under the right <!-- -->.

Example , scaning concert/ <folder finds Greenday, greenday.htm has <!-- G, Greenday --> so after its done scanning the entire directory and making a flat file database, it now scans the page(main links page) and finds G's and there is the <!-- G --> so it inserts a link the greenday, Greenday (http://www.mysite.com/concerts/greenday.htm) and it gets the title for the link from the <!-- G, Greenday --> the second part of the hidden comment at the top of the page.
This a small perl script to do it:

#!/usr/bin/perl

$dir="/win/tmp";
$wdir="ven";
$indextpl = "/win/indextpl.htm";
$indexhtm = "/win/index.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
  $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
open(STDIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(STDOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<>) {
  if(/^\<\!-- (.*) --\>$/) {
    print map {"<A href=\"$ven{$1}{$_}\">$_</A><BR>\n"} sort keys %{$ven{$1}};
  } else {
    print;
  }
}

$dir is the directory with the html pages (e.g. c:\\concert)
$wdir is a path the path to be inserted in the URLs (e.g. /whatever/concerts)
$indextpl is the index template file (e.g. C:\\test\\index.tpl)
$indexhtm is the final index file (e.g. c:\\test\\index.htm)
The script has been tested under unix. For WIn32 you might need to change the slashes in paths from / to \\, and also the perl path in the first line.

Author

Commented:
lbertacco,

does it only match
example
<-- anything, right here -->

also is this the flat file database? $indextpl is the index template file (e.g. C:\\test\\index.tpl)

and can it be a txt file?

Author

Commented:
and if it does match the <!-- whatever, to whatever --> I can use it for many diffrent app.
right?

Thank you,
Caiapfas
It expects the first line of each htm file to be like
<!-- index key, full name -->

then it inserts all the "full names" asscoiated with a given "index key" in the template whenever it finds a line (exactly like)
<!-- index key -->
in the template
the matching of index key is case sensitive (this can be changed) and there must be nothing else in the line (not even extra blanks) (this can be improved).

It doesn't store any flat file index, just keeps it in RAM. All files are expected to be text files.

Author

Commented:
explain..

All files are expected to be text files?

also, i tired making it non case sensitive  , but failed any help

text, that is you can e.g.open them in notepad. This includes html files of course.
To make it case insensitive, change line
$ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
to
$ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;

and line
print map {"<A href=\"$ven{$1}{$_}\">$_</A><BR>\n"} sort keys %{$ven{$1}};
to
print map {"<A href=\"$ven{lc($1)}{$_}\">$_</A><BR>\n"} sort keys %{$ven{lc($1)}};

Author

Commented:
I'm about to crank it up, i have my fingers crossed...

Author

Commented:
lbertacco,

I'm getting a 500 error, I'm running this one a server, remote

$dir="/usr7/home/ttgo/public_html/venues";
$wdir="http://www.ticketstogo.com/veunes/";
$indextpl = "/usr7/home/ttgo/public_html/venues/states/index22.htm";
$indexhtm = "/usr7/home/ttgo/public_html/venues/states//index22.htm";


????

Author

Commented:
also,

i don't understand


# $indextpl is the index template file (e.g. C:\\test\\index.tpl) <<<< what template
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)

I chmod the script to 777
and the directory 777
and the index file 777

Author

Commented:
its erasing the page...


the index file
http://www.ticketstogo.com/venues/states/index22.htm


the script
http://www.ticketstogo.com/cgi-bin/erase.adder.pl

also, what does the script do with the ones it doesn't find a home for?

Author

Commented:
here is the new template...i made half of the <!-- --> on the page and the other half in the html


http://www.ttgo.addr.com/venues/states/tempindex.htm
Unlock this solution and get a sample of our free trial.
(No credit card required)
UNLOCK SOLUTION

Author

Commented:
holy crap o roo, lol..
took a 2 - 4 day job and made it 1min....
thank you..

Author

Commented:

I'm trying to make the script , display how many links for each catogroy now..
example : it found and linked 300 <!-- whatever, Texas -->

Show after its done it makes a text file and says...

Index/Linked :

300 : Texas
200 : whatever
1000 : whatever also

here
https://www.experts-exchange.com/Programming/Q_20940315.html
Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a sample view!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.