Output and page creation from HTML pages and a template (Perl; wouldn't mind making it in JavaScript)

OK, this Perl script scans a directory, looking for <!-- balbal, yad yad --> on the first line of every .htm file. It then reads a template, matches the <!-- balbal, yad yad --> comments in it, and inserts links at those points. Finally it displays the total for each <!-- balbal, yad yad --> in the form blabla : 1000

What I would like it to do now is display the GRAND total of all the links, and how many it had to dump because they had no home, after the display of how many are in each category.
Example:

Total Links : 2474
Total lost/dumped : 123

The dumped links should also be written to a text file in link format, <a href="http://www.mysite.com/whatever/me.htm">example</a>,
for easy cut and paste.

And the last option to perfect this script:
I need a prompt that asks: (Do you want to use the template to make a page for (E)ach category or one (M)ain index page?)
The main index option is the way it is designed now: one page with all the links on it.
The category option would make a page for each <!-- new category, blabla -->, so in effect it takes the template and inserts the links after <!-- START --> instead of matching. The reason I would like this option is that when I have 1000+ links the .htm file is pushing 1 MB, so I'd like a page for each category, whether it is venues, artists, etc.

------ add.pl ---------

#!/usr/bin/perl

# $dir is the directory with the html pages (e.g. c:\\concert)
# $wdir is the path to be inserted in the URLs (e.g. /whatever/concerts)
# $indextpl is the index template file (e.g. C:\\test\\index.tpl)
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)
# The script has been tested under Unix. For Win32 you might need to change the slashes in paths from / to \\, and also the perl path in the first
# line.

$dir="C:\\web\\ticketstogo.com\\venues";
$wdir="http://www.ticketstogo.com/venues";
$indextpl = "C:\\web\\ticketstogo.com\\venues\\states\\tempindex.htm";
$indexhtm = "C:\\web\\ticketstogo.com\\venues\\states\\aindex.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
 # $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
    $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<IDXIN>) {
  if(/^\<\!-- (.*) --\>$/) {
    my @vkeys = sort keys %{$ven{lc($1)}};
    print "$1: " . @vkeys . "\n";
    print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_ event tickets</A><BR>\n"} @vkeys;
  } else {
    print IDXOUT;
  }
}
close(IDXIN);
close(IDXOUT);
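
To make the first-line convention concrete, here is a minimal sketch of how a line such as <!-- category, name --> is split into a lowercased category and a link name, using the same pattern as the script. The helper name parse_first_line and the sample file names are assumptions for illustration only; they are not part of add.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse a first line of the form "<!-- category, name -->".
# Returns (lowercased category, name), or an empty list if the line does not match.
# parse_first_line is a hypothetical helper, not part of the original script.
sub parse_first_line {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^<!-- (.*), (.*) -->$/;
    return (lc $1, $2);
}

# Hypothetical sample data: file name => first line of that file
my %samples = (
    'fairbanks.htm' => "<!-- Alaska, Fairbanks Arena -->\n",
    'austin.htm'    => "<!-- Texas, Austin Hall -->\n",
    'plain.htm'     => "<html>\n",
);

# Same shape as %ven in the script: category => { name => URL }
my %ven;
for my $file (sort keys %samples) {
    my ($cat, $name) = parse_first_line($samples{$file}) or next;  # skip files without the comment
    $ven{$cat}{$name} = "venues/$file";
}

print join(", ", sort keys %ven), "\n";
```

Files whose first line does not match are simply skipped, which is also what the script does.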
Caiapfas Asked:
 
lbertacco Commented:
I believed you were going to open a new question; anyway, here it is. Not much tested.
Run with the command
<scriptname> M
for the old behaviour
and
<scriptname> E
for the new "each category" behaviour.

In the latter case, two new variables define the file and path:
$cattpl is the template for each category, with the string <!-- START --> inside
$catdir is the path where category files should be created

#!/usr/bin/perl

$dir="/win/tmp";
$wdir="ven";
$indextpl = "/win/indextpl.htm";
$indexhtm = "/win/index.htm";
$dumphtm = "/win/dump.htm";
$cattpl = "/win/cat.tpl";
$catdir = "/win/tmp/o";

if($#ARGV != 0 || $ARGV[0] !~ /^[me]$/i) {
  print "Usage: $0 {m|e}\n";
  exit;
}

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
  $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
my $tot=0;
if(lc($ARGV[0]) eq "m") {
  open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
  open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
  while(<IDXIN>) {
    if(/^\<\!-- (.*) --\>$/) {
      my @vkeys = sort keys %{$ven{lc($1)}};
      print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_</A><BR>\n"} @vkeys;
      $tot += @vkeys;
      print "\L$1: " . @vkeys . "\n";
      delete($ven{lc($1)});
    } else {
      print IDXOUT;
    }
  }
  close(IDXIN);
  close(IDXOUT);
  print "Total: $tot\n";

  # dump leftovers
  print "\nOrphans:\n";
  $tot=0;
  open(ORPOUT, ">$dumphtm") or die "Can't open $dumphtm for writing: $!";
  foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    print ORPOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
  }
  close(ORPOUT);
  print "Total: $tot\n";

} else {
      
  open(CATIN, "<$cattpl") or die "Can't open $cattpl for reading: $!";
  {
    local $/;
    $catfile = <CATIN>;
  }
  close(CATIN);
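  # Split the template around <!-- START -->; stop if the marker is missing
  my ($cathead, $cattail) = $catfile =~ /^(.*)\<\!-- START --\>(.*)$/si
    or die "No <!-- START --> marker found in $cattpl";
  foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    open(CATOUT, ">$catdir/$k.htm") or die "Can't open $catdir/$k.htm for writing: $!";
    print CATOUT $cathead;
    print CATOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    print CATOUT $cattail;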
    close(CATOUT);
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
  }
  print "Total: $tot\n";
}
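
The original request was for a prompt rather than a command-line argument. A minimal sketch of that, assuming the answer is read from standard input, could look like the following. The names parse_mode and ask_mode are hypothetical, and this has not been wired into the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a typed answer to a mode: 'e' (one page per category) or 'm' (one main index).
# Returns undef for anything else. parse_mode is a hypothetical helper name.
sub parse_mode {
    my ($answer) = @_;
    return undef unless defined $answer;
    $answer =~ s/^\s+|\s+$//g;              # trim whitespace, including the newline
    return $answer =~ /^([me])$/i ? lc $1 : undef;
}

# Keep asking on the given filehandle until the answer is valid or input ends.
sub ask_mode {
    my ($in) = @_;
    while (1) {
        print "Make a page for (E)ach category or one (M)ain index page? ";
        my $line = <$in>;
        return undef unless defined $line;  # end of input
        my $mode = parse_mode($line);
        return $mode if defined $mode;
        print "Please answer E or M.\n";
    }
}

# Only prompt when run from a terminal, so the subs can also be loaded elsewhere.
if (-t STDIN) {
    my $mode = ask_mode(\*STDIN);
    print "Selected mode: $mode\n" if defined $mode;
}
```

The returned 'm' or 'e' could then take the place of lc($ARGV[0]) in the test that chooses between the main index and the per-category pages.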
 
Caiapfas (Author) Commented:
Oh yeah, could we do this in JavaScript? I would prefer it; I'm hooked on JavaScript.
 
lbertacco Commented:
This will handle the dump thing:

#!/usr/bin/perl

# $dir is the directory with the html pages (e.g. c:\\concert)
# $wdir is the path to be inserted in the URLs (e.g. /whatever/concerts)
# $indextpl is the index template file (e.g. C:\\test\\index.tpl)
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)
# $dumphtm is the output file containing links to pages not inserted in the index (e.g. c:\\test\\dump.htm)

# The script has been tested under Unix. For Win32 you might need to change the slashes in paths from / to \\, and also the perl path in the first
# line.

$dir="C:\\web\\ticketstogo.com\\venues";
$wdir="http://www.ticketstogo.com/venues";
$indextpl = "C:\\web\\ticketstogo.com\\venues\\states\\tempindex.htm";
$indexhtm = "C:\\web\\ticketstogo.com\\venues\\states\\aindex.htm";
$dumphtm = "C:\\web\\ticketstogo.com\\venues\\states\\dump.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
 # $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
    $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
my $tot=0;
open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<IDXIN>) {
  if(/^\<\!-- (.*) --\>$/) {
    my @vkeys = sort keys %{$ven{lc($1)}};
    print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_ event tickets</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$1: " . @vkeys . "\n";
    delete($ven{$1});
  } else {
    print IDXOUT;
  }
}
close(IDXIN);
close(IDXOUT);
print "Total linked: $tot\n";

# dump leftovers
print "\nOrphans:\n";
$tot=0;
open(ORPOUT, ">$dumphtm") or die "Can't open $dumphtm for writing: $!";
foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    print ORPOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
}
close(ORPOUT);
print "Total dumped: $tot\n";


-----
for the "each category index page" , I think it's better to just make a (slightly different) separate script. I'll try to do that later.
 
Caiapfas (Author) Commented:
OK, the dump is outputting the good linked pages as well, so I don't know which are good and which are bad. Example:

Total linked : 2993
Total dumped : 3209


There are only 3209 pages?

It's mistaking everything as orphaned.
 
lbertacco Commented:
You are right. Change the line
delete($ven{$1});
to
delete($ven{lc($1)});
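
The reason is that hash keys are case-sensitive and the script stores every category under lc($1). A small sketch with made-up data shows the difference. It assumes the template spells the category with a capital letter, which would explain why nothing was removed and every page ended up in the dump.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data in the same shape as %ven: category => { name => URL }.
# Categories are stored lowercased, as the script does with lc($1).
my %ven = ( alaska => { 'Fairbanks Arena' => 'venues/fairbanks.htm' } );
my $category = 'Alaska';              # as captured from a template line (assumed spelling)

delete $ven{$category};               # no key "Alaska": nothing is removed
my $left_after_wrong = keys %ven;     # still 1, so the category is later dumped as an orphan

delete $ven{ lc $category };          # matches the stored key "alaska"
my $left_after_fix = keys %ven;       # 0

print "after wrong delete: $left_after_wrong, after fixed delete: $left_after_fix\n";
```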
 
Caiapfas (Author) Commented:
OK, perfect.
How would I go about adding which category each link belongs to? Right now dump.htm is
just a long list of URLs/links, but before each link I'd like the category it belongs to, for recognition.
Example:

Alaska << this is the category - bal bal << this is the link
Alaska - lala bala
Alaska - laoal
Texas - meme
Alabama - lolhe
Spain - youto


for the "each category index page" , I think it's better to just make a (slightly different) separate script. I'll try to do that later. <<< Wouldn't it be easier to add it to this script? I have been working on it but have made very little progress, unless you call errors progress... lol

 
lbertacco Commented:
add this line after the "foreach" line:
print ORPOUT "<HR><P>\nOrphaned links under category -- <BIG><B> $k </B></BIG> --</P>\n";
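
If the category is wanted in front of every link, as in the "Alaska - bal bal" example, rather than once as a heading, a rough sketch could build each line like this. The sub name orphan_lines and the sample data are assumptions; in the script the returned lines would be printed to ORPOUT.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one "Category - link" line per orphaned page.
# $ven is a reference to a hash shaped like %ven: category => { name => URL }.
# orphan_lines is a hypothetical helper name.
sub orphan_lines {
    my ($ven) = @_;
    my @lines;
    for my $k (sort keys %$ven) {
        for my $name (sort keys %{ $ven->{$k} }) {
            # \u capitalises the first letter of the (lowercased) category
            push @lines, "\u$k - <A href=\"$ven->{$k}{$name}\">$name</A><BR>\n";
        }
    }
    return @lines;
}

# Hypothetical leftover data
my %ven = (
    alaska => { 'lala bala' => 'venues/lala.htm', 'laoal' => 'venues/laoal.htm' },
    texas  => { 'meme'      => 'venues/meme.htm' },
);

print orphan_lines(\%ven);
```

Because the categories were lowercased when they were stored, the capital letter here is only cosmetic, and only the first word of a multi-word category is capitalised.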
 
Caiapfas (Author) Commented:
lbertacco,

Any luck on adding the new feature?
 
Caiapfas (Author) Commented:
Thanks, I opened a new question. If possible I would like it to do a find and replace when run under option E.


http://www.experts-exchange.com/Programming/Q_20944346.html