Solved

Output and page creation from HTML pages and a template (Perl; a JavaScript version would also be welcome)

Posted on 2004-04-02
Last Modified: 2011-09-20
OK, this Perl script scans a directory, looking for a comment of the form <!-- blabla, yada yada --> on the first line of every .htm file. It then reads a template, matches the category comments in it against the ones found in the files, and inserts the links there. Finally it displays the total for each category, e.g. blabla: 1000

What I would like it to do now is display the GRAND total of all the links, and how many it had to dump because they had no home, after the display of how many are in each category.
Example:

Total Links : 2474
Total lost/dumped : 123

The dumped links should go into a text file in link format, e.g. <a href="http://www.mysite.com/whatever/me.htm">example</a>,
for easy cut and paste.

And the last option to perfect this script:
I need a prompt that asks ( Do you want to use the template to make a page for (E)ach category or (M) one main index page? )
The idea: main index mode is the way it is designed now, one page with all the links on it.
Category mode would make a page for each <!-- new category, blabla -->, so in effect it takes the template and inserts the links after <!-- START --> instead of matching. The reason I would like this option is that when I have 1000+ links the .htm file pushes 1 MB, so I'd like a page for each category, be it venues, artists, etc.

------ add.pl ---------

#!/usr/bin/perl

# $dir is the directory with the html pages (e.g. c:\\concert)
# $wdir is the path to be inserted in the URLs (e.g. /whatever/concerts)
# $indextpl is the index template file (e.g. C:\\test\\index.tpl)
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)
# The script has been tested under Unix. For Win32 you might need to change the slashes in paths from / to \\, and also the perl path in the first
# line.

$dir="C:\\web\\ticketstogo.com\\venues";
$wdir="http://www.ticketstogo.com/venues";
$indextpl = "C:\\web\\ticketstogo.com\\venues\\states\\tempindex.htm";
$indexhtm = "C:\\web\\ticketstogo.com\\venues\\states\\aindex.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
 # $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
    $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<IDXIN>) {
  if(/^\<\!-- (.*) --\>$/) {
    my @vkeys = sort keys %{$ven{lc($1)}};
    print "$1: " . @vkeys . "\n";
    print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_ event tickets</A><BR>\n"} @vkeys;
  } else {
    print IDXOUT;
  }
}
close(IDXIN);
close(IDXOUT);
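
To illustrate how the read loop fills %ven, here is a minimal sketch that runs the same pattern over made-up first lines instead of real files. The file names, venue titles and the /venues prefix are assumptions for illustration only; they do not come from the real site.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical first lines, as they might appear at the top of three .htm files.
my %first_line = (
    'a.htm' => "<!-- Alaska, Some Arena -->\n",
    'b.htm' => "<!-- ALASKA, Other Hall -->\n",
    'c.htm' => "<html>\n",    # no marker comment, so this file is skipped
);

my %ven;
for my $file (sort keys %first_line) {
    # Same pattern as add.pl: capture "category, title" from the comment.
    if ($first_line{$file} =~ /^<!-- (.*), (.*) -->$/) {
        # The category key is lower-cased, so Alaska and ALASKA share one entry.
        $ven{ lc($1) }{$2} = "/venues/$file";
    }
}

# Print the per-category counts, like the totals the script displays.
for my $cat (sort keys %ven) {
    print "$cat: ", scalar(keys %{ $ven{$cat} }), "\n";
}
```

Both marker lines end up under the single key alaska, which is why every later lookup in the script also has to go through lc().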
Question by:Caiapfas

Author Comment

by:Caiapfas
ID: 10741611
Oh yeah, could we do this in JavaScript? I would prefer it; I'm hooked on JavaScript.

Expert Comment

by:lbertacco
ID: 10741808
This will handle the dump thing:

#!/usr/bin/perl

# $dir is the directory with the html pages (e.g. c:\\concert)
# $wdir is the path to be inserted in the URLs (e.g. /whatever/concerts)
# $indextpl is the index template file (e.g. C:\\test\\index.tpl)
# $indexhtm is the final index file (e.g. c:\\test\\index.htm)
# $dumphtm is the output file containing links to pages not inserted in the index (e.g. c:\\test\\dump.htm)

# The script has been tested under Unix. For Win32 you might need to change the slashes in paths from / to \\, and also the perl path in the first
# line.

$dir="C:\\web\\ticketstogo.com\\venues";
$wdir="http://www.ticketstogo.com/venues";
$indextpl = "C:\\web\\ticketstogo.com\\venues\\states\\tempindex.htm";
$indexhtm = "C:\\web\\ticketstogo.com\\venues\\states\\aindex.htm";
$dumphtm = "C:\\web\\ticketstogo.com\\venues\\states\\dump.htm";

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
 # $ven{$1}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
    $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
my $tot=0;
open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
while(<IDXIN>) {
  if(/^\<\!-- (.*) --\>$/) {
    my @vkeys = sort keys %{$ven{lc($1)}};
    print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_ event tickets</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$1: " . @vkeys . "\n";
    delete($ven{$1});
  } else {
    print IDXOUT;
  }
}
close(IDXIN);
close(IDXOUT);
print "Total linked: $tot\n";

# dump leftovers
print "\nOrphans:\n";
$tot=0;
open(ORPOUT, ">$dumphtm") or die "Can't open $dumphtm for writing: $!";
foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    print ORPOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
}
close(ORPOUT);
print "Total dumped: $tot\n";


-----
For the "each category" index page, I think it's better to just make a (slightly different) separate script. I'll try to do that later.

Author Comment

by:Caiapfas
ID: 10742093
OK, the dump is outputting the good (linked) pages too, so I don't know which are good and which are bad. Example:

Total linked : 2993
Total dumped : 3209


There are only 3209 pages..?

It's mistaking everything as orphaned.

Expert Comment

by:lbertacco
ID: 10742286
You are right. Change the line
delete($ven{$1});
to
delete($ven{lc($1)});
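
A minimal sketch of why the case matters here, assuming a single made-up entry: the read loop stores categories under lower-cased keys, so deleting with the template's original capitalisation silently removes nothing and the category later shows up as an orphan.

```perl
use strict;
use warnings;

# Keys are stored lower-cased, as in the read loop of the script.
# The entry itself is hypothetical.
my %ven = ( alaska => { 'Some Arena' => '/venues/a.htm' } );

# What $1 would hold when the template line is <!-- Alaska -->.
my $cat = 'Alaska';

delete $ven{$cat};                       # wrong case: nothing is removed
my $left_after_wrong = scalar(keys %ven);
print "after delete with 'Alaska': $left_after_wrong left\n";

delete $ven{ lc($cat) };                 # matching case: the category is removed
my $left_after_right = scalar(keys %ven);
print "after delete with lc('Alaska'): $left_after_right left\n";
```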

Author Comment

by:Caiapfas
ID: 10742474
OK, perfect.
How would I go about adding which category each link belongs to? For example, dump.htm is
just a long list of URLs/links, but before each link I'd like the category it belongs to, for recognition.
Example:

Alaska << this is the category - bal bal << this is the link
Alaska - lala bala
Alaska - laoal
Texas - meme
Alabama - lolhe
Spain - youto


"For the "each category index page", I think it's better to just make a (slightly different) separate script. I'll try to do that later." <<< Wouldn't it be easier to add to this script? I have been working on it, but made very little progress, unless you call errors progress... lol


Expert Comment

by:lbertacco
ID: 10742528
add this line after the "foreach" line:
print ORPOUT "<HR><P>\nOrphaned links under category -- <BIG><B> $k </B></BIG> --</P>\n";
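
If the "Alaska - link" layout from the question is preferred over one heading per category, the category can be prefixed to every link instead. This is only a sketch with made-up leftovers; ucfirst() is an assumption about the desired capitalisation, since the original case is lost when the keys are lower-cased.

```perl
use strict;
use warnings;

# Hypothetical leftovers, shaped like %ven after the template pass.
my %ven = (
    alaska => { 'bal bal' => 'http://www.mysite.com/whatever/a.htm',
                'laoal'   => 'http://www.mysite.com/whatever/b.htm' },
    texas  => { 'meme'    => 'http://www.mysite.com/whatever/c.htm' },
);

my @lines;
foreach my $k (sort keys %ven) {
    foreach my $title (sort keys %{ $ven{$k} }) {
        # One line per link: "Category - <link>".
        push @lines, ucfirst($k) . " - <A href=\"$ven{$k}{$title}\">$title</A><BR>\n";
    }
}

# In the real script this would be: print ORPOUT @lines;
print @lines;
```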

Author Comment

by:Caiapfas
ID: 10758860
lbertacco,

Any luck on adding the new feature?

Accepted Solution

by:
lbertacco earned 500 total points
ID: 10760340
I believed you were going to open a new question; anyway, here it is. Not much tested.
Run with the command
<scriptname> M
for the old behaviour
and
<scriptname> E
for the new "each category" behaviour.

In the latter case, two new variables define the file and path:
$cattpl the template for each category, with the string <!-- START --> inside
$catdir path where the category files should be created

#!/usr/bin/perl

$dir="/win/tmp";
$wdir="ven";
$indextpl = "/win/indextpl.htm";
$indexhtm = "/win/index.htm";
$dumphtm = "/win/dump.htm";
$cattpl = "/win/cat.tpl";
$catdir = "/win/tmp/o";

if($#ARGV != 0 || $ARGV[0] !~ /^[me]$/i) {
  print "Usage: $0 {m|e}\n";
  exit;
}

# read venue files
opendir(DH, $dir) or die "Can't open $dir for reading: $!";
while(defined($file=readdir(DH))) {
  next unless $file =~ /\.htm$/i;
  open(FILE, "<$dir/$file") or die "Can't open $file for reading: $!";
  $ven{lc($1)}{$2} = "$wdir/$file" if <FILE> =~ /^\<\!-- (.*), (.*) --\>$/;
  close(FILE);
}

# process template
my $tot=0;
if(lc($ARGV[0]) eq "m") {
  open(IDXIN, "<$indextpl") or die "Can't open $indextpl for reading: $!";
  open(IDXOUT, ">$indexhtm") or die "Can't open $indexhtm for writing: $!";
  while(<IDXIN>) {
    if(/^\<\!-- (.*) --\>$/) {
      my @vkeys = sort keys %{$ven{lc($1)}};
      print IDXOUT map {"<A href=\"$ven{lc($1)}{$_}\">$_</A><BR>\n"} @vkeys;
      $tot += @vkeys;
      print "\L$1: " . @vkeys . "\n";
      delete($ven{lc($1)});
    } else {
      print IDXOUT;
    }
  }
  close(IDXIN);
  close(IDXOUT);
  print "Total: $tot\n";

  # dump leftovers
  print "\nOrphans:\n";
  $tot=0;
  open(ORPOUT, ">$dumphtm") or die "Can't open $dumphtm for writing: $!";
  foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    print ORPOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
  }
  close(ORPOUT);
  print "Total: $tot\n";

} else {
      
  open(CATIN, "<$cattpl") or die "Can't open $cattpl for reading: $!";
  {
    local $/;
    $catfile = <CATIN>;
  }
  close(CATIN);
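  # Split the template around the marker; stop early if the marker is missing.
  my ($head, $tail) = $catfile =~ /^(.*)\<\!-- START --\>(.*)$/si
    or die "No <!-- START --> marker found in $cattpl";
  foreach $k (sort keys %ven) {
    my @vkeys = sort keys %{$ven{$k}};
    open(CATOUT, ">$catdir/$k.htm") or die "Can't open $catdir/$k.htm for writing: $!";
    print CATOUT $head;
    print CATOUT map {"<A href=\"$ven{$k}{$_}\">$_</A><BR>\n"} @vkeys;
    print CATOUT $tail;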
    close(CATOUT);
    $tot += @vkeys;
    print "$k: " . @vkeys . "\n";
  }
  print "Total: $tot\n";
}
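
The question asked for an interactive prompt, while the script above takes the mode as a command-line argument. If a prompt is still wanted, something along these lines might work. ask_mode is a hypothetical helper, not part of the script, and the sketch reads from an in-memory filehandle so it can run unattended.

```perl
use strict;
use warnings;

# Hypothetical helper: keep asking until the answer starts with E or M.
# Takes any readable filehandle and returns 'e', 'm', or undef at end of input.
sub ask_mode {
    my ($in) = @_;
    while (1) {
        print "Make a page for (E)ach category or (M) one main index page? ";
        my $answer = <$in>;
        return unless defined $answer;            # end of input: give up
        return lc($1) if $answer =~ /^\s*([em])/i;
        print "Please type E or M.\n";
    }
}

# Simulated typing: one invalid answer, then "E".
my $typed = "x\nE\n";
open(my $fake, '<', \$typed) or die "Can't open in-memory input: $!";
my $mode = ask_mode($fake);
close($fake);

print "\nMode: $mode\n";
```

In the real script the call would be ask_mode(\*STDIN), and the result would replace lc($ARGV[0]) in the mode test.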

Author Comment

by:Caiapfas
ID: 10761003
Thanks, I opened a new question. If possible, I would like it to do a find and replace when run under option E.


http://www.experts-exchange.com/Programming/Q_20944346.html