How do I refresh files?

How can I refresh the files that receive the output?

Currently I have to hit the REFRESH button every time I enter a new URL and convert it to ASCII.

How can I refresh the files without having to hit REFRESH?


Here's the code, url.cgi:
#!/usr/bin/perl

##-I/web/public/grad/sdesar
################################################################################################
#This script does the following:
#1. file.txt - converts HTML to ASCII
#2. file.html - fetches the HTML file.
#3. fileParse.txt - parses out the articles, prepositions, etc. to create the keywords to be analyzed.
#4. fileHeader.html - gets the headers from file.html and creates anchors.
#5. stop_words - list of words to be stripped out: a, the, and, etc.
#   I got this from http://www.nzdl.org
#
#This file contains the following Routines
#
#1. convert to plain text
#2. copy HTML file
#3. parse data
#4. get all the headers, place them after the <BODY> tag, and make them anchors
#Here's the list of INPUT AND OUTPUT FILES-
#file.txt - ASCII file
#file.html - HTML file
#fileParse.txt-  parses articles, prepositions etc
#                Input file - file.txt, stop_words
#                Output file - fileParse.txt
#fileHeader.html - Gets the Headers from HTML document
#               Input file - file.html
#                Output file - fileHeader.html
#fileKeys.txt    - Displays the 5 most freq. words per paragraph
#                  and the 10 most freq. words in the entire document.
#                  Input - fileParse.txt
#                  Output - fileKeys.txt
#fileKW.html    - Displays the 10 most freq. words as BOLD in the parsed file.
#                 Input - fileParse.txt
#                 Output - fileKW.html
#keywords.out - Parses the keywords from fileKeys.txt
#              Input-   fileKeys.txt
#               Output-  keywords.out
#fileBold.html - Bolds all the keywords.
#                Input- file.html
#                Input- keywords.out
#                Output- fileBold.html
#find.html       - Independent javascript to find keywords.
#
###############################################################################################

use LWP::Simple;
use HTML::Parser;
use CGI;

# require "boldparse.pl";
#require "wrapper.cgi";

my $cgi = new CGI;
my $url = $cgi->param('url');

my $HTTP_ROOT = "/web/public/grad/sdesar/tmp/file.txt";
my $HTTP_ROOT1 = "/web/public/grad/sdesar/tmp/file.html";
my $HTTP_ROOT2 = "/web/public/grad/sdesar/tmp/fileParse.txt";

if ( $url ne "" )
{
$content = fetch( $url );
if ( $content ne "" )
{
print $cgi->header( -type => 'text/plain' );
my $plain_text = plain_text( $content );
print $plain_text, "\n";

cp_to_file( $plain_text, $HTTP_ROOT );
print $cgi->redirect( "./tmp/file.txt" );

##########Routine to parse the data (strip the stop words)
print $cgi->header( -type => 'text/plain' );
open KW, 'stop_words' or die "can't open stop_words: $!";
@kw = map { chomp; $_ } <KW>;
close KW;

#form RE

$re = join '\b)|(\b','(\b', @kw,'\b)';

open TXT, './tmp/file.txt' or die "can't open file.txt: $!";
while(<TXT>){
  s/$re//goi;
  $contentParse .= $_;
}
print $contentParse, "\n";


cp_to_file($contentParse, "$HTTP_ROOT2");
print $cgi->redirect( "./tmp/fileParse.txt" );

#print "\n\n\n";



##########copy the HTML

print $cgi->header( -type => 'text/plain' );
my $copied_html = cp_to_file_html( $content, $HTTP_ROOT1 );
print $copied_html, "\n";
print $cgi->redirect( "./tmp/file.html" );

###############COPY TO KEYWORDS.out

print $cgi->header( -type => 'text/plain' );

$status=&parsenwrite("./tmp/fileKeys.txt","./tmp/keywords.out");
print $cgi->redirect( "./tmp/keywords.out" );
if ($status){   ##  The Parse'n Write sub-routine was fine
        ##  Now read the html file and bold the keywords
        &makebold("./tmp/file.html","./tmp/keywords.out","./tmp/fileBold.html");
print $cgi->redirect( "./tmp/fileBold.html" );
}else{
        print "Error during parsewrite\n";
}




}
else
{
output_form( "Could not load URL: $url<br>" );
}
}
else
{
output_form( "Enter URL to fetch" );
}

sub output_form
{
my $msg = shift;

# output the html header
print $cgi->header( -type => 'text/html' );

# print the message if there is one
print "$msg<br>\n";

# output the form for the user
print $cgi->start_html;
print $cgi->start_form;
print "Please enter another URL:  ";
print $cgi->textfield( -name=>'url', -value=>'http://www.' );
#print $cgi->textfield('url');
print $cgi->br;
print $cgi->submit( -label => 'Fetch' );
print $cgi->end_form;
print $cgi->end_html;
}

print <<"PrintTag";
<html><head>
<title>CGI-Generated HTML</title>
</head><body>
<H2 align="center">WEB TEXTURIZER</H2>
<HR>
<H2> The following files will be created: </H2>
<H3> Please hit RELOAD to REFRESH these files. </H3>


<UL>

<LI><A HREF="./tmp/file.txt"
               TARGET="results">
               Text Only Version</A>
        <LI><A HREF="./tmp/file.html"
               TARGET="results">
               HTML Version</A>
        <LI><A HREF="./tmp/fileKW.html"
               TARGET="results">
               Keywords bolded in the Parsed File -- HTML headers</A>
        <LI><A HREF="./tmp/fileParse.txt"
               TARGET="results">
               Parsed  Version</A>
        <LI><A HREF="./tmp/fileHeader.html"
               TARGET="results">
              HTML  along with the Headers that have anchors created </A>
        <LI><A HREF="./tmp/find.html"
               TARGET="results">
               HTML  and find the keywords  -- JavaScript</A>
        <LI><A HREF="./tmp/fileKeys.txt"
               TARGET="results">
               Finds the most frequent words per Paragraph and Total</A>


        <LI><A HREF="./tmp/keywords.out"
               TARGET="results">
                Keywords in ASCII</A>

        <LI><A HREF="./tmp/keywords.html"
               TARGET="results">
                Keywords in   HTML  with anchors created Version</A>


        <LI><A HREF="./tmp/fileKeywords.html"
               TARGET="results">
                Finds Keywords and creates the anchors in BOLD HTML Version</A>



<LI><A HREF="/public/grad/sdesar/wrapper.cgi?f=file.txt">

               Text Only Version - file.txt</A>


</UL>
<HR>
</body></html>
PrintTag
#The line above is the here-document terminator
#that ends the page output.

#print "Content-type: text/plain \n\n";
#print "TEST";




#### subroutines
sub fetch {
my ($url) = @_;
my $cont;

$cont = get($url);
return $cont;
}

# copies text to file

sub cp_to_file {
my ($text, $to_file) = @_;

open(OUT, ">" . $to_file) or die "can't open $to_file: $!";
print OUT $text;
close(OUT);
}

# copies HTML text to a file and returns it

sub cp_to_file_html {
my ($text, $to_file) = @_;

open(OUT, ">" . $to_file) or die "can't open $to_file: $!";
print OUT $text;
close(OUT);
return $text;
}


# converts html text into plain text; (simplistic approach)
sub plain_text {
my ($in_text) = @_;
my $plain;

($plain = $in_text) =~ s/<[^>]*>//gs;

return $plain;
}

##############Clear cache (this appears to be wrapper.cgi, linked above -- it serves ?f=<file> with no-cache meta tags)

use CGI;

$query=new CGI;
my $file_name=$query->param('f');
my $file_path="/web/public/grad/sdesar/tmp/";

open(OUT, $file_path.$file_name) || die "can't open $file_path$file_name: $!";


print "Content-type: text/html\n\n" if $file_name !~ /\.txt/;
print "Content-type: text/plain\n\n" if $file_name =~ /\.txt/;

print "<meta http-equiv=\"Pragma\" content=\"no-cache\">\n",
      "<meta http-equiv=\"expires\" content=\"0\">\n" if $file_name !~ /\.txt/;

while(<OUT>){print $_;}
close(OUT);




##############Routine to add anchors to headers
open(FILE, "./tmp/file.html") or die "can't open file.html: $!";
@File = <FILE>;
$html = join(" ", @File);
close(FILE);

#Match headers
(@headers) = ($html =~ m!<H\d>\s*(.*?)\s*</H\d>!isg);

#Convert all headers into named anchors
$html =~ s!(<H\d>\s*)(.*?)(\s*</H\d>)!$1<a name="$2">$2</a>$3!isg;

#Construct links to the headers
foreach $header (@headers) {
    $links .= qq(\n<a href="#$header">$header</a><br>\n);
}

#Place the links at the top of the page, after the <BODY> tag
$html =~ s/(<body[^>]*>)/$1$links/i;
print $html;

#Write out the new document
cp_to_file_html($html, "./tmp/fileHeader.html");
print "\n\n\n";


#############SUBROUTINE FOR KEYWORDS
sub keywords {
    my $file = shift;
    open FILE, "<$file" or die "can't open $file : $!";
    my %wc   = ();
    my %seen = ();
    my %top;
    my @words;
    my $paragraphs = '';
    local $/ = '';                      # paragraph mode
    my @paragraphs = <FILE>;
    close FILE;
    for ( @paragraphs ) {
        while ( /(\w['\w-]*)/g ) {
            $seen{lc $1}++;             # count every word in the document
        }
    }
    # the 10 most frequent words overall
    @top{ (sort { $seen{$b} <=> $seen{$a} } keys %seen)[0..9] } = ();
    for ( @paragraphs ) {
        %wc = ();
        if ( @words = grep { exists $top{lc $_} && !$wc{lc $_}++ } /(\w['\w-]*)/g ) {
            for my $w ( @words ) { s/\b(\Q$w\E)\b/<b>$1<\/b>/gi; }
            $paragraphs .= join ' ', "<h1>", @words, "</h1>:\n$_";
        }
    }
    $paragraphs =~ s/$/<br>/gm;
    return $paragraphs;
}


$paragraphs = keywords('./tmp/fileParse.txt');
print $paragraphs;

open FILE2, ">./tmp/fileKW.html" or die "can't open fileKW.html: $!";
print FILE2 $paragraphs;
close FILE2;

#########################Routine to count the words per paragraph and in total

open IN,  "<./tmp/fileParse.txt" or die "can't open fileParse.txt: $!";
open OUT, ">./tmp/fileKeys.txt"  or die "can't open fileKeys.txt: $!";
{
    local $/ = '';                      # paragraph mode
    while ( <IN> ) {
        %wc = ();
        while ( /(\w['\w-]*)/g ) {
            $seen{lc $1}++;
            $wc{lc $1}++;
        }
        print OUT "paragraph $.\n";
        for ( (sort { $wc{$b} <=> $wc{$a} } keys %wc)[0..4] ) {
            print OUT "$_ : $wc{$_}\n";
        }
    }
}
print OUT "total\n";
for ( (sort { $seen{$b} <=> $seen{$a} } keys %seen)[0..9] ) {
    printf OUT "%5d %s\n", $seen{$_}, $_;
}

################ROUTINE FOR BOLDFACE THE KEYWORDS IN HTML FILE

# require "boldparse.pl";
sub parsenwrite{
        ($filekeys,$keywords)=@_;

        open (FILEKEYS,$filekeys) || die "can't open $filekeys: $!\n";

        $ctr=0;
        while ($line=<FILEKEYS>){
                $ctr++;
                chomp($line);   ##      Remove the \n char
                ##      Ignore lines with paragraph 1, paragraph 2 etc & ...
                ##      line having only a ":" in it.
                next if $line=~ /^paragraph\s+\d+/ || $line=~ /^:$/;

                ##      Check for lines which have white spaces followed by
                ##      numbers and then have a word. Eg.    20 information
                if ($line=~ /\s+\d+\s+(.*)/){
                        $keywords{$1}=1;
                        next;   ##      Go for the next line
                }

                ##      All remaining lines WILL have the foll format
                ##      word : number
                @tmp=split(/:/,$line);

                if ($#tmp>0){   ##      The line has the above format.
                        $tmp[0]=~ s/\s+//g;     ##      Squeeze out white spaces
                        $keywords{$tmp[0]}=1;
                }
        }
        close(FILEKEYS);

        ##      We are using an associative array to eliminate any
        ##      duplicate keywords we might have in the input text file.
        open (KEYWORDS,">$keywords") || die "can't open $keywords: $!\n";
        foreach (sort keys %keywords) {
                print KEYWORDS $_, "\n"; ##     Write to the keyword output file
        }
        close(KEYWORDS);

        return 1;

}
#######################CREATE ANCHORS to the keywords##################
# This uses 4 files:
open(KI, "<./tmp/keywords.out")      or die "can't open keywords.out: $!";      # simple keywords, one per line
open(KO, ">./tmp/keywords.html")     or die "can't open keywords.html: $!";     # the htmlized keywords
open(AI, "<./tmp/file.html")         or die "can't open file.html: $!";         # the original HTML document
open(AO, ">./tmp/fileKeywords.html") or die "can't open fileKeywords.html: $!"; # the bold/tagged HTML document

@keywords = <KI>; # grab all the keywords
chomp @keywords;  # remove linefeeds

# Make sure keywords are unique. I assume only 1 kw per document is needed.
# Use a fresh hash here -- %seen was already populated with word counts by
# the counting routine above, which would wrongly filter out those keywords.
@keywords = grep { !$uniq{$_}++ } @keywords;

print KO <<EOF; # This is the start of the keywords.html doc
<HTML>
<HEAD>
<style type="text/css">
A {text-decoration:none}
</style>
<title>This is the Keywords Document</title>
</HEAD>
EOF

undef $/; # turn off line-at-a-time processing and slurp whole files
# Assumption: you have enough RAM to load file.html into memory.

($head, $_) = split /<BODY/i, <AI>; # read in the HTML
# Strip off everything before the body tag, since we can't manipulate it

foreach $k (@keywords)
{
    $k =~ s/\s//g; # no whitespace allowed in a keyword (otherwise we'd need to
                   # mess around with the link -- it can't have spaces)
    print KO "<A HREF='http://jbh3-1.csci.csusb.edu/public/grad/sdesar/tmp/fileKeywords.html#$k' target=defsbox>$k</A><BR>\n"; # add outbound link
    s!$k!<A NAME='$k'><B>$k</B></A>!; # create inbound link
    # I assume that none of the keywords are subsets of the other keywords..
}

print AO "$head<BODY$_";
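One hedged observation about the routine above: KO and AO are opened but never closed, so their buffered output may reach disk late (or, if the CGI process dies early, not at all). Below is a minimal sketch of closing with error checking; `close_or_die` and the demo file are illustrative, not part of the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative helper (not in the original script): close a filehandle and
# fail loudly if the buffered data cannot be flushed to disk.
sub close_or_die {
    my ($fh, $label) = @_;
    close($fh) or die "can't close $label: $!";
}

# Demo against a throwaway file, mirroring how KO/AO could be closed.
open(my $out, '>', './close_demo.txt') or die "can't open close_demo.txt: $!";
print $out "keyword\n";
close_or_die($out, 'close_demo.txt');
```

In the routine itself this would amount to calling close (or close_or_die) on KO, AO, KI, and AI right after the final print AO line.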


I think the problem is with the above routine, CREATE ANCHORS to the KEYWORDS... the file keywords.html is NOT being updated...
sdesarAsked:
guadalupeCommented:
I think you've gotten a low response on this one because it is not clear what you want. One idea is to force the browser to refresh the page automatically every x seconds, like this:

<META HTTP-EQUIV="Refresh" CONTENT="300">

This will do it every 300 seconds (5 minutes).

If this is not what you want try and explain a little more and I'll see if I can help...
sdesarAuthor Commented:
Oh .. good..
But I want the files to be refreshed in 1-5 seconds.
ozoCommented:
Change the "300" to "5"
sdesarAuthor Commented:
Where should I place this in the above
script?

<META HTTP-EQUIV="Refresh" CONTENT="5">
maneshrCommented:
ok, here's what you need to do.

    First, before you click the fetch button, clear your disk and memory cache.

    Alternatively, here is what you can do (assuming you are using Netscape):

    1 - fetch a URL.
    2 - when you get the results, move your mouse pointer over the hyperlink, right-click, and
    select "Open in new window".

    Now you will have 2 browser windows open: one with the Web Texturizer interface and the other
    with the actual file.

    3 - now fetch another URL.
    4 - when you get the results, just go to the other page, keep the Shift key pressed, and
    click on the Reload icon of your browser.

    If you see the contents of the page changing, that means the reload is fine; it's just the
    cache that is giving you the problem.

guadalupeCommented:
Or put the tag:

<META HTTP-EQUIV="Refresh" CONTENT="5">


between the head tags like this:

<head>

<META HTTP-EQUIV="Refresh" CONTENT="5">

<title>Title</title>

</head>

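In url.cgi above, that head section is printed from the PrintTag heredoc, so the tag would go there. A sketch of just the top of that heredoc, wrapped in an illustrative `texturizer_head` helper (not part of the original script), assuming a 5-second refresh:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch: the head of the PrintTag heredoc from url.cgi, with the
# refresh tag added between <head> and <title>.
sub texturizer_head {
    return <<"PrintTag";
<html><head>
<META HTTP-EQUIV="Refresh" CONTENT="5">
<title>CGI-Generated HTML</title>
</head><body>
PrintTag
}

print texturizer_head();
```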
sdesarAuthor Commented:
Thanks for the suggestions, but nothing seems to work... the data in my keywords.html file is still from the previous URL, even though the rest of the files -- fileKeyword.html and keywords.txt -- are updated.

I am not sure why that is.
logiqueCommented:
Add <meta http-equiv=refresh content=5> and <meta http-equiv=pragma content=no-cache> to prevent caching your old data on the local computer.
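These meta tags have a header-level counterpart the CGI script can emit itself. A hedged sketch of the server-side equivalent, assuming the headers are printed before any body output; `no_cache_header` is an illustrative helper, not part of the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative helper: build a header block that asks both HTTP/1.0 and
# HTTP/1.1 caches not to store the response.
sub no_cache_header {
    my ($type) = @_;                     # e.g. 'text/html' or 'text/plain'
    return "Content-type: $type\n"
         . "Pragma: no-cache\n"          # HTTP/1.0 caches
         . "Cache-Control: no-cache\n"   # HTTP/1.1 caches
         . "Expires: 0\n\n";             # mark the response as already stale
}

print no_cache_header('text/html');
```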
sdesarAuthor Commented:
I tried your suggestion... nothing seems to work. I think there may be a bug in the last part of the above script, i.e. "create anchors to the keywords"...
You can check out the behavior at http://jbh3-1.csci.csusb.edu/public/grad/sdesar/url_bold.cgi -- check the "keywords ascii" and "keywords with anchors" files once you enter 2 different URLs,
and check out the behavior...


The files have different data, but they should have the same keywords; the only difference is that keywords.html has anchors created.

The script can be viewed at http://jbh3-1.csci.csusb.edu/public/grad/sdesar/url_bold.pl

awaiting a response
Thanks