Solved

Memory or CPU cycles?

Posted on 2001-07-23
Last Modified: 2010-03-05
I am asking for an opinion here...

I am trying to decide what is the best way to output data parsed from a database into an HTML file...

Say I have a database with a bunch of articles, which I load into a nested structure of hashtables, and I have 2 HTML files: a body file with the layout of the page, and an article file, which is repeated once for every article inside the body file.

Here's how it would look:
<pre>
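foreach my $line (@array_containing_body_file)
{
  # split at the tag so the surrounding text and the articles print in order
  my ($first, @rest) = split /<content_tag>/, $line, -1;
  print $first;
  foreach my $text (@rest) { &parser_subroutine(@array_containing_article_file); print $text; }
}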

sub parser_subroutine
{
  [foobar...]
  foreach (@temp_array_containing_article_file)
  {
     [parse_tags_and_things]
     print $_
  }
}
</pre>
In this case, everything is printed right away. However, if the "content_tag" exists more than once in the body file, then the parser_subroutine will be called and executed several times...

-=-=

The other option is to dump all data from parser_subroutine into a temporary variable that is fetched BEFORE the FIRST foreach loop, thus preventing the parser_sub from running more than once.

<pre>
$data = &parser_subroutine(@array_containing_article_file);
foreach (@array_containing_body_file)
{
  $_ =~ s/<content_tag>/$data/g;
  print $_;
}

sub parser_subroutine
{
  [foobar...]
  my $data;
  foreach (@temp_array_containing_article_file)
  {
     [parse_tags_and_things]
     $data .= $_;
  }
  return $data;
}
</pre>
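
A possible middle ground between the two versions, sketched under assumptions: render the articles lazily the first time the tag is actually seen, then reuse the cached string. The names `render_articles` and `fill_body`, the sample data, and the `<h2>` markup are made up for illustration, and `//=` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the body file lines and the parsed articles.
my @body_lines = ("<html>\n", "<content_tag>\n", "<hr>\n", "<content_tag>\n", "</html>\n");
my @articles   = ({ title => 'First' }, { title => 'Second' });

my $calls = 0;    # counts how often the expensive rendering really runs

# Render every article and return the result as one string.
sub render_articles {
    my @items = @_;
    $calls++;
    return join '', map { "<h2>$_->{title}</h2>\n" } @items;
}

# Replace each <content_tag>; render on first use, reuse the cache afterwards.
sub fill_body {
    my ($lines, $items) = @_;
    my $cache;
    my $out = '';
    for my $line (@$lines) {
        (my $copy = $line) =~ s{<content_tag>}{ $cache //= render_articles(@$items) }ge;
        $out .= $copy;
    }
    return $out;
}

print fill_body(\@body_lines, \@articles);
```

With this shape the rendering runs at most once per page, and not at all if the body file contains no tag. The cost is still one copy of the rendered articles held in memory.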

This is an example, and the size of both the articles and HTML files can vary greatly depending on implementation...

This is more of a "comment" thing than a question, I just want to hear other people's opinions on the matter...

So feel free to bring in examples of how you do parsing and stuff :)
Question by:tolian
LVL 8

Accepted Solution

by:
bebonham earned 50 total points
ID: 6310571
after reading your question and the meta code, I could not see how the one related to the other all that well,

perhaps if you post the actual code, I would understand better.

but, I will explain, based on your question, what I think is the best way.


...you have a body (shell or header and footer... choice of words)

and you have a template for the output of an item.

I assume by your meta code that this is a generic template which you parse with perl variables.


and you have, I assume, some sort of flat file database and a mechanism for reading its data.

which you allude to in your code.


since everything is static EXCEPT for the data from the hash tables, then we need to focus on optimizing the output of the static data.

ultimately, your script will run best if all the html is in the code.

I assume the html code isn't huge, it shouldn't be.
if it is, you are better off not reading the html files into a data structure at all, i.e. (while(<FILE>){s/regexp/replacement/;print;}), because that way you don't hold the whole chunk of html in memory at once.

BUT that is much slower; as we all know, disk access is slow, so why not load it all at the same time.

So,

here is my return of meta code.


$head=<<ENDHEAD;
all your html for the top and sides of the page here


ENDHEAD


$foot=<<ENDFOOT;
if you have stuff that goes below the content put it here

ENDFOOT

#begin
print "Content-type: text/html\n\n";
print $head;
foreach my $data (@itemonthepage)
{
print "<FORMATTING_TAGS>$data</TAGS>";
#in as much detail as you want here
}
print $foot;



#####end


I believe that is the most effective way of accomplishing it.

Bob

Author Comment

by:tolian
ID: 6310673
Hi
Thanks for the comment...

Your way of doing this is pretty good, and I did it that way before. The problem is I am writing an application where users will not modify the perl code, they'll have to modify the HTML templates to adapt the application to their needs.

The meta code I had in my question would produce exactly the same output, except the bottom one would need more memory while the top one might possibly have to run the same subroutine several times (and it would produce the same output each time).

However, I am thinking of splitting up the body file I talked about into a TOP and a BOTTOM file; that might make things easier and more efficient.
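
A rough sketch of how the TOP/BOTTOM idea could work without two physical files, assuming the tag appears only once in the body: split the body text at the first `<content_tag>` and print each article as soon as it is rendered. The names `split_body` and `print_page` and the `<h2>` markup are invented for this example.

```perl
use strict;
use warnings;

# Split a body template at the first <content_tag> into a top and a bottom half.
sub split_body {
    my ($body) = @_;
    my ($top, $bottom) = split /<content_tag>/, $body, 2;
    return ($top // '', $bottom // '');
}

# Print the top, then each article as it is rendered, then the bottom.
# Nothing article-related is buffered and nothing is rendered twice.
sub print_page {
    my ($fh, $body, $articles) = @_;
    my ($top, $bottom) = split_body($body);
    print {$fh} $top;
    print {$fh} "<h2>$_</h2>\n" for @$articles;
    print {$fh} $bottom;
}

print_page(\*STDOUT, "<body>\n<content_tag></body>\n", ['First', 'Second']);
```

If the tag can appear more than once, everything after the first occurrence ends up in the bottom half untouched, so this only fits the single-tag case.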

As for not using a datastructure for the file, that is actually a pretty good idea, I'll have to do some testing to see if that works better.

Ultimately I need to store the HTML code in a separate file. Since the users will edit the code through a special page, though, I am thinking that I can generate a Perl file where the code that the user edits is placed into $foo = <<BAR;\nCODE\nBAR, and thus I could just require the file when the code needs to be parsed with the data... Hmm
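
A minimal sketch of the generated-file idea, with one substitution that should be stated plainly: instead of a here-document it quotes the HTML with Data::Dumper, because a here-doc would break if the user's HTML ever contained the terminator line. `save_template` and `load_template` are made-up names, and the path is assumed to be absolute (or to start with `./`) so that `do` does not search `@INC`.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Write user-edited HTML into a small Perl file whose only expression is the string.
sub save_template {
    my ($path, $html) = @_;
    local $Data::Dumper::Terse = 1;    # emit a bare quoted value, not "$VAR1 = ..."
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print {$fh} Dumper($html), ";\n";
    close($fh) or die "Cannot close $path: $!";
}

# Load the generated file; do() returns the value of its last expression.
sub load_template {
    my ($path) = @_;
    my $html = do $path;
    die "Cannot load $path: " . ($@ || $!) unless defined $html;
    return $html;
}

# Usage with a throwaway file that is removed when the program exits.
my (undef, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
save_template($path, "<p>It's \$5 for <content_tag></p>\n");
print load_template($path);
```

The generated file is executed as Perl when it is loaded, so the quoting step is the only thing keeping user input from running as code. Whether this beats simply reading the HTML file is something to measure rather than assume.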
LVL 8

Expert Comment

by:bebonham
ID: 6310909
yeah, if you can do anything conditionally, that will be good,

but as I said, if you are reading data in from files, the most efficient way I believe is

as a sub routine
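sub parsef
{
my $file=shift;
my $contents;
open(my $fh, '<', $file) or die "cannot open $file: $!";
while(<$fh>)
{
$contents.=$_;
}
close $fh;
# pattern and replacement are placeholders, put your own here
$contents=~s/modifyWithRegexp/replacement/sig;
return $contents;
}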

or even better, if you can print it without having to use the /s modifier

while(<FILE>)
{
s/regexp/replacement/;
print;
}
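
For what it's worth, the line-by-line version could be wrapped up roughly like this. It is only a sketch: `stream_template` is an invented name, `<content_tag>` is borrowed from the question, and `$data` is assumed to be already rendered.

```perl
use strict;
use warnings;

# Copy a template from $in to $out one line at a time, replacing every
# <content_tag> with $data. Only the current line is held in memory.
sub stream_template {
    my ($in, $out, $data) = @_;
    while (my $line = <$in>) {
        $line =~ s/<content_tag>/$data/g;
        print {$out} $line;
    }
}

# Usage: an in-memory filehandle stands in for the real template file.
my $template = "<body>\n<content_tag>\n</body>\n";
open(my $in, '<', \$template) or die "Cannot open template: $!";
stream_template($in, \*STDOUT, "<p>article</p>");
close $in;
```

This only works when the tag never spans a line break, which is the same condition as not needing the /s modifier.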


but if you care to give more info perhaps my advice could be more on target.

of course, we always try to make the loops as small as possible.

Bob
LVL 20

Expert Comment

by:jmcg
ID: 9493323
Nothing has happened on this question in over 12 months.
I will leave a recommendation in the Cleanup topic area that
the answer by bebonham be accepted.

Please leave any comments here within the next seven days.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer