Solved

Perl Noob email question

Posted on 2011-02-20
Last Modified: 2012-05-11
Hello,
I am new to programming in Perl.
Any help would be appreciated. I am sure what I am trying to do is simple for most programmers, which I am not ;)
I am scraping data off a web page, then extracting the relevant information for a report using regex.
The target data is in an array. I need to format it in uniform columns and send it in an email.
I am attempting to use 'sprintf' to format the columns and save the result in a string.
It only sends the last line, and I lose any formatting.
Thanks in advance
my $output;
for (my $i = 0; $i <= $#data3; $i++)
{
$output = sprintf ("%s", "$data3[$i]");
print "$output\n";
}
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

Question by:fac66

25 Comments

LVL 7

Expert Comment

by:MrNed
In your loop you are overwriting $output on every iteration, when you really want to append, like so:

$output .= sprintf ("%s", "$data3[$i]");
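
To see the difference in isolation, here is a minimal sketch. The sample rows are made up and stand in for @data3; only the choice of = versus .= matters.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for @data3.
my @data3 = ('row one', 'row two', 'row three');

my ($overwritten, $appended) = ('', '');
for my $row (@data3) {
    $overwritten  = sprintf("%s\n", $row);   # "=" replaces the previous contents
    $appended    .= sprintf("%s\n", $row);   # ".=" adds to the end
}

print "With =:\n$overwritten";    # only the last row survives
print "With .=:\n$appended";      # all three rows are present
```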
 

Author Comment

by:fac66
That seemed to work, thank you!

However, all email formatting is lost.
The email has no line breaks, it's all jumbled together.
How do I keep the formatting?
 
LVL 7

Expert Comment

by:MrNed
You're sending it in HTML format, so you need HTML tags in your message body. A simple example that puts each item on a new line:

$output .= sprintf ("%s<br>", "$data3[$i]");
 

Author Comment

by:fac66
Again, thank you! That worked.

One last thing I hope ;)

How do I space the columns?
I've tried:
$output .= sprintf ("%-20s<br>", "$data3[$i]");
$output .= sprintf ("%20s<br>", "$data3[$i]");

Do I need to incorporate HTML tags for spacing also?
 
LVL 7

Expert Comment

by:MrNed
Yes, HTML collapses runs of whitespace into a single space, so you should use tables or div/span tags, as you prefer. Alternatively, don't use HTML email and it will keep all your spaces.
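
If plain text is acceptable, a sketch of that route might look like the following. The two-field rows are hypothetical (the real @data3 holds whole lines), and the hash only mirrors the shape used in the question; nothing is sent here. Padded columns line up only when the mail client shows the message in a fixed-width font.

```perl
use strict;
use warnings;

# Hypothetical two-column rows, used only for illustration.
my @rows = (['Just Go with It', '$30.5M'], ['The Eagle', '$8.7M']);

my $body = '';
for my $row (@rows) {
    # Left-justify the title in 20 columns, right-justify the amount in 8.
    $body .= sprintf("%-20s%8s\n", @$row);
}

# Same hash shape as in the question, but declared as plain text so the
# mail client leaves the spaces and newlines alone.
my %mail = (
    Subject        => 'Report',
    Message        => $body,
    'Content-Type' => 'text/plain; charset="iso-8859-1"',
);

print $mail{Message};
```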
 
LVL 26

Expert Comment

by:wilcoxon
Alternatively, you could put <pre>...</pre> around your data, which tells the HTML renderer that the text is preformatted. You would then not need <br> either.
 

Author Comment

by:fac66
Not sure I follow, where in the code would I add those tags?
 

Author Comment

by:fac66
I put it here and it worked:
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('%20s', "$data3[$i]");
print "<pre>$output\n</pre>";
}

But there is still no spacing. My goal is to make the columns uniform.
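
One possible reading of the <pre> suggestion: the tags have to end up inside $output itself, since that is the string handed to the mailer, and they only need to be added once, after the loop. A minimal sketch with made-up sample lines:

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for @data3.
my @data3 = ('1 -- 18% Just Go with It', '2 -- 68% Justin Bieber: Never Say Never');

my $output = '';
for my $line (@data3) {
    $output .= "$line\n";     # a newline is enough: <pre> keeps line breaks
}

# Wrap once, after the loop, so the tags are part of the mailed string
# rather than only part of what is printed to the terminal.
$output = "<pre>$output</pre>";
print $output;
```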
 
LVL 7

Expert Comment

by:MrNed
Try this

$output = "<table>";
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('<tr><td>%s</td></tr>', "$data3[$i]");
}
$output .= "</table>";
 
LVL 7

Expert Comment

by:MrNed
Or:

$output = "<table><tr>";
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('<td>%s</td>', "$data3[$i]");
}
$output .= "</tr></table>";

Since there is only one loop, is this a single row of data?
 

Author Comment

by:fac66
Thank you for your persistence.

This is the email. It prints the table tags:

<table><tr><td>
  1 -- 18% Just Go with It  $30.5M $60.8M</td></tr><tr><td>
  2 -- 68% Justin Bieber: Never Say Never  $29.5M $48.5M</td></tr><tr><td>
  3 -- 54% Gnomeo and Juliet  $25.4M $50.4M</td></tr><tr><td>
  4 -- 36% The Eagle  $8.7M $15.1M</td></tr><tr><td>
  5 1 6% The Roommate  $8.1M $32.7M</td></tr><tr><td>
  6 4 95% The King's Speech  $7.2M $103.3M</td></tr><tr><td>
  7 3 51% No Strings Attached  $5.8M $66.0M</td></tr><tr><td>
  8 2 31% Sanctum  $5.7M $21.9M</td></tr><tr><td>
 

Author Comment

by:fac66
This is what I am after for the email report.

Data scraped from http://www.rottentomatoes.com/movies/box_office.php


##  ##  Movie Title                          Weekend       Cume  T-Meter

1   --  Just Go with It                       $30.5M     $30.5M      18%
2   --  Justin Bieber: Never Say Never        $29.5M     $30.3M      68%
3   --  Gnomeo and Juliet                     $25.4M     $25.4M      53%
4   --  The Eagle                              $8.7M      $8.7M      36%
5   1   The Roommate                           $8.1M     $25.8M       6%
6   4   The King's Speech                      $7.2M     $93.7M      95%
7   3   No Strings Attached                    $5.8M     $60.0M      51%
8   2   Sanctum                                $5.7M     $18.0M      31%
 
LVL 7

Expert Comment

by:MrNed
It looks like each element of @data3 stores the whole line, not each figure separately. You will need to split the lines into fields to format them the way you want. Check out this tutorial for how to do that in Perl.

You said before that the <br> tags formatted it correctly? If so, I don't understand why the <table> tags don't work.
 
LVL 26

Expert Comment

by:wilcoxon
I'm not sure why <pre>$output</pre> wouldn't work.  I just did a test and <pre> definitely should retain spacing.  If you look at the raw text of the email, do you see the spacing between columns?  What browser/email program are you using to look at the output of this program?

MrNed's solution of using html tables should also work.
 
LVL 26

Expert Comment

by:wilcoxon
Ah. I should have refreshed. As MrNed said, it looks like each item in @data3 contains the whole line and not just a column.

This should work....
my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    $output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);
 

Author Comment

by:fac66
That is really close!
I just need to tweak it a little. Thank you!
I need some sleep :)

01  00  18% Just Go with      It        $30.5M    $60.8M
02  00  68% Justin Bieber: Never Say  Never     $29.5M    $48.5M
03  00  54% Gnomeo and        Juliet    $25.4M    $50.4M
04  00  36% The               Eagle     $8.7M     $15.1M
05  01  6% The                Roommate  $8.1M     $32.7M
06  04  95% The King's        Speech    $7.2M     $103.3M
07  03  51% No Strings        Attached  $5.8M     $66.0M
08  02  31%                   Sanctum   $5.7M     $21.9M
09  08  95% True              Grit      $3.8M     $164.1M
10  05  46% The Green         Hornet    $3.7M     $95.1M
11  06  17% The               Rite      $3.3M     $31.4M
12  07  52% The               Mechanic  $3.2M     $27.9M
13  11  90% The               Fighter   $2.2M     $87.9M
14  09  88% Black             Swan      $2.1M     $101.5M
15  10  21% The               Dilemma   $1.0M     $48.2M
 
LVL 26

Expert Comment

by:wilcoxon
I missed that the % was the 3rd column in the input but you want it displayed as the last column...

I think this will do it. If not, I likely made a typo somewhere, since I'm getting pretty bleary-eyed (off to bed as soon as I send this)...
my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    push @flds, splice(@flds, 2, 1);
    $output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);
 
LVL 26

Expert Comment

by:wilcoxon
If you want the previous week ranking to be -- instead of 00, change the second %02d in the sprintf to be %02s instead.
 
LVL 26

Expert Comment

by:wilcoxon
And if you want the numbers to display as single digit rather than 0-padded, just change %02d to %2d in the sprintf.
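
The format tweaks discussed in the last few comments can be compared side by side. This is just a small sketch with made-up values; the square brackets are there only to make the padding visible.

```perl
use strict;
use warnings;

my $zero_padded  = sprintf("[%02d]", 5);         # [05]
my $space_padded = sprintf("[%2d]",  5);         # [ 5]
my $as_string    = sprintf("[%2s]",  '--');      # [--]
my $left_just    = sprintf("[%-8s]", '$8.7M');   # [$8.7M   ]
my $right_just   = sprintf("[%8s]",  '$8.7M');   # [   $8.7M]

print "$_\n" for $zero_padded, $space_padded, $as_string, $left_just, $right_just;
```

A leading minus in the width left-justifies the field; without it the value is right-justified, which is usually what you want for amounts.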
 

Author Comment

by:fac66
Thanks for all your help, guys. The alignment formatting is there; however, what is the best way to move a field?
I am trying to move the 'T-Meter' field to the end.
I have tried s///, but I lose all formatting and
 
Here is current formatting in the email received:

Data scraped from http://www.rottentomatoes.com/movies/box_office.php"

Cur Prev T-Meter  Movie Title                                          Weekend    Cume
 1   --     56%        Unknown                                            $21.8M     $21.8M
 2   --     29%        I Am Number Four                               $19.5M     $19.5M
 3    3     54%       Gnomeo and Juliet                               $19.4M     $50.4M
 4    1     18%       Just Go with It                                     $18.2M     $60.8M

my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
     $output .= sprintf("%2d  %3s  %5s  %-35.35s  %20s  %9s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
#print $output;

my $mailFrom    =       'email@some.com';
my $subjectLine =       "Weekend Box Office Report";
my $message     =     'Data scraped from <a href="http://www.rottentomatoes.com/movies/box_office.php">http://www.rottentomatoes.com/movies/box_office.php</a>"';
my $head        =   "Cur Prev T-Meter Movie Title                                      Weekend    Cume";
$head = "<pre>$head</pre>";

 
LVL 26

Expert Comment

by:wilcoxon
My comment 34940767 above (the last one with full code) does move the field.  All you need to do is replace your line 5 with lines 5-6 from my code above:

push @flds, splice(@flds, 2, 1);
$output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);

The push line is the one that moves T-Meter to the end. The splice removes the element at index 2 (the third field, T-Meter) and returns the removed string. The push then appends it to the end.
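
Here is the same move spelled out in two steps on one sample row, as a small sketch. The field values are taken from the sample data earlier in the thread; the $tmeter variable is only there for illustration.

```perl
use strict;
use warnings;

# Fields in scraped order: current rank, previous rank, T-Meter, title, weekend, cume.
my @flds = ('1', '--', '18%', 'Just Go with It', '$30.5M', '$60.8M');

# splice(@flds, 2, 1) removes one element starting at index 2 and returns it.
my $tmeter = splice(@flds, 2, 1);

# push appends the removed value, so T-Meter becomes the last field.
push @flds, $tmeter;

print join(' | ', @flds), "\n";
# prints: 1 | -- | Just Go with It | $30.5M | $60.8M | 18%
```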
 
LVL 26

Expert Comment

by:wilcoxon
Or this sprintf (to keep the changes I mentioned and you made):

$output .= sprintf("%2d  %2s  %-20s  %-8s  %-8s  %4s\n", @flds);
 

Author Comment

by:fac66
My apologies, I missed that as I copied the code below.
Worked perfectly! Thanks again, you are a whiz at this.

One last piece I am struggling with, I am trying to extract just the dollar amounts from @data3:

1 -- 56% Unknown  $21.8M $21.8M
3 -- 29% I Am Number Four  $19.5M $19.5M
3 3 54% Gnomeo and Juliet  $19.4M $50.4M

Every filter I attempt brings back everything or nothing at all.
Here is one of my attempts:

my @data4 = grep /\b\s+\$(\d.*?)\b/mg, @data3;
print "@data4\n";

Then I have to compare the amounts and indicate the highest and lowest:

Biggest Debut: The Roommate (1)
Weakest Debut: Sanctum (2)
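
A side note on why the grep attempt returns whole lines: grep only decides which elements of the list to keep, and returns the kept elements unchanged. Pulling out the matched pieces is more a job for map with a capturing /g match. A small sketch using two of the sample lines above (the pattern is an assumption about how the amounts look):

```perl
use strict;
use warnings;

my @data3 = (
    '1 -- 56% Unknown  $21.8M $21.8M',
    '3 3 54% Gnomeo and Juliet  $19.4M $50.4M',
);

# grep keeps or drops whole elements; both lines contain a dollar amount,
# so both come back intact.
my @lines = grep { /\$\d/ } @data3;

# map with a /g match in list context returns the captured pieces instead.
my @amounts = map { /(\$[\d.]+[MK]?)/g } @data3;

print "grep: $_\n" for @lines;
print "map:  @amounts\n";
```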
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
The problem, again, is that @data3 contains lines, not fields.

This should work, provided I haven't made any typos. It will tell you the biggest and weakest weeks. If you want just debuts, you can add "if ($flds[1] eq '--')" to the end of the line in the loop that assigns to %money.
use List::Util (qw(max min));

my ($output, %money);
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    # add hash element based on $ with title
    $money{&convert($flds[-2])} = $flds[3];
    push @flds, splice(@flds, 2, 1);
    $output .= sprintf("%2d  %2s  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;

print "Biggest Week: ", $money{max(keys %money)}, "\n";
print "Weakest Week: ", $money{min(keys %money)}, "\n";
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

sub convert {
    my ($str) = @_;
    # assuming it will always be M or K (not B or <K)
    unless ($str =~ m{^\$([\d\.]+)([MK])?$}) {
        die "could not convert $str into amount\n";
    }
    my ($amt, $mod) = ($1, $2);
    # $mod is undef when there is no suffix; treat that as a plain dollar amount
    return $amt unless defined $mod;
    return ($mod eq 'M') ? $amt*1000000 : $amt*1000;
}
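
As a possible variation, the sketch below restricts the comparison to debuts and keeps (title, amount) pairs in an array instead of a hash keyed by amount, so two films with the same gross do not overwrite each other. The rows, the to_number helper, and the handling of a missing suffix are illustrative assumptions, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical parsed rows: [previous rank, title, weekend gross].
my @rows = (
    ['--', 'Just Go with It', '$30.5M'],
    ['--', 'The Eagle',       '$8.7M'],
    ['1',  'The Roommate',    '$8.1M'],
);

# Turn "$30.5M", "$950K" or "$1200" into a plain number.
sub to_number {
    my ($str) = @_;
    my ($amt, $unit) = $str =~ /^\$([\d.]+)([MK]?)$/
        or die "could not convert $str into amount\n";
    my %scale = (M => 1_000_000, K => 1_000, '' => 1);
    return $amt * $scale{$unit};
}

# Keep only debuts (previous rank "--"), sorted by weekend gross, highest first.
my @debuts = sort { $b->[1] <=> $a->[1] }
             map  { [ $_->[1], to_number($_->[2]) ] }
             grep { $_->[0] eq '--' } @rows;

if (@debuts) {
    print "Biggest Debut: $debuts[0][0]\n";
    print "Weakest Debut: $debuts[-1][0]\n";
}
```

Sorting is more work than a single max/min pass, but for a box office list of a few dozen rows the difference is negligible and the result keeps the titles attached to their amounts.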

 

Author Comment

by:fac66
Thank you for your help.
Much appreciated!