Solved

Perl Noob email question

Posted on 2011-02-20
Last Modified: 2012-05-11
Hello,
I am new to programming in Perl.
Any help would be appreciated, as I am sure what I am trying to do is simple for most programmers, which I am not ;)
I am scraping data off a web page, then extracting the relevant information for a report using regex.
The target data is in an array; I need to format it in uniform columns and send it in an email.
I am attempting to use 'sprintf' to format the columns and save the result in a string.
It only sends the last line and I lose any formatting.
Thanks in advance
my $output;
for (my $i = 0; $i <= $#data3; $i++)
{
$output = sprintf ("%s", "$data3[$i]");
print "$output\n";
}
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

Question by:fac66
25 Comments
 
LVL 7

Expert Comment

by:MrNed
ID: 34940407
Well, in your loop you are overwriting $output every time, when you really want to append, like so:

$output .= sprintf ("%s", "$data3[$i]");
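
A minimal sketch of the difference, assuming a small stand-in array in place of the scraped @data3 (the sample strings and the build_output name are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the scraped @data3 array.
my @data3 = ('first line', 'second line', 'third line');

# Join every element into one string, one element per line.
sub build_output {
    my @lines  = @_;
    my $output = '';            # start from an empty string
    for my $line (@lines) {
        $output .= "$line\n";   # .= appends; plain = would keep only the last line
    }
    return $output;
}

print build_output(@data3);
```

With plain = in the loop, the returned string would hold only the final element, which matches the "only sends the last line" symptom.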
 

Author Comment

by:fac66
ID: 34940471
That seemed to work thank you!

However, all email formatting is lost.
The email has no line breaks, it's all jumbled together.
How do I keep the formatting?
 
LVL 7

Expert Comment

by:MrNed
ID: 34940481
You're sending it in HTML format, so you need HTML tags in your message body. A simple example that puts each item on a new line:

$output .= sprintf ("%s<br>", "$data3[$i]");
 

Author Comment

by:fac66
ID: 34940501
Again thank you! That worked.

One last thing I hope ;)

How do I space the columns?
I've tried:
$output .= sprintf ("%-20s<br>", "$data3[$i]");
$output .= sprintf ("%20s<br>", "$data3[$i]");

Do I need to incorporate HTML tags for spacing also?
 
LVL 7

Expert Comment

by:MrNed
ID: 34940512
Yes, HTML will compact all whitespace into a single space, so you should use tables or div/span tags as you prefer. Alternatively, don't use HTML email and it will keep all your spaces.
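
A rough sketch of the plain-text route, assuming the same kind of %mail hash used in the question. The row data and the plain_body helper are hypothetical, and the columns will only line up if the mail client shows the message in a fixed-width font:

```perl
use strict;
use warnings;

# Hypothetical rows; in the thread these would be derived from @data3.
my @rows = (['Just Go with It', '$30.5M'], ['The Eagle', '$8.7M']);

# Build a fixed-width plain-text body: title left-aligned in 20 columns,
# amount right-aligned in 8 columns.
sub plain_body {
    my @rows = @_;
    return join '', map { sprintf("%-20s %8s\n", @$_) } @rows;
}

my %mail = (
    Subject        => 'Report',
    Message        => plain_body(@rows),
    'Content-Type' => 'text/plain; charset="iso-8859-1"',  # plain text, not text/html
);

print $mail{Message};
```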
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940538
Alternatively, you could put <pre>...</pre> around your data, which tells HTML that it is pre-formatted text. You would then not need <br> either, since <pre> keeps ordinary newlines.
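
As a sketch of where the tags would go: the whole body is built first and then wrapped once, and the wrapped string is what gets passed as Message. The wrap_pre name and the sample body are assumptions for illustration only:

```perl
use strict;
use warnings;

# Wrap an already-formatted block of text in <pre> tags so an HTML mail
# client keeps its spaces and newlines.
sub wrap_pre {
    my ($text) = @_;
    return "<pre>$text</pre>";
}

# Hypothetical two-line body with padded columns.
my $body = sprintf("%-20s %8s\n", 'Sanctum', '$5.7M')
         . sprintf("%-20s %8s\n", 'The Roommate', '$8.1M');

print wrap_pre($body), "\n";
```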
 

Author Comment

by:fac66
ID: 34940550
Not sure I follow, where in the code would I add those tags?
 

Author Comment

by:fac66
ID: 34940570
I put it here and it worked:
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('%20s', "$data3[$i]");
print "<pre>$output\n</pre>";
}

But there still is no spacing, my goal is to make the columns uniform.
 
LVL 7

Expert Comment

by:MrNed
ID: 34940587
Try this

$output = "<table>";
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('<tr><td>%s</td></tr>', "$data3[$i]");
}
$output .= "</table>";
 
LVL 7

Expert Comment

by:MrNed
ID: 34940588
Or:

$output = "<table><tr>";
for (my $i = 0; $i <= $#data3; $i++)
{
$output .= sprintf ('<td>%s</td>', "$data3[$i]");
}
$output .= "</tr></table>";

With only one loop, do you want a single row of data?
 

Author Comment

by:fac66
ID: 34940604
Thank you for your persistence.

This is the email, it prints the table tags.

<table><tr><td>
  1 -- 18% Just Go with It  $30.5M $60.8M</td></tr><tr><td>
  2 -- 68% Justin Bieber: Never Say Never  $29.5M $48.5M</td></tr><tr><td>
  3 -- 54% Gnomeo and Juliet  $25.4M $50.4M</td></tr><tr><td>
  4 -- 36% The Eagle  $8.7M $15.1M</td></tr><tr><td>
  5 1 6% The Roommate  $8.1M $32.7M</td></tr><tr><td>
  6 4 95% The King's Speech  $7.2M $103.3M</td></tr><tr><td>
  7 3 51% No Strings Attached  $5.8M $66.0M</td></tr><tr><td>
  8 2 31% Sanctum  $5.7M $21.9M</td></tr><tr><td>
 

Author Comment

by:fac66
ID: 34940610
This is what I am after from the email report.

Data scraped from http://www.rottentomatoes.com/movies/box_office.php


##  ##  Movie Title                          Weekend       Cume  T-Meter

1   --  Just Go with It                       $30.5M     $30.5M      18%
2   --  Justin Bieber: Never Say Never        $29.5M     $30.3M      68%
3   --  Gnomeo and Juliet                     $25.4M     $25.4M      53%
4   --  The Eagle                              $8.7M      $8.7M      36%
5   1   The Roommate                           $8.1M     $25.8M       6%
6   4   The King's Speech                      $7.2M     $93.7M      95%
7   3   No Strings Attached                    $5.8M     $60.0M      51%
8   2   Sanctum                                $5.7M     $18.0M      31%
 
LVL 7

Expert Comment

by:MrNed
ID: 34940634
It looks like each element of @data3 stores the whole line, not each figure separately? You will need to split them out to format them the way you want. Check out this tutorial for how to do that in Perl.

You said before that the <br> tags formatted it correctly? If so I don't understand why the <table> tags don't work.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940638
I'm not sure why <pre>$output</pre> wouldn't work.  I just did a test and <pre> definitely should retain spacing.  If you look at the raw text of the email, do you see the spacing between columns?  What browser/email program are you using to look at the output of this program?

MrNed's solution of using html tables should also work.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940680
Ah.  I should have refreshed.  As MrNed said, it looks like each item in @data3 contains the whole line and not just a column.

This should work....
my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    $output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

 

Author Comment

by:fac66
ID: 34940707
That is really close!
I just need to tweak it a little.. thank you!
I need some sleep :)

01  00  18% Just Go with      It        $30.5M    $60.8M
02  00  68% Justin Bieber: Never Say  Never     $29.5M    $48.5M
03  00  54% Gnomeo and        Juliet    $25.4M    $50.4M
04  00  36% The               Eagle     $8.7M     $15.1M
05  01  6% The                Roommate  $8.1M     $32.7M
06  04  95% The King's        Speech    $7.2M     $103.3M
07  03  51% No Strings        Attached  $5.8M     $66.0M
08  02  31%                   Sanctum   $5.7M     $21.9M
09  08  95% True              Grit      $3.8M     $164.1M
10  05  46% The Green         Hornet    $3.7M     $95.1M
11  06  17% The               Rite      $3.3M     $31.4M
12  07  52% The               Mechanic  $3.2M     $27.9M
13  11  90% The               Fighter   $2.2M     $87.9M
14  09  88% Black             Swan      $2.1M     $101.5M
15  10  21% The               Dilemma   $1.0M     $48.2M
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940767
I missed that the % was the 3rd column in the input but you want it displayed as the last column...

I think this will do it.  If not, I likely typo'd something since I'm getting pretty bleary-eyed (off to bed as soon as I send this)...
my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    push @flds, splice(@flds, 2, 1);
    $output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940770
If you want the previous week ranking to be -- instead of 00, change the second %02d in the sprintf to be %02s instead.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34940772
And if you want the numbers to display as single digit rather than 0-padded, just change %02d to %2d in the sprintf.
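
A small sketch of how those conversions differ, using a hypothetical fmt helper that just wraps sprintf so each result can be inspected; the brackets in the printed output make the padding visible:

```perl
use strict;
use warnings;

# Format one value with a given sprintf conversion.
sub fmt {
    my ($spec, $value) = @_;
    return sprintf($spec, $value);
}

printf "[%s]\n", fmt('%02d',  5);            # zero-padded number, width 2
printf "[%s]\n", fmt('%2d',   5);            # space-padded number, width 2
printf "[%s]\n", fmt('%2s',   '--');         # string conversion keeps "--" as is
printf "[%s]\n", fmt('%-20s', 'The Eagle');  # left-aligned, padded to 20 columns
```

Note that passing "--" to a %d conversion would be treated as the number 0, which is where the 00 in the earlier output came from.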
 

Author Comment

by:fac66
ID: 34944310
Thanks for all your help guys. The alignment formatting is there; however, what is the best way to move a field?
I am trying to move the 'T-Meter' field to the end.
I have tried s///, but lose all formatting and
 
Here is current formatting in the email received:

Data scraped from http://www.rottentomatoes.com/movies/box_office.php"

Cur Prev T-Meter  Movie Title                                          Weekend    Cume
 1   --     56%        Unknown                                            $21.8M     $21.8M
 2   --     29%        I Am Number Four                               $19.5M     $19.5M
 3    3     54%       Gnomeo and Juliet                               $19.4M     $50.4M
 4    1     18%       Just Go with It                                     $18.2M     $60.8M

my $output;
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
     $output .= sprintf("%2d  %3s  %5s  %-35.35s  %20s  %9s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
#print $output;

my $mailFrom    =       'email@some.com';
my $subjectLine =       "Weekend Box Office Report";
my $message     =     'Data scraped from <a href="http://www.rottentomatoes.com/movies/box_office.php">http://www.rottentomatoes.com/movies/box_office.php</a>"';
my $head        =   "Cur Prev T-Meter Movie Title                                      Weekend    Cume";
$head = "<pre>$head</pre>";

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34944386
My comment 34940767 above (the last one with full code) does move the field.  All you need to do is replace your line 5 with lines 5-6 from my code above:

push @flds, splice(@flds, 2, 1);
$output .= sprintf("%02d  %02d  %-20s  %-8s  %-8s  %4s\n", @flds);

The push line is the one that moves T-Meter to the end.  The splice removes the element at index 2 (the third field, T-Meter) and returns the removed string.  The push then appends it to the end of the array.
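
The same idiom in isolation, as a sketch. The move_to_end name is made up for illustration, and the sample fields are taken from one line of the output earlier in the thread:

```perl
use strict;
use warnings;

# Move the element at $index to the end of the list and return the new list.
sub move_to_end {
    my ($index, @fields) = @_;
    # splice removes one element at $index and returns it; push appends it.
    push @fields, splice(@fields, $index, 1);
    return @fields;
}

my @flds = ('1', '--', '18%', 'Just Go with It', '$30.5M', '$60.8M');
print join(' | ', move_to_end(2, @flds)), "\n";
```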
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34944395
Or this sprintf (to keep the changes I mentioned and you made):

$output .= sprintf("%2d  %2s  %-20s  %-8s  %-8s  %4s\n", @flds);
 

Author Comment

by:fac66
ID: 34944924
My apologies, I missed that as I copied the code below.
Worked perfectly! Thanks again, you are a wiz at this.

One last piece I am struggling with: I am trying to extract just the dollar amounts from @data3:

1 -- 56% Unknown  $21.8M $21.8M
3 -- 29% I Am Number Four  $19.5M $19.5M
3 3 54% Gnomeo and Juliet  $19.4M $50.4M

Every filter I attempt brings everything or nothing at all.
Here is one of my attempts:

my @data4 = grep /\b\s+\$(\d.*?)\b/mg, @data3;
print "@data4\n";

Then I have to compare the amounts and indicate the highest and lowest:

Biggest Debut: The Roommate (1)
Weakest Debut: Sanctum (2)
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
ID: 34946329
The problem again is that @data3 contains lines - not fields.
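
To illustrate: grep returns the whole elements for which the pattern matches, not the captured text, so it can only give back entire lines. A sketch that collects the captures with map instead, assuming every amount looks like a dollar sign, digits, and an optional M or K suffix (the dollar_amounts name is hypothetical):

```perl
use strict;
use warnings;

# Pull every "$<number><suffix>" token out of each line.
# In list context, a /g match returns all captured strings.
sub dollar_amounts {
    my @lines = @_;
    return map { /(\$[\d.]+[MK]?)/g } @lines;
}

my @data3 = (
    '1 -- 56% Unknown  $21.8M $21.8M',
    '3 3 54% Gnomeo and Juliet  $19.4M $50.4M',
);
print join(' ', dollar_amounts(@data3)), "\n";
```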

The following code should work provided I haven't made any typos.  It will tell you the biggest and weakest weeks.  If you want just debuts then you can add "if ($flds[1] eq '--')" to the end of the line assigning to %money in the loop.
use List::Util (qw(max min));

my ($output, %money);
foreach my $line (@data3) {
    my @flds = ($line =~ m{^\s*(\d+)\s+(\S+)\s+(\S+)\s+(.*?)\s+(\S+)\s+(\S+)\s*$});
    die "could not parse line: $line" unless (@flds == 6);
    # add hash element based on $ with title
    $money{&convert($flds[-2])} = $flds[3];
    push @flds, splice(@flds, 2, 1);
    $output .= sprintf("%2d  %2s  %-20s  %-8s  %-8s  %4s\n", @flds);
    # alternately to use html tables, comment out above and uncomment below line
    #$output .= '<tr><td>' . join('</td><td>', @flds) . "</td></tr>\n";
}
$output = "<pre>$output</pre>";
# again, to use html tables, comment out above and uncomment below line
#$output = "<table>$output</table>";
print $output;

print "Biggest Week: ", $money{max(keys %money)}, "\n";
print "Weakest Week: ", $money{min(keys %money)}, "\n";
 
my $mailFrom    =       'email@some.com';
my $subjectLine =       "Report";
#my $message     =        

my %mail = ( To      => $mailTo,
             From    => $mailFrom,
             Subject => $subjectLine,
             Message => $output,   
             'Content-Type' => 'text/html; charset="iso-8859-1"'  
            );
sendmail(%mail);

sub convert {
    my ($str) = @_;
    # assuming the suffix will always be M or K (not B); no suffix means plain dollars
    unless ($str =~ m{^\$([\d\.]+)([MK])?$}) {
        die "could not convert $str into amount\n";
    }
    my ($amt, $mod) = ($1, $2);
    # the suffix is optional in the pattern, so $mod may be undef
    return $amt unless defined $mod;
    return ($mod eq 'M') ? $amt*1000000 : $amt*1000;
}
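
One limitation of keying a hash by amount is that two films with the same weekend gross overwrite each other. A sketch of an alternative that keeps [title, amount] pairs and sorts them; the to_dollars and extremes names are made up here, and the amount format is assumed to be the same as above:

```perl
use strict;
use warnings;

# Convert "$30.5M", "$850K" or "$1200" to a plain number of dollars.
sub to_dollars {
    my ($str) = @_;
    $str =~ m{^\$([\d.]+)([MK])?$} or die "could not convert $str\n";
    my ($amt, $suffix) = ($1, $2);
    my %scale = (M => 1_000_000, K => 1_000);
    return $amt * (defined $suffix ? $scale{$suffix} : 1);
}

# Given [title, amount-string] pairs, return (biggest title, weakest title).
sub extremes {
    my @sorted = sort { to_dollars($a->[1]) <=> to_dollars($b->[1]) } @_;
    return ($sorted[-1][0], $sorted[0][0]);
}

my ($biggest, $weakest) = extremes(
    ['The Roommate', '$8.1M'],
    ['Just Go with It', '$30.5M'],
    ['Sanctum', '$5.7M'],
);
print "Biggest Week: $biggest\n";
print "Weakest Week: $weakest\n";
```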

 

Author Comment

by:fac66
ID: 34947602
Thank you for your help.
Much appreciated!