Solved

Need some assistance modifying a PERL script

Posted on 2011-09-03
Last Modified: 2012-05-12
Hi Team,

   I would just like to seek some help modifying a script already created for me by one of our valuable members (wilcoxon)

  The script parses a raw syslog file and splits each line into fields enclosed in double quotes.  The original script is as follows:

#!/usr/bin/perl
use strict;
use warnings;

# you need to replace the files in the open lines with the real ones
open IN, 'Syslog' or die "could not read syslogs: $!";
open OUT, '>syslog.csv' or die "could not write syslog.csv: $!";
while (<IN>) {
    chomp;
    s{"}{'}g; # replace " with ' just to make life easier
    if (m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print OUT join(',', map { '"' . $_ . '"' } $1, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$_";
    }
}



The output of the script looks like this:
"Sep","1","21:52:33","10.22.14.8","local/LB1 notice: 011ae020:5: Connection in progress to 10.9.8.8  "
"Sep","1","21:52:33","10.4.174.67","local/LB2 notice  011ae020:5: Connection in progress to 10.9.8.9"

We now have a new requirement to change the month field to its two-digit numeric equivalent, e.g. 01 for Jan, 02 for Feb, ... 09 for Sep, etc.  Also, the double quotes should be omitted.

Using the same two lines above, the final output should be like this:


09, 1, 21:52:33, 10.22.14.8, local/LB1 notice: 011ae020:5: Connection in progress to 10.9.8.8  
09, 1, 21:52:33, 10.4.174.67, local/LB2 notice  011ae020:5: Connection in progress to 10.9.8.9

May I just request help on how the original script should be modified to achieve my intended outcome?

Thanks very much.


Question by:rleyba828
LVL 24

Expert Comment

by:mankowitz
ID: 36479736
First, to get rid of the quotes, remove the map function
Second... use this hash to map month names to numbers

%mon2num = qw(
  jan 1  feb 2  mar 3  apr 4  may 5  jun 6  jul 7  aug 8  sep 9  oct 10 nov 11 dec 12
);

 
#!/usr/bin/perl
use strict;
use warnings;

# you need to replace the files in the open lines with the real ones
open IN, 'Syslog' or die "could not read syslogs: $!";
open OUT, '>syslog.csv' or die "could not write syslog.csv: $!";
while (<IN>) {
    chomp;
    s{"}{'}g; # replace " with ' just to make life easier
    if (m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print OUT join(',', $mon2num{lc $1}, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$_";
    }
}

 

Author Comment

by:rleyba828
ID: 36479776
Hi mankowitz,

   Thanks for this......sorry I am not a linux expert....but where do you declare the %mon2num variable?

 
LVL 28

Assisted Solution

by:FishMonger
FishMonger earned 800 total points
ID: 36480614
It would probably be better to use the Text::CSV module for this, but here are the adjustments to your script that I'd make without using the module.  A couple of the adjustments were made to bring it more in line with PBP (Perl Best Practices).

#!/usr/bin/perl

use strict;
use warnings;

my %mon2num = (Jan => '01', Feb => '02', Mar => '03', Apr => '04',
               May => '05', Jun => '06', Jul => '07', Aug => '08',
               Sep => '09', Oct => '10', Nov => '11', Dec => '12');

# you need to replace the files in the open lines with the real ones
open my $syslog_fh, ',', 'Syslog' or die "could not read syslogs: $!";
open my $csvlog_fh, '>', 'syslog.csv' or die "could not write syslog.csv: $!";

while ( my $line = <$syslog_fh> ) {
    chomp $line;
    $line =~ tr/"/'/; # replace " with ' just to make life easier
    if ($line =~ m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print $csvlog_fh join(',', $mon2num{$1}, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$line";
    }
}
close $syslog_fh;
close $csvlog_fh;


 
LVL 24

Accepted Solution

by:
mankowitz earned 1200 total points
ID: 36481328
oops. you can define it anywhere you want:
#!/usr/bin/perl
use strict;
use warnings;

%mon2num = qw(
  jan 1  feb 2  mar 3  apr 4  may 5  jun 6  jul 7  aug 8  sep 9  oct 10 nov 11 dec 12
);

# you need to replace the files in the open lines with the real ones
open IN, 'Syslog' or die "could not read syslogs: $!";
open OUT, '>syslog.csv' or die "could not write syslog.csv: $!";
while (<IN>) {
    chomp;
    s{"}{'}g; # replace " with ' just to make life easier
    if (m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print OUT join(',', $mon2num{lc $1}, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$_";
    }
}

 
LVL 28

Expert Comment

by:FishMonger
ID: 36481376
Assigning to a hash in that manor is poor style.  It's better to use the fat comma, and you forgot the 'my' keyword.  Also, the OP wants a 2 digit format for the month, which your hash doesn't account for in months 1..9.

My example has a minor typo on opening the $syslog_fh filehandle.  It should read as:
open my $syslog_fh, '<', 'Syslog' or die "could not read syslogs: $!";
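
As a rough illustration of why that one character matters: with the three-argument form, open itself rejects a mode it does not recognise, so the script stops before the "or die" message is ever reached. The sketch below traps that error with eval and then reads from an in-memory string using the corrected '<' mode. The sample line and the names $sample, $mode_error and $first are made up for the demonstration and are not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The mistyped mode ',' is a fatal error raised by open itself;
# eval is used here only so the message can be captured and shown.
my $opened = eval { open my $fh, ',', 'Syslog'; 1 };
my $mode_error = $opened ? '' : $@;
print "bad mode: $mode_error";

# Corrected 3-arg form. An in-memory "file" (a reference to a string)
# stands in for the real Syslog file so the sketch is self-contained.
my $sample = "Sep  1 21:52:33 10.22.14.8 local/LB1 notice\n";
open my $syslog_fh, '<', \$sample or die "could not read sample: $!";
my $first = <$syslog_fh>;
close $syslog_fh;
print "read: $first";
```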

 
LVL 24

Expert Comment

by:mankowitz
ID: 36485682
@fish

1. Assigning a hash in a way that works is always good style in perl.
2. For a script that's only 20 lines, I don't really need to worry about variable scope. I don't think the my keyword will make much of a difference here.
3. You're right about the 1 digit v. 2 digit format.
4. You wrote manor, but I think you mean manner. In perl, as in human languages, spelling is important.
#!/usr/bin/perl
use strict;
use warnings;

%mon2num = qw(jan 01 feb 02 mar 03 apr 04 may 05 jun 06 jul 07 aug 08 sep 09 oct 10 nov 11 dec 12);

# you need to replace the files in the open lines with the real ones
open IN, 'Syslog' or die "could not read syslogs: $!";
open OUT, '>syslog.csv' or die "could not write syslog.csv: $!";
while (<IN>) {
    chomp;
    s{"}{'}g; # replace " with ' just to make life easier
    if (m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print OUT join(',', $mon2num{lc $1}, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$_";
    }
}

 
LVL 28

Expert Comment

by:FishMonger
ID: 36489458
mankowitz,

1) I never said that your hash assignment wouldn't work, but it is poor style, in part because it's fairly easy to use an odd number of elements and if you have a lengthy assignment, it can be difficult to determine what is missing.

2) In this case scoping is not an issue, but 20 line scripts often get expanded or inserted into much larger scripts where scoping will be an issue.

3) The my keyword is important and will make a big difference here.  Based on your assertion that it wouldn't I can only assume that you didn't test your script.  Did you know that it won't compile?

4) The use of bareword filehandles is discouraged in today's Perl coding standards.  It is best practice to use a lexical var for the filehandle and the 3 arg form of open.

5) It's also best practice to keep line lengths below 80 characters, some would say no more than 72.

6) Thank you for pointing out my spelling error.  I try to learn from my mistakes.
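
Points 1 and 3 can be checked with a small sketch like the one below. It is only a demonstration, not part of either script: the string eval exists purely so the compile error can be captured and printed, and %short, $strict_error and @warnings are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Point 3: under "use strict", assigning to an undeclared %mon2num
# is a compile-time error. The code is compiled inside a string eval
# so the failure can be caught instead of stopping this script.
my $ok = eval q{
    use strict;
    %mon2num = qw(jan 01 feb 02);
    1;
};
my $strict_error = $ok ? '' : $@;
print "without my: $strict_error";

# Point 1: a qw() list with a missing value still compiles; the only
# hint is a runtime warning, which is collected here for display.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my %short = qw(jan 01 feb);    # 'feb' has lost its value
}
print "odd list: $warnings[0]";
```

With the fat comma form (Jan => '01', ...) each key sits visibly next to its value, so a missing value is easier to spot when reading the code.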
 
LVL 24

Expert Comment

by:mankowitz
ID: 36492631
@fish-

touche.

thanks for keeping me humble
 

Author Comment

by:rleyba828
ID: 36517844
Hi Team,

  Sorry for the delay in replying and  thanks very much for your tips.  My final script below is based on your suggestions above.

#!/usr/bin/perl
use strict;
use warnings;

my %mon2num = qw(
  jan 01  feb 02  mar 03  apr 04  may 05  jun 06  jul 07  aug 08  sep 09  oct 10 nov 11 dec 12
);

# you need to replace the files in the open lines with the real ones
open IN, 'Syslog' or die "could not read syslogs: $!";
open OUT, '>syslog.csv' or die "could not write syslog.csv: $!";
while (<IN>) {
    chomp;
    s{"}{'}g; # replace " with ' just to make life easier
    if (m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}) {
        print OUT join(',', $mon2num{lc $1}, $2, $3, $4, $5), "\n";
    } else {
        die "could not parse line $.:\n$_";
    }
}
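
For reference, the requested output in the question has a comma followed by a space between fields, while join(',') emits a bare comma. A minimal sketch of the per-line conversion with that separator is shown below. It assumes the raw lines look like the sample, which is reconstructed from the output quoted in the question; convert_line is a hypothetical helper name, not something from the scripts above. It returns undef for lines that do not parse or that carry an unknown month name, leaving the caller to decide whether to die or skip.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %mon2num = (
    jan => '01', feb => '02', mar => '03', apr => '04',
    may => '05', jun => '06', jul => '07', aug => '08',
    sep => '09', oct => '10', nov => '11', dec => '12',
);

# Hypothetical helper: returns the converted line, or undef when the
# line does not match the pattern or the month name is not recognised.
sub convert_line {
    my ($line) = @_;
    chomp $line;
    my ($mon, $day, $time, $host, $msg) =
        $line =~ m{^(\w+)\s+(\d+)\s+([\d:]+)\s+(\S+)\s+(.*)$}
        or return;
    my $num = $mon2num{ lc $mon };
    return unless defined $num;
    return join ', ', $num, $day, $time, $host, $msg;
}

# Sample input is an assumption based on the output in the question.
print convert_line('Sep  1 21:52:33 10.22.14.8 local/LB1 notice: 011ae020:5: Connection in progress to 10.9.8.8'), "\n";
# prints: 09, 1, 21:52:33, 10.22.14.8, local/LB1 notice: 011ae020:5: Connection in progress to 10.9.8.8
```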
