Solved

Extract data from a file using perl

Posted on 2011-03-19
Last Modified: 2012-05-11
Hi,
I want to extract some data from "file1.txt"

file1.txt :
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  yyyy yyyyyy: yyyyyyyy
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  zzzzzzzz zzz : zzzzzz zzz

19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss

My output should be:

20.03.2011 00:35:42  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  ccc cc c : sss ss

The values of the b's and s's may change.
I need a Perl script.
Question by:pravink22
12 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 35174279
perl -lane 'next unless $F[6] eq ":"; $F[2] = ""; print "@F"' file1.txt
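
For readers unfamiliar with the switches, here is a rough script-form sketch of what the one-liner does (-n loops over the input, -a splits each line on whitespace into @F, -l takes care of line endings). The helper name extract_line and the embedded sample lines are only illustrative, and the approach assumes the wanted labels are always exactly three words, so that the colon is the seventh field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the reformatted line, or undef to skip it.
sub extract_line {
    my ($line) = @_;
    my @F = split ' ', $line;                           # like -a: split on whitespace
    return unless defined $F[6] && $F[6] eq ':';        # 7th field must be the colon
    $F[2] = '';                                         # blank out the xxxx field
    return "@F";                                        # join fields with single spaces
}

# Illustrative sample lines copied from file1.txt in the question.
my @sample = (
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  yyyy yyyyyy: yyyyyyyy\n",
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b\n",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz\n",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss\n",
);

for my $line (@sample) {
    my $out = extract_line($line);
    print "$out\n" if defined $out;
}
```

Because the blanked field is still joined with spaces, two spaces remain between the time and the label, as in the requested output. A label with a different number of words would not be picked up by this field-position test.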
 

Author Comment

by:pravink22
ID: 35174294
This is my script,
********************************************
#!/usr/bin/perl
print "Enter the file name:";
$input = <STDIN>;
open (INFILE, "<$input");
open(OUTFILE, ">>out.csv");
@array = <INFILE>;
$count = 0;
while ($count <= $#array) {
if ( @array($count) = aaaa aa a) || (@array($count) = ccc cc c) {
print "date time aaaa aa a : ?????????"}
$count++
}
close(INFILE);
close(OUTFILE);
*****************************************

Can you guide me now?
 
LVL 84

Expert Comment

by:ozo
ID: 35174322
#!/usr/bin/perl                                                                                      
print "Enter the file name:";
$input = <STDIN>;
open (INFILE, "<$input");
open(OUTFILE, ">>out.csv");
while( <INFILE> ){
 if( /aaaa aa a(.*)/ || /ccc cc c(.*)/ ){
     print "date time aaaa aa a : $1\n";
 }
}

 

Author Comment

by:pravink22
ID: 35174331
Hi ozo,

I tried it, but out.csv is empty (0 KB).

Please find the attachments
file1.txt
test.pl.txt
out.csv
 
LVL 84

Expert Comment

by:ozo
ID: 35174404
what do you want out.csv to be?
 

Author Comment

by:pravink22
ID: 35174406
My output should be:

20.03.2011 00:35:42  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  ccc cc c : sss ss
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 35174412
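#!/usr/bin/perl
use strict;
use warnings;

print "Enter the file name: ";
chomp(my $input = <STDIN>);    # drop the trailing newline from the typed name
# Three-argument open with lexical filehandles; stop with a message on failure
open(my $in,  '<',  $input)    or die "Cannot open $input: $!";
open(my $out, '>>', 'out.csv') or die "Cannot open out.csv: $!";
while (<$in>) {
    # $1 = date and time, $2 = the label and everything after it
    if (/(\S+\s+\S+).*?(aaaa aa a.*)/ || /(\S+\s+\S+).*?(ccc cc c.*)/) {
        print $out "$1 $2\n";
    }
}
close($in);
close($out);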
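
If more labels need to be matched later, the two patterns could be built from a list instead of being written out by hand. This is only a sketch under that assumption: the @labels array and the function name extract are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Assumption: these are the labels of interest; add more entries as needed.
my @labels = ('aaaa aa a', 'ccc cc c');

# Build one alternation from the labels, escaping any regex metacharacters.
my $alt = join '|', map { quotemeta } @labels;
my $re  = qr/^(\S+\s+\S+).*?((?:$alt).*)/;

# Return "date time label : value" for a wanted line, or undef otherwise.
sub extract {
    my ($line) = @_;
    chomp $line;
    return $line =~ $re ? "$1 $2" : undef;
}

# Small demonstration on two lines from the question's sample data.
for my $line (
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz\n",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss\n",
) {
    my $result = extract($line);
    print "$result\n" if defined $result;
}
```

Whatever follows the label is captured as-is, so changing b and s values need no pattern changes.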
 

Author Comment

by:pravink22
ID: 35174415
Its working Great !!! :-)
 

Author Comment

by:pravink22
ID: 35174908
How can I get output like this?

date             Time        aaaa aa a       ccc cc c
20.03.2011 00:35:42  bb bb bb b b
19.03.2011 13:40:55  bb bb bb b b
19.03.2011 13:40:55                         sss ss


 
LVL 31

Expert Comment

by:farzanj
ID: 35175034
Try this
my $file = 'file1.txt';

open(INFO,"<$file") or die "Could not open the file";
my @lines = <INFO>;
close(INFO);
my $filter = "yyyyy|zzzzz|^$";
my @filter = grep(! /$filter/, @lines);
print "date             Time        aaaa aa a       ccc cc c";
print @filter;

 
LVL 31

Expert Comment

by:farzanj
ID: 35175040
Sorry, there was a slight problem in the last one. Please ignore it.
 
my $file = 'file1.txt';

open(my $info, '<', $file) or die "Could not open $file: $!";
my @lines = <$info>;
close($info);
# Drop the unwanted yyyyy/zzzzz records and the empty lines
my $filter = 'yyyyy|zzzzz|^$';
my @filter = grep(! /$filter/, @lines);
print "date             Time        aaaa aa a       ccc cc c\n";
print @filter;
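
One possible reading of the requested column layout is that each known label becomes a column heading and the text after the colon is placed under its label. The sketch below works only under that assumption. The rules for the heading line were not stated in the question, so the @labels list, the column widths and the function name format_row are all guesses made for illustration.

```perl
use strict;
use warnings;

# Assumption: the columns are exactly these two labels, in this order.
my @labels = ('aaaa aa a', 'ccc cc c');

# Fixed-width layout: date, time, then one 14-character column per label.
my $fmt = '%-10s %-8s  ' . join(' ', ('%-14s') x @labels);

# Build one output row, or return undef if the line has none of the labels.
sub format_row {
    my ($line) = @_;
    for my $i (0 .. $#labels) {
        if ($line =~ /^(\S+)\s+(\S+).*?\Q$labels[$i]\E\s*:\s*(.*?)\s*$/) {
            my @cells = ('') x @labels;    # empty cell for every column
            $cells[$i] = $3;               # value goes under its own label
            my $row = sprintf $fmt, $1, $2, @cells;
            $row =~ s/\s+$//;              # trim trailing padding
            return $row;
        }
    }
    return;
}

# Heading line followed by rows for a few sample lines from the question.
my $header = sprintf $fmt, 'date', 'Time', @labels;
$header =~ s/\s+$//;
print "$header\n";
for my $line (
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b\n",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz\n",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss\n",
) {
    my $row = format_row($line);
    print "$row\n" if defined $row;
}
```

Values longer than 14 characters would push later columns out of alignment, so the width would need adjusting for real data.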

 
LVL 12

Expert Comment

by:tel2
ID: 35202539
Hi pravink22,

The above message says:

"Notice: pravink22 has requested that this question be closed by accepting pravink22's comment #35174415 (0 points) as the solution for the following reason:
Its working Great !!! :-)
..."

If it works great, pravink22, then I suggest you allocate points to the expert who made it work great.

I see that *after* you said it works great (so I guess your original request had been satisfied), you changed your requirements to a format which has got some of the data ("aaaa aa a       ccc cc c") in the heading line.  This is a problem, because:
  1. An expert has already gone to the trouble of writing code to satisfy your original requirements, but that expert has not been given any points for that effort.
  2. If you keep changing your requirements, it just adds more work to the experts, who would probably not have spent any time on this question if they knew, up front, that the requirements were going to change so much.
  3. Your latest requirements don't look very logical, since they contain part of one of the detail records in the heading line.  You have not explained the rules for working out what data to put in the heading line, so an expert may be able to make it work for the test data you've provided, but it may not work for some other data, which will mean that, once again, you will not be satisfied, and will probably ask for more changes.  I'm guessing that this is the main reason you've had no recent comments from experts.
  4. Even if you had not changed your requirements, you haven't responded to farzanj's latest post.  Yes, I know he hasn't tested his code, and it doesn't meet your first or latest requirements, but you should have told him that, and given farzanj a chance to fix it.

I suggest you award points to the expert who has provided a solution which met your original requirements, and if you still require the new format, explain the logic of it clearly in a new question.  Failing that, you might be hearing from a (real) moderator soon.