Solved

Extract data from a file using perl

Posted on 2011-03-19
Last Modified: 2012-05-11
Hi,
I want to extract some data from "file1.txt"

file1.txt :
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  yyyy yyyyyy: yyyyyyyy
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b
20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  zzzzzzzz zzz : zzzzzz zzz

19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz
19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss

My output should be:

20.03.2011 00:35:42  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  ccc cc c : sss ss

The values of the b's and s's may change.
I need a Perl script.
Question by:pravink22
12 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 35174279
perl -lane 'next unless $F[6] eq ":"; $F[2]=""; print "@F"' file1.txt
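Spelled out as a script, the one-liner above does roughly the following. This is only a sketch: keep_fields is a made-up name, and the approach assumes the wanted labels are always exactly three words, so that the ":" lands in field 6. That happens to hold for "aaaa aa a" and "ccc cc c" in the sample data, but not for the other labels.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# keep_fields is a hypothetical helper that mirrors what -lane does per line.
sub keep_fields {
    my ($line) = @_;
    my @F = split ' ', $line;                    # -a: split on whitespace
    return undef unless @F > 6 && $F[6] eq ':';  # ":" must be the 7th field
    $F[2] = '';                                  # blank out the xxxx column
    return "@F";                                 # rejoin with single spaces
}

my @sample = (
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  yyyy yyyyyy: yyyyyyyy",
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b",
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  zzzzzzzz zzz : zzzzzz zzz",
);

for my $line (@sample) {
    my $kept = keep_fields($line);
    print "$kept\n" if defined $kept;
}
```

Because the emptied field is still joined with spaces on both sides, the date/time and the label end up separated by two spaces, as in the requested output.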
 

Author Comment

by:pravink22
ID: 35174294
This is my script,
********************************************
#!/usr/bin/perl
print "Enter the file name:";
$input = <STDIN>;
open (INFILE, "<$input");
open(OUTFILE, ">>out.csv");
@array = <INFILE>;
$count = 0;
while ($count <= $#array) {
if ( @array($count) = aaaa aa a) || (@array($count) = ccc cc c) {
print "date time aaaa aa a : ?????????"}
$count++
}
close(INFILE);
close(OUTFILE);
*****************************************

Can you guide me now?
 
LVL 84

Expert Comment

by:ozo
ID: 35174322
#!/usr/bin/perl                                                                                      
print "Enter the file name:";
$input = <STDIN>;
open (INFILE, "<$input");
open(OUTFILE, ">>out.csv");
while( <INFILE> ){
 if( /aaaa aa a(.*)/ || /ccc cc c(.*)/ ){
     print "date time aaaa aa a : $1\n";
 }
}

 

Author Comment

by:pravink22
ID: 35174331
Hi ozo,

I tried it, but out.csv is empty (0 KB).

Please find the attachments
file1.txt
test.pl.txt
out.csv
 
LVL 84

Expert Comment

by:ozo
ID: 35174404
What do you want out.csv to be?
 

Author Comment

by:pravink22
ID: 35174406
My output should be:

20.03.2011 00:35:42  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  aaaa aa a : bb bb bb b b
19.03.2011 13:40:55  ccc cc c : sss ss
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 35174412
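#!/usr/bin/perl
use strict;
use warnings;

print "Enter the file name:";
chomp(my $input = <STDIN>);    # drop the trailing newline from the file name
open(my $in,  '<',  $input)    or die "Cannot open $input: $!";
open(my $out, '>>', 'out.csv') or die "Cannot open out.csv: $!";
while (<$in>) {
    # $1 = date and time, $2 = the wanted label through the end of the line
    if (/(\S+\s+\S+).*?((?:aaaa aa a|ccc cc c).*)/) {
        print $out "$1 $2\n";
    }
}
close($in);
close($out);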
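
To try the pattern without creating any files, the same match can be wrapped in a subroutine and run against the sample lines from the question. This is a sketch rather than part of the accepted answer; extract_line is a made-up name, and the two labels are hard-coded on the assumption that they are the only ones wanted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_line is a hypothetical helper, not part of the original answer.
# It returns "date time label : values" for lines carrying one of the two
# wanted labels, and undef for every other line.
sub extract_line {
    my ($line) = @_;
    if ($line =~ /(\S+\s+\S+).*?((?:aaaa aa a|ccc cc c).*)/) {
        return "$1 $2";    # $1 = date and time, $2 = label to end of line
    }
    return undef;
}

my @sample = (
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  yyyy yyyyyy: yyyyyyyy",
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss",
);

for my $line (@sample) {
    my $result = extract_line($line);
    print "$result\n" if defined $result;
}
```

Note that the date/time and the label are joined with a single space here, as in the script above.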
 

Author Comment

by:pravink22
ID: 35174415
Its working Great !!! :-)
 

Author Comment

by:pravink22
ID: 35174908
How can I get output like this?

date             Time        aaaa aa a       ccc cc c
20.03.2011 00:35:42  bb bb bb b b
19.03.2011 13:40:55  bb bb bb b b
19.03.2011 13:40:55                         sss ss


 
LVL 31

Expert Comment

by:farzanj
ID: 35175034
Try this
my $file = 'file1.txt';

open(INFO,"<$file") or die "Could not open the file";
my @lines = <INFO>;
close(INFO);
my $filter = "yyyyy|zzzzz|^$";
my @filter = grep(! /$filter/, @lines);
print "date             Time        aaaa aa a       ccc cc c";
print @filter;

 
LVL 31

Expert Comment

by:farzanj
ID: 35175040
Sorry, there was a slight problem in the last one. Please ignore it.
 
my $file = 'file1.txt';

open(INFO, '<', $file) or die "Could not open the file: $!";
my @lines = <INFO>;
close(INFO);
# Drop the yyyy lines, the zzzz lines, and blank lines.
my $filter = 'yyyyy|zzzzz|^$';
my @filter = grep(! /$filter/, @lines);
print "date             Time        aaaa aa a       ccc cc c\n";
print @filter;

 
LVL 12

Expert Comment

by:tel2
ID: 35202539
Hi pravink22,

The above message says:

"Notice: pravink22 has requested that this question be closed by accepting pravink22's comment #35174415 (0 points) as the solution for the following reason:
Its working Great !!! :-)
..."

If it works great, pravink22, then I suggest you allocate points to the expert who made it work great.

I see that *after* you said it works great (so I guess your original request had been satisfied), you changed your requirements to a format which has got some of the data ("aaaa aa a       ccc cc c") in the heading line.  This is a problem, because:
  1. An expert has already gone to the trouble of writing code to satisfy your original requirements, but that expert has not been given any points for that effort.
  2. If you keep changing your requirements, it just adds more work to the experts, who would probably not have spent any time on this question if they knew, up front, that the requirements were going to change so much.
  3. Your latest requirements don't look very logical, since they contain part of one of the detail records in the heading line.  You have not explained the rules for working out what data to put in the heading line, so an expert may be able to make it work for the test data you've provided, but it may not work for some other data, which will mean that, once again, you will not be satisfied, and will probably ask for more changes.  I'm guessing that this is the main reason you've had no recent comments from experts.
  4. Even if you had not changed your requirements, you haven't responded to farzanj's latest post.  Yes, I know he hasn't tested his code, and it doesn't meet your first or latest requirements, but you should have told him that, and given farzanj a chance to fix it.

I suggest you award points to the expert who has provided a solution which met your original requirements, and if you still require the new format, explain the logic of it clearly in a new question.  Failing that, you might be hearing from a (real) moderator soon.
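
For what it's worth, here is one possible reading of the column layout requested in comment 35174908. Since the rules were never stated, this sketch assumes that the text before " : " selects the column and the text after it is the cell value, and that only the two labels from the question become columns. The name format_row and the column widths are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: these two labels are the only columns wanted.
my @labels = ('aaaa aa a', 'ccc cc c');

# Date and time columns, then one 14-character column per label.
# The widths are arbitrary choices for this sketch.
my $format = '%-10s %-8s  ' . join(' ', ('%-14s') x @labels);

# format_row is a hypothetical helper. It returns one aligned row for a
# line whose label is in @labels, and undef for any other line.
sub format_row {
    my ($line) = @_;
    return undef
        unless $line =~ /^(\S+)\s+(\S+)\s+\S+\s+(.*?)\s*:\s*(.*)$/;
    my ($date, $time, $label, $value) = ($1, $2, $3, $4);
    my %cell = map { ($_ => '') } @labels;    # every column starts empty
    return undef unless exists $cell{$label};
    $cell{$label} = $value;
    my $row = sprintf $format, $date, $time, map { $cell{$_} } @labels;
    $row =~ s/\s+$//;    # trim padding after the last filled column
    return $row;
}

my @sample = (
    "20.03.2011 00:35:42 xxxxxxxxxxxxxxxxxxxx  aaaa aa a : bb bb bb b b",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  zzzzzzzzzzzz : zzzzzz zzzz",
    "19.03.2011 13:40:55 xxxxxxxxxxxxxxxxxxxx  ccc cc c : sss ss",
);

printf "$format\n", 'date', 'Time', @labels;
for my $line (@sample) {
    my $row = format_row($line);
    print "$row\n" if defined $row;
}
```

If the real data has other labels that should become columns, @labels would need to list them, which is exactly the rule that would have to be spelled out in a new question.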