dprasad

Remove leading whitespace

I have a text file in the format:

(Future) Ch. Giscours Margaux 750ml 12 per case
WA 90-92 points, This saturated ruby/purple-colored effort offers notes of liquid minerals interspersed with subtle toasty oak, black currant, graphite and a hint of flowers.
$28 per 750ml Bottle      
$317 per Case of 12      
       2002 Future
       Red Blend
       Bordeaux
       France
       Red
       Margaux
(Future) Ch. Giscours Margaux 750ml 12 per case
WS 92-94 points, Intense aromas of cooked fruit and prunes. Full-bodied, with rich and ripe fruit character. Soft, round tannins and a long finish. Heady and rich.  
$38 per 750ml Bottle
$448 per Case of 12
       2003 Future
       Red Blend
       Bordeaux
       France
       Red
        Margaux       
      
The whitespace at the beginning of each line needs to be removed so that every line starts in the first column. I need to adapt this script to read an external text file and reformat it:

#!/usr/bin/perl
use strict;

my $a="   parse text    ";

$a =~ s/(^ *)||( *$)//g;
print "$a.\n";
dprasad

ASKER

I tried this:

#!/usr/bin/perl
use strict;



$outfile = 'out.txt';
 
open (FILE, 'mydata.txt') or die "cannot open $file: $!"; # opens the file
open (outfile, 'out.txt') or die "cannot open $file: $!"; # opens the file

@my_file = <FILE>;





@my_file =~ s/(^ *)||( *$)//g;
print outfile "@my_file.\n";

gives error:

Global symbol "$outfile" requires explicit package name at p1.pl line 6.
Global symbol "$file" requires explicit package name at p1.pl line 8.
Global symbol "$file" requires explicit package name at p1.pl line 9.
Global symbol "@my_file" requires explicit package name at p1.pl line 11.
Global symbol "@my_file" requires explicit package name at p1.pl line 17.
Global symbol "@my_file" requires explicit package name at p1.pl line 18.
Execution of p1.pl aborted due to compilation errors.
Tintin
No need for a script, just do:

perl -pe 's/^\s+//' mydata.txt >out.txt

Try this

#!/usr/bin/perl
use strict;

my $outfile = 'out.txt';
open (FILE, "<mydata.txt") or die "cannot open mydata.txt: $!"; # opens the file
open (outfile, ">out.txt") or die "cannot open $outfile: $!"; # opens the file

while <FILE>
{
   $_ =~ s/(^ *)||( *$)//g;
   print(outfile "$_\n";
}
close(outfile);
dprasad

ASKER

Hmm, OK. When I do that, the output file is in the same format as the input, i.e. no changes are made.

If you want to write it as a script, I'd do it like this:

#!/usr/bin/perl
use strict;

my $data = 'mydata.txt';
my $outfile = 'out.txt';

open FILE, $data or die "Can not open $data $!\n";
open OUTPUT, ">$outfile" or die "Can not open $outfile $!\n";

while (<FILE>) {
  s/^\s+/;
  print OUTPUT;
}

Now, your sample script also tries to delete trailing whitespace. If you need that, add

s/\s+$//;

to the while loop.
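
One caveat with adding that substitution to the loop: `\s` also matches the newline, so `s/\s+$//` removes the line ending and the output lines run together unless a newline is printed back. A minimal sketch of a loop body that handles both ends; `trim_line` is a hypothetical helper name and the sample lines are only modelled on the data in the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# trim_line is a hypothetical helper, not something from the thread.
# It strips leading and trailing whitespace (spaces, tabs, newline)
# from one line and puts a single newline back.
sub trim_line {
    my ($line) = @_;
    $line =~ s/^\s+//;    # leading whitespace, including tabs
    $line =~ s/\s+$//;    # trailing whitespace, including the newline
    return "$line\n";
}

# Sample lines modelled on the data in the question.
my @in  = ("\t   2002 Future   \n", "\$28 per 750ml Bottle      \n", "   \n");
my @out = map { trim_line($_) } @in;
print @out;
```

Because the newline is always added back, a line that held only whitespace comes out as an empty line instead of disappearing.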
dprasad

ASKER

ok, yeah I just need the leading spaces. I get the error:

Substitution replacement not terminated at p1.pl line 11.

which is: s/^\s+/;

I made a few errors; this one works for me:

my $outfile = 'out.txt';
open (FILE, "<mydata.txt") or die "cannot open mydata.txt: $!"; # opens the file
open (outfile, ">out.txt") or die "cannot open $outfile: $!"; # opens the file

while (<FILE>)
{
   chomp;
   $_ =~ s/(^ *)||( *$)//g;
   print(outfile "$_\n");
}
close(outfile);
dprasad

ASKER

Hmmm... OK, I'm copying and pasting. Only the very last line got its whitespace cut out. All of the rest are still the same.

Here is the output file produced by my code, with all whitespace stripped. What version of Perl are you using?
---
(Future) Ch. Giscours Margaux 750ml 12 per case
WA 90-92 points, This saturated ruby/purple-colored effort offers notes of liquid minerals interspersed with subtle toasty oak, black currant, graphite and a hint of flowers.
$28 per 750ml Bottle
$317 per Case of 12
2002 Future
Red Blend
Bordeaux
France
Red
Margaux
(Future) Ch. Giscours Margaux 750ml 12 per case
WS 92-94 points, Intense aromas of cooked fruit and prunes. Full-bodied, with rich and ripe fruit character. Soft, round tannins and a long finish. Heady and rich.
$38 per 750ml Bottle
$448 per Case of 12
2003 Future
Red Blend
Bordeaux
France
Red
Margaux
dprasad

ASKER

(Future) Ch. Giscours Margaux 750ml 12 per case
WS 92-94 points, Intense aromas of cooked fruit and prunes. Full-bodied, with rich and ripe fruit character. Soft, round tannins and a long finish. Heady and rich.
$38 per 750ml Bottle
$448 per Case of 12
      2003 Future
      Red Blend
      Bordeaux
      France
      Red
Margaux       
dprasad

ASKER

Not sure, how do I find out?

On the command line, type:
perl -v
dprasad

ASKER

This is perl, v5.8.4 built for MSWin32-x86-multi-thread
(with 3 registered patches, see perl -V for more detail)

That should be fine, then. Have you tried my last program exactly as entered above?
dprasad

ASKER

yes, I copied and pasted it into a file called p2.pl

run with


perl p2.pl

Weird.

I can simplify it a bit. Can you check the creation date of out.txt to make sure it is the latest one generated by the program?

my $outfile = 'out.txt';
open (FILE, "<mydata.txt") or die "cannot open mydata.txt: $!"; # opens the file
open (outfile, ">out.txt") or die "cannot open $outfile: $!"; # opens the file

while (<FILE>)
{
   chomp;
   $_ =~ s/^ *//g;
   print(outfile "$_\n");
   print ">>>$_<<<\n";  #this will echo output to the console for checking
}
close(outfile);
dprasad

ASKER

yes, the out.txt is up to date. I was deleting it after each execution just to be sure. On that last bit, I get this spit out to the screen:

>>>WS 92-94 points, Intense aromas of cooked fruit and prunes. Full-bodi
 rich and ripe fruit character. Soft, round tannins and a long finish. H
 rich.  <<<
>>>$38 per 750ml Bottle<<<
>>>$448 per Case of 12<<<
>>>     2003 Future <<<
>>>     Red Blend <<<
>>>     Bordeaux <<<
>>>     France <<<
>>>     Red <<<
>>>Margaux      <<<

but no changes appear in out.txt.

I hate to ask this, but is there any way I could mail you the file? I'm a Perl newbie, and I wonder if there's something wrong with my installation. The file is 181 KB. I would really appreciate it.
ASKER CERTIFIED SOLUTION
Kim Ryan
This solution is only available to members.
dprasad

ASKER

ahhhhh cool, yes that worked! thanks a lot for the help

Dinesh
SOLUTION
This solution is only available to members.
dprasad

ASKER

It's all good, I was starting to feel my carpal tunnel tire out from manually taking those damn tabs out, phew.

Glad it's working. Since I helped solve the question by identifying the tabs (whitespace) problem, it would have been nice to have received some points too :)
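
The accepted solution is not shown in this thread, but the comments above suggest the indentation in the file was made of tabs rather than spaces. If so, that would explain why `s/^ *//` left those lines untouched: a literal space in a pattern does not match a tab, while `\s` does. A small sketch of the difference, using a made-up sample line:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample: a tab-indented line, as the lines in mydata.txt
# are assumed to have been.
my $line = "\t2003 Future";

(my $spaces_only = $line) =~ s/^ *//;      # literal spaces only: tab survives
(my $blanks      = $line) =~ s/^[ \t]+//;  # spaces and tabs
(my $any_space   = $line) =~ s/^\s+//;     # any whitespace character

printf "%-12s length %d\n", 'original',    length $line;
printf "%-12s length %d\n", 'spaces only', length $spaces_only;
printf "%-12s length %d\n", 'blanks',      length $blanks;
printf "%-12s length %d\n", 'any space',   length $any_space;
```

The character class `[ \t]` is a middle ground when only spaces and tabs should go and the line ending must stay.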
dprasad

ASKER

Whoops, sorry about that, I wasn't paying attention! I will reassign the points, my bad.

If you are on Windows, Tintin's earlier suggestion can be adapted by using double quotes instead of single quotes:

perl -pe "s/^\s+//" mydata.txt >out.txt

or, if you want to change the original file itself:

perl -i.bak -pe "s/^\s+//" mydata.txt

this will change mydata.txt and save the original file as mydata.txt.bak
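
For anyone who prefers a script to the one-liner, Perl exposes the `-i` switch as the special variable `$^I`. A rough sketch of the same in-place edit follows. The file name is taken from the question, and the temporary directory and sample contents are assumptions added only so the example does not touch real data:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work on a throwaway file so the example is safe to run.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/mydata.txt";
open my $fh, '>', $file or die "cannot create $file: $!";
print {$fh} "\t2002 Future\n", "   Red Blend\n", "Bordeaux\n";
close $fh or die "cannot close $file: $!";

{
    # Same effect as: perl -i.bak -pe "s/^\s+//" mydata.txt
    local $^I   = '.bak';     # keep the original as mydata.txt.bak
    local @ARGV = ($file);    # files for <> to read and rewrite
    while (<>) {
        s/^\s+//;
        print;                # written to the rewritten file, not the screen
    }
}

# Read the result back to show what the file now contains.
open $fh, '<', $file or die "cannot open $file: $!";
my @lines = <$fh>;
close $fh;
print @lines;
```

Keeping the `local` assignments inside a bare block restores `$^I` and `@ARGV` afterwards, so the rest of a larger script is not affected.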
teraplane, you don't need the $_ explicitly in s///.
Also, it might be beneficial to put the filehandle "outfile" in caps. I believe warnings.pm complains about lowercase bareword filehandles.

open(OUTFILE, "> $outfile") or die("$! : can't open output file");
select(OUTFILE);  # make OUTFILE the default handle so a bare print writes to it
while (<FILE>)
{
  s/^\s*//;
  print;
}
close(OUTFILE);

Manav
#!/usr/bin/perl
use strict;



#Global symbol "$outfile" requires explicit package name at p1.pl line 6.
#$outfile = 'out.txt';
my  $outfile = 'out.txt';

#Global symbol "$file" requires explicit package name at p1.pl line 8.
my $file = 'mydata.txt';
open (FILE, 'mydata.txt') or die "cannot open $file: $!"; # opens the file
#open (outfile, 'out.txt') or die "cannot open $file: $!"; # opens the file
open (OUTFILE, '>out.txt') or die "cannot open $outfile: $!"; # opens the file

#Global symbol "@my_file" requires explicit package name at p1.pl line 11.
#@my_file = <FILE>;
my @my_file = <FILE>;




#@my_file =~ s/(^ *)||( *$)//g;
s/^\s*// for @my_file;
#print outfile "@my_file.\n";
print OUTFILE @my_file;