faithless1

Remove extra spaces, empty lines, dates

Hi,

How can I remove multiple spaces and empty lines? Also, I need to remove all single digits from a large file.

Here's what the file looks like:

     aword aword aword           aword   aword

bword bword             bword

1
3
2
4
5

3    #40,000 sss ss           ss $1000 # In this case I would want to remove 1,3,2,4,5,3 but not #40,000 or $1000


1 2 3 4 5 6 8 9 #Need to remove any standalone character 1-9


www Aug 21, 2007 #need to remove any instance of www


Oct 29, 2008 # Need to remove any occurrence of a date


Thanks a lot in advance.








 
jeromee

The following:
perl -e'open(F,$ARGV[0])||die; $_=join("\n",<F>); s/(\s)\s+/$1/gm; 1 while( s/(^\s|^\d\n|^\d\s)//gm); s/\s*www\s*//g; print' my_big_file

returns:

aword aword aword aword aword
bword bword bword
40,000 sss ss ss $1000Aug 21, 2007
Oct 29, 2008

Is that what you wanted?


faithless1

ASKER

Superb, thanks! I also wanted to remove any date that follows that format (e.g. Aug 21, 2007). Thanks again.
ASKER CERTIFIED SOLUTION
jeromee
Glad I was able to help.
Happy Perling!