Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Remove extra spaces, empty lines, dates,

Posted on 2010-11-17
4
Medium Priority
?
512 Views
Last Modified: 2012-06-21
Hi,

How can I remove multiple spaces and empty lines? Also, I need to remove all single digits from a large file.

Here's what the file looks like:

     aword aword aword           aword   aword

bword bword             bword

1
3
2
4
5

3    #40,000 sss ss           ss $1000 # In this case I would want to remove 1,3,2,4,5,3 but not #40,000 or $1000


1 2 3 4 5 6 8 9 #Need to remove any standalone character 1-9


www Aug 21, 2007 #need to remove any instance of www


Oct 29, 2008 # Need to remove any occurrence of a date


Thanks a lot in advance.








 
0
Comment
Question by:faithless1
  • 3
4 Comments
 
LVL 10

Expert Comment

by:jeromee
ID: 34161350
The following:
perl -e'open(F,$ARGV[0])||die; $_=join("\n",<F>); s/(\s)\s+/$1/gm; 1 while( s/(^\s|^\d\n|^\d\s)//gm); s/\s*www\s*//g; print' my_big_file

Returns

aword aword aword aword aword
bword bword bword
40,000 sss ss ss $1000Aug 21, 2007
Oct 29, 2008

is that what you wanted?


0
 

Author Comment

by:faithless1
ID: 34161375
Superb thanks! I also wanted to remove any instance of a date that follows that format (Aug 21, 2007 etc). Thanks again
0
 
LVL 10

Accepted Solution

by:
jeromee earned 2000 total points
ID: 34161712
Here you go:

perl -e'open(F,$ARGV[0])||die; $_=join("\n",<F>); s/(\s)\s+/$1/gm; 1 while( s/(^\s|^\d\n|^\d\s)//gm); s/\s*www\s*//g; s/(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+\d+, \d{4}\s*//g; print' my_big_file

aword aword aword aword aword
bword bword bword
40,000 sss ss ss $1000
0
 
LVL 10

Expert Comment

by:jeromee
ID: 34177998
Glad I was able to help.
Happy Perling!
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Background Still having to process all these year-end "csv" files received from all these sources (including Government entities), sometimes we have the need to examine the contents due to data error, etc... As a "Unix" shop, our only readily …
Utilizing an array to gracefully append to a list of EmailAddresses
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Six Sigma Control Plans
Suggested Courses

810 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question