Bash shell script - AWK Remove blank lines from text file while stripping certain characters

I have a text file that I'm trying to parse: I want the information that follows a line containing a certain keyword. Everything
below this keyword is important to me, but for some reason it's stripping the escape or line-ending characters that I need
in order to process the information with another script. The other script takes this information and makes it delimited so I can
import it into MySQL. The listings below show the files as they appear with cat -vet. This is the command I am running:

awk '/../&&d{print}/QUANTITY/{d=1}' test1.txt > test2.txt


FILE: test1.txt ORIGINAL FILE
---------------------------------------------------------------------
TABULATION^M$
QUANTITY^M$
^M$
^M$
    11111111  GIANTS                                   222.000     TON^M$
 7777      Acme Inc.                                           6.5000        3,425.50        3,425.50^M$
 8888      Pipe Ind, Inc.                                      1.0000          527.00          527.00^M$
^M$
    22222222  BEARS                                    324.000     TON^M$
7777      Acme, Inc.                                         148.3800        2,522.46        2,522.46^M$
8888      Pipe Ind, Inc..                               120.0000        2,040.00        2,040.00^M$
-----------------------------------------------------------------------

With awk, ^M$ is left at the end of every line, as shown below, and my other script won't read it correctly.

test2.txt comes out like this:

FILE: test2.txt
---------------------------------------------------------------------
    11111111  GIANTS                                   222.000     TON^M$
 7777      Acme Inc.                                           6.5000        3,425.50        3,425.50^M$
 8888      Pipe Ind, Inc.                                      1.0000          527.00          527.00^M$
    22222222  BEARS                                    324.000     TON^M$
7777      Acme, Inc.                                         148.3800        2,522.46        2,522.46^M$
8888      Pipe Ind, Inc..                               120.0000        2,040.00        2,040.00^M$
-----------------------------------------------------------------------


What I need test2.txt to look like is shown below. I need to remove the ^M but leave the $ at the end of each line. Also,
and this is very important for my processing, I need to remove the 2 blank lines that follow the word QUANTITY in the
original file, before the information starts. It must look like this for my other script to work properly.

FILE: test2.txt
---------------------------------------------------------------------
    11111111  GIANTS                                   222.000     TON$
 7777      Acme Inc.                                           6.5000        3,425.50        3,425.50$
 8888      Pipe Ind, Inc.                                      1.0000          527.00          527.00$
$
    22222222  BEARS                                    324.000     TON$
7777      Acme, Inc.                                         148.3800        2,522.46        2,522.46$
8888      Pipe Ind, Inc..                               120.0000        2,040.00        2,040.00$
-----------------------------------------------------------------------

Thanks in advance for any help you can offer.

Asked by: cybrthug

ahoffmann Commented:
Are the ^M the two literal characters ^ and M, or is this a copy & paste from vi, where ^M represents the carriage-return character?

awk '/QUANTITY/{p=1;next}{p++}($1~/^\^M\$$/&&p<4){next}(p>3){print}' test1.txt|sed 's/\^M\$$/$/'
 
cybrthug (Author) Commented:
I believe ^M$ is the carriage-return character, but I need to remove the ^M and have only the $ ending each line.
 
cybrthug (Author) Commented:
If I use pico to edit the file with the ^M$ return character and resave it, I get only the $ at the end of each line, but I need to process this at the command line and not edit every single file.
 
ahoffmann Commented:
> I believe ^M$ ..
That is not sufficient; you have to be 101% sure, with no doubt at all.
Please check with od -c
 
cybrthug (Author) Commented:
With od -c I get  \r  \n at the end of each line.
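
The \r \n pair reported by od -c means the file has DOS-style (CRLF) line endings. The same check can be scripted; this is a minimal sketch assuming a POSIX shell and grep, and the function name has_cr is made up for illustration:

```shell
# has_cr FILE: succeed if any line of FILE contains a carriage return (\r).
# "has_cr" is a hypothetical helper name, not a standard tool.
has_cr() {
    cr=$(printf '\r')      # a literal CR character
    grep -q "$cr" "$1"
}

# Example: report whether test1.txt has carriage returns.
if [ -f test1.txt ] && has_cr test1.txt; then
    echo "test1.txt contains carriage returns"
fi
```
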
 
ozo Commented:
awk '{gsub(/\r/,"");print}'
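
The one-liner above removes the carriage returns. A minimal sketch that combines it with the keyword filter from the question, and also drops only the blank lines directly after QUANTITY while keeping later ones, might look like this. It assumes an awk that supports sub() and \r in regular expressions (gawk, nawk, mawk); the function name strip_after_keyword is made up for illustration:

```shell
# strip_after_keyword FILE: print everything after the QUANTITY line,
# with trailing carriage returns removed and the leading blank lines dropped.
# "strip_after_keyword" is a hypothetical helper name.
strip_after_keyword() {
    awk '
        { sub(/\r$/, "") }                            # drop a trailing CR, if present
        !seen { if (/QUANTITY/) seen = 1; next }      # skip up to and including the keyword line
        !started && /^[ \t]*$/ { next }               # skip blank lines right after the keyword
        { started = 1; print }                        # print the rest, keeping later blank lines
    ' "$1"
}
```

It would be used as strip_after_keyword test1.txt > test2.txt, and the result can be checked with cat -vet test2.txt.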
 
ahoffmann Commented:
And is the $ a real character, or was it the "end of line" marker of your editor?

To get rid of the \r (aka ^M aka Ctrl-M) use:
  tr -d '\015' <test1.txt

ozo, you need gawk or nawk for that ;-)

cybrthug, do you have awk, or any of gawk, nawk? Check with awk -v
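
If the local awk turns out to lack sub() and gsub(), the tr command above can do the carriage-return removal before awk sees the data. This is a sketch under that assumption, using only pattern matching, next and print in awk; the function name strip_with_tr is made up for illustration:

```shell
# strip_with_tr FILE: tr removes every CR (octal 015), then awk prints the
# lines after QUANTITY, skipping the blank lines that directly follow it.
# "strip_with_tr" is a hypothetical helper name.
strip_with_tr() {
    tr -d '\015' < "$1" | awk '
        !seen { if ($0 ~ /QUANTITY/) seen = 1; next }   # skip up to and including the keyword line
        !started && NF == 0 { next }                    # NF == 0: empty or whitespace-only line
        { started = 1; print }                          # print the rest, keeping later blank lines
    '
}
```

Note that tr -d deletes every carriage return in the file, not only those at the end of a line, which is fine for input like test1.txt.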
 
cybrthug (Author) Commented:
ahoffmann, I appreciate the responses, but ozo hit it on the head, again :) You are the bomb, ozo!