Bash shell script - Cut all text before processing rest of text file.

I have multiple text files that I'm processing with a script. The only problem is that each text file has a bunch of junk at the beginning that I don't need. The good thing, I think, is that each file contains a catch word, DIFF, right before the important part of the file.
It is usually at the end of a line, after all the junk, like the following.

TEST QUOTE DIFF

234234 23 23 asdf 1234 asdf     (IMPORTANT DATA STARTS HERE)

I'm taking the data after the DIFF line and converting it to a comma-delimited file. Can I just search for DIFF and start processing from there? How would I start this in a bash script?

Thanks in advance!
cybrthug Asked:

ozo Commented:
awk 'd{print}/DIFF/{d=1}' file | process
cybrthug Author Commented:
Ugh, sorry, I forgot something: there is more than one DIFF entry at the beginning. I'll up the points here for my mistake.

Instead of searching for DIFF, I need to find EXTENSION DIFF. There is a space in between, so I need to start processing after I find the phrase "EXTENSION DIFF".
ozo Commented:
awk 'd{print}/EXTENSION DIFF/{d=1}' file | process
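
To see why this one-liner works: d is a flag that starts out unset (false). Because the d{print} rule comes before the pattern rule, the marker line is tested for printing while the flag is still off, so it is not printed itself; every line after it is. A minimal sketch, assuming a made-up sample file (the file name and contents are only for illustration):

```shell
# Build a small sample file in a temporary directory (hypothetical data).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'junk DIFF' \
  'TEST QUOTE EXTENSION DIFF' \
  '234234 23 23 asdf 1234 asdf' \
  'more data' > "$tmpdir/sample.txt"

# Print only the lines that come after the first line matching the marker.
result=$(awk 'd{print} /EXTENSION DIFF/{d=1}' "$tmpdir/sample.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The first line contains DIFF but not EXTENSION DIFF, so it does not switch the flag on; only the two data lines are printed.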

cybrthug Author Commented:
Is there a way I can use this to process multiple files of the same type from the command line, perhaps sending the output of your awk to a new file?

awk 'd{print}/QUANTITY/{d=1}'  323424.txt

Take what I get from this file and save it as a323424.txt,
then move on to the next file in the directory and do the same, basically batch processing the entire directory of text files.
Tintin Commented:

#!/bin/sh
for file in *.txt
do
  awk 'd{print}/QUANTITY/{d=1}' "$file" >"a$file"
done
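
The loop above can be tried out safely in a scratch directory. The sketch below is only an illustration: the directory, the second file name and the file contents are invented, and it adds two small safeguards (quoting the file name so names with spaces work, and skipping the loop body when no .txt files exist).

```shell
#!/bin/sh
# Throwaway directory with two hypothetical input files.
dir=$(mktemp -d)
printf '%s\n' 'header' 'ITEM QUANTITY' 'row 1' 'row 2' > "$dir/323424.txt"
printf '%s\n' 'junk' 'QUANTITY' 'only row' > "$dir/my file.txt"

start=$PWD
cd "$dir" || exit 1
for file in *.txt
do
  [ -e "$file" ] || continue      # the pattern matched nothing
  awk 'd{print} /QUANTITY/{d=1}' "$file" > "a$file"
done

# Keep the results for inspection, then clean up.
first=$(cat "a323424.txt")
second=$(cat "amy file.txt")
cd "$start" && rm -rf "$dir"
printf '%s\n---\n%s\n' "$first" "$second"
```

The file list is expanded once, before the loop starts, so the new a*.txt files are not processed during the same run. Running the script a second time in the same directory would pick them up, though, so it may be worth writing the output somewhere else or deleting old output first.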
cybrthug Author Commented:
There seem to be 2 empty lines after the QUANTITY line that I'm pulling the information after. Is there a way to also remove the 2 blank lines after the QUANTITY line? Thanks for all your help, guys!
ozo Commented:
Do you want to remove all blank lines?
The next 2 lines after QUANTITY?
Blank lines immediately following QUANTITY?
Is a line containing tabs and spaces blank?
cybrthug Author Commented:
After I cut all the text above the line that contains QUANTITY using your awk statement, there are 2 blank lines immediately following it, and then the text that I need to process on the 3rd line. Is there a way to skip a blank line and move down to the next, or to skip 2 or more blank lines?
ozo Commented:
We can skip any 2 immediately following lines, any number of immediately following blank lines, up to 2 immediately following blank lines, or all blank lines.
Skipping all blank lines may be the simplest:
awk '/./&&d{print}/QUANTITY/{d=1}' "$file" >"a$file"
although that does not treat a line containing only spaces and tabs as empty.
cybrthug Author Commented:
The lines must have tabs in them, given the way the file is formatted, since that still leaves them in. I probably need to skip the 2 blank-looking lines that follow.
ozo Commented:
awk 'NF&&d{print}/QUANTITY/{d=1}' "$file" >"a$file"
will not print any lines containing only spaces and tabs.
awk 'NF&&d{d++}d>1{print}/QUANTITY/{d=1}'
will skip blank lines immediately following QUANTITY, but print blank lines that come after a non-blank line following QUANTITY.
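
The difference between the two commands shows up on a small example. This is a sketch on invented input: two blank-looking lines directly after QUANTITY (one empty, one holding only a tab and a space) and one more blank line further down.

```shell
# Hypothetical sample input; the variable names are for illustration only.
sample=$(printf 'header\nQUANTITY\n\n\t \nrow 1\n\nrow 2\n')

# Variant 1: drop every blank line after the marker.
all_blank=$(printf '%s\n' "$sample" | awk 'NF&&d{print} /QUANTITY/{d=1}')

# Variant 2: drop only the blank lines directly after the marker.
# d stays at 1 until the first non-blank line bumps it to 2; from then on
# every line, blank or not, is printed.
leading_only=$(printf '%s\n' "$sample" | awk 'NF&&d{d++} d>1{print} /QUANTITY/{d=1}')

printf 'variant 1:\n%s\n\nvariant 2:\n%s\n' "$all_blank" "$leading_only"
```

Variant 1 yields the two rows with nothing between them; variant 2 keeps the blank line between row 1 and row 2.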
cybrthug Author Commented:
Both remove everything up to and including the QUANTITY line perfectly, but neither option removes the next two lines. I checked the original text file, and each of the two lines appears to be completely blank: you can't tab or move across with an arrow key; the cursor immediately drops down to the next line.
ozo Commented:
Can you run cat -vet on the lines following QUANTITY?
cybrthug Author Commented:
I get ^M$ only on those lines
ozo Commented:
If there is a ^M at the end of every line, then how about
 awk '/../&&d{print}/QUANTITY/{d=1}' "$file" >"a$file"
This will also skip lines that contain a single non-^M character.
cybrthug Author Commented:
Works great, thank you very much. You have been very helpful!
cybrthug Author Commented:
For some reason it's stripping out something I need in order to process the lines. When I run the awk above, it extracts the information I need fine but strips the $, which I need for my processing script to work. Is there a way to keep the $ without the ^M? I will open another question and toss you some points after this, since you've been so much help. Thank you!
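
One possible way around this, sketched under the assumption that the files have DOS line endings (which the ^M$ output from cat -vet suggests), is to remove the trailing carriage return from each line before testing it. In cat -vet output, ^M is the carriage return, while the $ is only the end-of-line marker that the -e option prints; it is not a character stored in the file. The sample file below is made up for illustration.

```shell
# Hypothetical sample with DOS line endings (each line ends in \r\n).
tmpdir=$(mktemp -d)
printf 'junk\r\nITEM QUANTITY\r\n\r\n\r\n$12.50 widget\r\n' > "$tmpdir/dos.txt"

# Strip a trailing carriage return from every line first; after that the
# NF test sees the "blank" lines as truly empty and skips them.
cleaned=$(awk '{sub(/\r$/, "")} NF&&d{print} /QUANTITY/{d=1}' "$tmpdir/dos.txt")
printf '%s\n' "$cleaned"
rm -rf "$tmpdir"
```

Only the carriage return is removed, so if the missing $ is a literal dollar sign in the data, this sketch leaves it in place. The output also has plain Unix line endings, which may or may not be what the downstream processing script expects.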