Extracting certain lines from a fixed length text file

Posted on 2003-10-28
Medium Priority
Last Modified: 2010-04-22
I have a fixed length text file I need to "Prune".  There is a date stamp in each record.  Characters 141-144 are a 4 digit year.  Chars 145-146 are two digit month and 147-148 are two digit day.

I would like to use "sed" or "awk" to create a new file FROM this file for all records with a date greater than or equal to 01/01/2003 (for example)

I've looked at both "sed" and "awk" and sort of understand how they work, but can't quite figure out how to get the fields from my file when I don't really have any field delimiter.

My ultimate goal is to have a shell script that will accept 3 parameters: "date", "inFile" and "outFile".

Question by:andersen58
LVL 38

Expert Comment

ID: 9638530
It is a lot easier to do it with "cut" if you want to use "date" as a parameter and check it
against the records in the file.

Here's the example script

# usage: $0 date(dd/mm/yyyy) InFile OutFile

DD=`echo "$1" | cut -f1 -d/`
MM=`echo "$1" | cut -f2 -d/`
YY=`echo "$1" | cut -f3 -d/`
INFILE=$2
OUFILE=$3

# make sure OUFILE is blank; if you want to append records to this file,
# just comment it out

cat /dev/null > "$OUFILE"

# Now read the input file, and get the records.
# IFS= and -r read in the whole line, just in case you have white space
# or backslashes in the records

exec 0<"$INFILE"

while IFS= read -r RECORD ; do
         RDD=`echo "$RECORD" | cut -c147-148`
         RMM=`echo "$RECORD" | cut -c145-146`
         RYY=`echo "$RECORD" | cut -c141-144`
         if [ "$RYY" -gt "$YY" ] ; then
                  echo "$RECORD" >> "$OUFILE"                  # later year
         else
                  if [ "$RYY" -eq "$YY" -a "$RMM" -gt "$MM" ] ; then
                           echo "$RECORD" >> "$OUFILE"         # same year, later month
                  else
                           if [ "$RYY" -eq "$YY" -a "$RMM" -eq "$MM" -a "$RDD" -ge "$DD" ] ; then
                                    echo "$RECORD" >> "$OUFILE"    # same year and month, same or later day
                           fi
                  fi
         fi
done

# End of script

You can also modify the script to use:

if command1 ; then
    ...
elif command2 ; then
    ...
elif commandn ; then
    ...
fi
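
As a rough sketch of the elif form, the three date tests could be folded into one helper. The function name record_on_or_after and its argument order are made up for illustration; it assumes the record date and the cut-off date have already been split into year, month and day.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): succeeds when the
# record date (year month day) is on or after the cut-off (year month day).
record_on_or_after() {
    ryy=$1 rmm=$2 rdd=$3
    yy=$4 mm=$5 dd=$6
    if [ "$ryy" -gt "$yy" ] ; then
        return 0        # later year
    elif [ "$ryy" -eq "$yy" ] && [ "$rmm" -gt "$mm" ] ; then
        return 0        # same year, later month
    elif [ "$ryy" -eq "$yy" ] && [ "$rmm" -eq "$mm" ] && [ "$rdd" -ge "$dd" ] ; then
        return 0        # same year and month, same or later day
    fi
    return 1            # record is older than the cut-off
}

# Example: a record dated 2003-10-28 against a cut-off of 2003-01-01
if record_on_or_after 2003 10 28 2003 01 01 ; then
    echo "keep"
else
    echo "drop"
fi
```

Inside the read loop this would replace the nested tests with a single call, followed by one echo of the record to the output file.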

LVL 23

Expert Comment

ID: 9638536

Accepted Solution

glassd earned 2000 total points
ID: 9639988
You can use awk. It depends on your test date format. Assuming the format is as given (DD/MM/YYYY), and is held in the shell variable $Date:

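awk -v d="$Date" '
BEGIN { split(d, a, "/"); t = sprintf("%04d%02d%02d", a[3], a[2], a[1]) }
{
      b = substr($0, 141, 8)
      if(b <= t) {
         print $0
      }
}' infile > outfile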

The BEGIN section takes the variable d, which is passed into awk, and splits it into the array a[] using the / as a separator. The three elements are then reassembled in the correct order (YYYYMMDD) using the sprintf() function. Note that I have assumed UK date format as opposed to US format, so you may have to change this if you have MM/DD/YYYY.

The main section compares this date with the 8 character date string extracted from each line using the substr() function. If the test succeeds then the whole line is printed.
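
To see the two steps in isolation, the sketch below runs each one on its own. The function names and the sample records are invented here purely for illustration; the records are simply padded so that the date lands in columns 141-148.

```shell
#!/bin/sh
# Convert DD/MM/YYYY to YYYYMMDD, as the BEGIN section is described as doing.
to_yyyymmdd() {
    awk -v d="$1" 'BEGIN {
        split(d, a, "/")                            # a[1]=DD a[2]=MM a[3]=YYYY
        printf "%04d%02d%02d\n", a[3], a[2], a[1]
    }'
}

# Print the 8 characters starting at column 141 of each input line.
record_dates() {
    awk '{ print substr($0, 141, 8) }'
}

to_yyyymmdd "01/01/2003"

# Two made-up records: 140 characters of padded data, then the date stamp.
printf '%-140s%s%s\n' "first" "20021231" "rest" "second" "20031028" "rest" | record_dates
```

Because both values are 8 character strings in YYYYMMDD order, a plain string comparison in awk sorts them chronologically.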


Expert Comment

ID: 9640010
Oops, sorry, got my test the wrong way round.

       if(b >= t) {
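
With the corrected test, a wrapper that takes the three command line parameters might look something like the sketch below. The function name prune_records is hypothetical, and the date is assumed to be given as DD/MM/YYYY as above.

```shell
#!/bin/sh
# usage: $0 DD/MM/YYYY inFile outFile

# Hypothetical wrapper: copy every record dated on or after the given
# date (columns 141-148, YYYYMMDD) from inFile to outFile.
prune_records() {
    awk -v d="$1" '
    BEGIN {
        split(d, a, "/")
        t = sprintf("%04d%02d%02d", a[3], a[2], a[1])   # test date as YYYYMMDD
    }
    substr($0, 141, 8) >= t     # a pattern with no action prints the line
    ' "$2" > "$3"
}

# Only run when all three parameters were supplied.
if [ $# -eq 3 ] ; then
    prune_records "$1" "$2" "$3"
fi
```

Lines shorter than 148 characters yield a short or empty date string, which compares as older than any real date and so is dropped.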

Author Comment

ID: 9645828
Thanks glassd

  Added just a few lines to pass in command line variables and it worked great.  Ran through a 5+ MB file and created a 1.6 MB "pruned down" file in about 1 second!

It don't get any better than that!

Good work.

PS. Yeah, I caught the test being backwards just AFTER I run it (ha!)
Looked at the output and went "Hmm, that ain't right"... Easy fix!
