Solved

Extracting certain lines from a fixed length text file

Posted on 2003-10-28
Last Modified: 2010-04-22
I have a fixed length text file I need to "Prune".  There is a date stamp in each record.  Characters 141-144 are a 4 digit year.  Chars 145-146 are two digit month and 147-148 are two digit day.

I would like to use "sed" or "awk" to create a new file FROM this file for all records with a date greater than or equal to 01/01/2003 (for example)

I've looked at both "sed" and "awk" and sort of understand how they work, but can't quite figure out how to get the fields from my file when I don't really have any field delimiter.

My ultimate goal is to have a shell script that will accept 3 parameters: "date", "inFile" and "outFile".

TIA
Question by:andersen58
5 Comments
 
LVL 38

Expert Comment

by:yuzh
ID: 9638530
It is a lot easier to do it with "cut" if you want to use "date" as a parameter and check it
against the records in the file.

Here's the example script:

#!/bin/ksh
# usage: $0 date(dd/mm/yyyy) InFile OutFile

INFILE=$2
OUFILE=$3

DD=`echo "$1" | cut -f1 -d/`
MM=`echo "$1" | cut -f2 -d/`
YY=`echo "$1" | cut -f3 -d/`

# make sure OUFILE is empty; if you want to append records to this file,
# just comment it out

cat /dev/null > "$OUFILE"

# Now read the input file and get the records.
# An empty IFS reads in the whole line unchanged, just in case you have
# white space in the records

exec 0<"$INFILE"

while IFS= read -r RECORD ; do
         RDD=`echo "$RECORD" | cut -c147-148`
         RMM=`echo "$RECORD" | cut -c145-146`
         RYY=`echo "$RECORD" | cut -c141-144`
         if [ "$RYY" -gt "$YY" ] ; then
              echo "$RECORD" >> "$OUFILE"          # later year
         else
              if [ "$RYY" -eq "$YY" -a "$RMM" -gt "$MM" ] ; then
                   echo "$RECORD" >> "$OUFILE"     # same year, later month
              else
                   if [ "$RYY" -eq "$YY" -a "$RMM" -eq "$MM" -a "$RDD" -ge "$DD" ] ; then
                        echo "$RECORD" >> "$OUFILE"     # same year and month, same or later day
                   fi
              fi
         fi
done

exit
# End of script

PS:
 You can also modify the script to use:

if command1 ; then
    commands
elif command2 ; then
     commands
......
elif commandn ; then
      commands
else
      commands
fi

         
 
LVL 23

Expert Comment

by:brettmjohnson
ID: 9638536
 
LVL 7

Accepted Solution

by:
glassd earned 500 total points
ID: 9639988
You can use awk. It depends on your test date format. Assuming the format is as given (DD/MM/YYYY), and is held in the shell variable $Date:

awk -v d="$Date"  '
   BEGIN{
      split(d,a,"/")
      t=sprintf("%s%s%s",a[3],a[2],a[1])
   }
   {
      b=substr($0,141,8)
      if(b <= t) {
         print $0
      }
  }' infile > outfile

The BEGIN section takes the variable d, which is passed into awk, and splits it into the array a[] using / as the separator. The three elements are then reassembled in year, month, day order using the sprintf() function. Note that I have assumed UK date format as opposed to US format, so you may have to change this if you have MM/DD/YYYY.

The main section compares this date with the 8 character date string extracted from each line using the substr() function. If the test succeeds then the whole line is printed.
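To see the two building blocks in isolation, here is a minimal sketch. The sample records are made up (140 characters of padding followed by the date stamp), and the names sample, key and stamps are only for illustration. It assumes the day and month in the test date are zero-padded to two digits.

```shell
#!/bin/sh
# Build three made-up records: 140 characters of padding, then YYYYMMDD
# in columns 141-148.
sample=$(mktemp)
printf '%-140s%s\n' 'rec A' '20021231' >  "$sample"
printf '%-140s%s\n' 'rec B' '20030101' >> "$sample"
printf '%-140s%s\n' 'rec C' '20031028' >> "$sample"

# Rearrange DD/MM/YYYY into YYYYMMDD, the same way the BEGIN section does.
key=$(awk -v d='01/01/2003' 'BEGIN { split(d, a, "/"); print a[3] a[2] a[1] }')

# Pull the 8-character date stamp out of each record by position alone,
# so no field delimiter is needed.
stamps=$(awk '{ print substr($0, 141, 8) }' "$sample")

echo "key:    $key"
echo "stamps:"
echo "$stamps"
rm -f "$sample"
```

Because both values are 8-digit strings in year, month, day order, comparing them as plain strings gives the same result as comparing the dates.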


     
 
LVL 7

Expert Comment

by:glassd
ID: 9640010
Oops, sorry, got my test the wrong way round.

       if(b >= t) {
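Putting the corrected test together with the three parameters asked for in the question, a wrapper could look roughly like the sketch below. The function name prune_by_date is hypothetical, the date is assumed to be DD/MM/YYYY with two-digit day and month, and this is not the script the asker ended up with, since that one is not shown in the thread.

```shell
#!/bin/sh
# prune_by_date DATE INFILE OUTFILE   (hypothetical helper name)
# DATE is assumed to be DD/MM/YYYY with zero-padded day and month.
# Records are assumed to carry YYYYMMDD in columns 141-148.
prune_by_date() {
    awk -v d="$1" '
        BEGIN {
            split(d, a, "/")
            t = sprintf("%s%s%s", a[3], a[2], a[1])
        }
        # Both sides are 8-digit strings, so a string comparison orders
        # them by date. A true pattern with no action prints the line.
        substr($0, 141, 8) >= t
    ' "$2" > "$3"
}

# When run as a script with three arguments, pass them straight through.
if [ $# -eq 3 ]; then
    prune_by_date "$1" "$2" "$3"
fi
```

Saved as, say, prune.sh, it would be run as: sh prune.sh 01/01/2003 inFile outFile. Lines shorter than 148 characters yield a short or empty date string and are simply not copied.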
 

Author Comment

by:andersen58
ID: 9645828
Thanks glassd

  Added just a few lines to pass in command line variables and it worked great.  Ran through a 5+ MB file and created a 1.6 MB "pruned down" file in about 1 second!

It don't get any better than that!

Good work.

PS. Yeah, I caught the test being backwards just AFTER I ran it (ha!)
Looked at the output and went "Hmm, that ain't right"... Easy fix!
