Parsing Files

I have many large files that contain data like the below example.  I would like to get some input on the best way to parse these files so that each part of the data (item number, description, price, etc) can be separated and then packed into an excel file (where the item number would be in like cell A1 then the description in cell A2).

Just fishing for some ideas on what the most efficient way of doing this may be.

Example 1:
1  1  020-140  CS  26 X 36 REG BLK GARBAGE BAG  12.80  YY  12.80  13-E1-04   250/CASE  91/SK

Example 2:
1  1  375-590  PC  MA612 12"BLUE HD PLST.DUSTPAN  2.30  YY  2.30  18-C2-02  EACH (12PCS/CS MA612)


Thanks
TyrantAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

HamdyHassanCommented:
Let's say you have a fixed length file, and
field one, from char 1 to char2
field two , from char 3 to char5
....

so here is shell script


f1=`cat file.txt| cut -c1-2`
f2=`cat file.txt| cut -c3-5`
echo $f1,$f2




0
TintinCommented:
Why not just convert it to a CSV file and import it into Excel?

0
prady_21Commented:
you can seperate each item based on spaces in to variables(if this is what you are looking for)
($item_no,$description, $price $something ... ) = split;

0
Cloud Class® Course: Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

TintinCommented:
Thinking about it, you can open the data as is into Excel (open as a txt file).  No conversions required.

Perhaps you're not telling us the full story.
0
TyrantAuthor Commented:
HamdyHassan: the spaces between fields is different.

Tintin: convert to CSV? text file is no good...need to I have each part of the data in its own cell

prady21: theres spaces in the description...which is also varried length

What I need to do is break each entry down to item #, description, price, package size, etc.  
Once I've got those individual values im going to be adding them into an excel spreadsheet into the matching cells.

Its 7:30am...sorry if i haven't explained it any clearer..try to once i get a couple of cups of coffee into me ;)

Thanks fors posting
0
HamdyHassanCommented:
"the spaces between fields is different" means it's not fixed length records? yes/no?




Do you have spaces inside the same field?
Like spaces in desc field for example?
If yes, let me know
If no, then we could use perl that split with one or more spaces


0
TyrantAuthor Commented:
no they're not fixed length records
and yes there is spaces in the description field
0
holliCommented:
basically a script for this would look similar to the following ready-to-go script.
it will parse your file line by line and match it against
a regular expression that splits the fields.
the fields are written into a new file, stripped of leading/trailing spaces and delimited by a semicolon.

you can easily import this new file into excel or a darabase:

#!perl -w
use strict;

my $file = $ARGV[0] or die "No input-file specified!";
die "file <$file> not found!" unless -e $file;
die "<$file> is not a file!" unless -f $file;

my $ofile = $ARGV[1] or die "No output-file specified!";
die "<$ofile> is a directory!" if -d $ofile;
die "file <$ofile> already exists!" if -e $ofile;

my $line="";

open IN, "<$file" or die "Cannot open file <$file>!\n$!";
open OUT, ">$ofile" or die "Cannot open file <$ofile> for writing!\n$!";

while ($line=<IN>)
{
  # example: $line = "1  5  020-140  CS  26 X 36 REG BLK GARBAGE BAG  12.80  YY  12.80  13-E1-04   250/CASE  91/SK";

  my @values = ( $line =~ m%^(\d)\s+                        #will match the first field (1)
                            (\d)\s+                        #will match the second field (5)
                            (\d{3}-\d{3})\s+               #will match the third field (020-140)
                            (\w\w)\s+                      #will match the fourth field (CS)
                            (.+?)\s{2,}                    #will math everything up to the price
                            (\d+\.\d+)\s+YY\s+             #will match the price, >YY< is ignored because it is always the same?
                            (\d+\.\d+)\s+                  #will match the second price
                            ([\d\w]+-[\d\w]+-[\d\w]+)\s+   #will match 13-E1-04
                            (.+)$%xi);                    #will match the rest

  print "line skipped: $line" unless @values;
  print OUT join ";", @values, "\n"; #replace ; by \t for copy and paste to excel
}

close IN;
close OUT;


running this script on a file with the examples you provided, the following output is generated:

1;1;020-140;CS;26 X 36 REG BLK GARBAGE BAG;12.80;12.80;13-E1-04;250/CASE  91/SK;
1;1;375-590;PC;MA612 12"BLUE HD PLST.DUSTPAN;2.30;2.30;18-C2-02;EACH (12PCS/CS MA612);



holli
0
holliCommented:
please replace

print "line skipped: $line" unless @values;
 print OUT join ";", @values, "\n"; #replace ; by \t for copy and paste to excel


by

if (@values)
  { print OUT join ";", @values, "\n"; #replace ; by \t for copy and paste to excel }
else
  { print "line skipped: $line"; }
 
0
TyrantAuthor Commented:
no they're not fixed length records
and yes there is spaces in the description field
0
TyrantAuthor Commented:
Thanks for the input guy but i found another solution...i didn't relize that each field had a set amount of characters to use.  
so now i break the strings apart and directly input them into they're excel cells

Ive never dealt with a Maytech system, (which this particular script logs into, gets the price quote, and parses it into a spreadsheet on the local users box), so Im still learning about it....quite quickly as much to my dismay :/

now...next question...how should i distribute the points?
;)
0
holliCommented:
count the characters each expert typed and calculate the percentage ;-)

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Perl

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.