Big text file prepared for the web , Excel Adding up fields & total time

Hi,

Perl might be the best language to use it might not...   The web server is Linux.

I have a text file that needs to be formated for html pages and export to Excel.

Here's a sample of the file that I need to work with.

19/11/2004 14:48:31 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Batch.Started                            first_s_n(1), job_name(8000), layer_name(gnd18), stage_name(COPPER), lot_name(i5-8833)
19/11/2004 14:49:18 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Defect.Num                               lot_name(i5-8833), serial_num(1), defect_quantity(7)
19/11/2004 14:49:18 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Defect.Info                              lot_name(i5-8833), serial_num(1), cond(0), space(0), edge(0), ph(0), cs(5), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(3), tpv(0), arg(0), holv(0), dd(0), sssp(2), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0)
etc...
19/11/2004 14:58:16 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Batch.Ended                              last_s_n(24), job_name(8000), layer_name(gnd18), stage_name(COPPER), lot_name(i5-8833), total_boards(24)

Output

StartTime 19/11/2004 14:48:31
Job_Name 8000
lot_name(i5-8833)
defect_quantity(7)
Operator amartinez
End Time 19/11/2004 14:58:16
Total up  the serial_num(1), cond(0), space(0), edge(0), ph(0), cs(5), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(3), tpv(0), arg(0), holv(0), dd(0), sssp(2), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0)

End goal is to have this data displayed on the web and have and Excel link that this information can be imported to Excel for number cruching.

Thanks for your help.

PS

If I do not respond today I will be back in town on Monday..





jjcheap1Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

manav_mathurCommented:
Long answer to this.

Have you worked on it??

Manav
0
jjcheap1Author Commented:
Hi,

No I have not been able to spend much time on this. I have not been able to remove the white spaces and put in a  CR to start to format the data.


Thanks
 
0
manav_mathurCommented:
> remove the white spaces
for capturing different fields
you may use

my $variable=(split /  */$input_line)[k]

this will give you the 'k'th field in your line.

>put in a  CR to start to format the data.
a simple '\n'?? How are you trying to format it??

Manav

0
Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

ozoCommented:
Are there more than one records like that in the file?
How to you determine when one record ends and the next starts?
if there are different serial_num(1), fields or  job_name(8000), fields or lot_name(i5-8833), fields, which ones do you want to show?
0
jjcheap1Author Commented:
Hi,

I have not been able to spend more than 30 mins  on this. ( should have stated this  in the request).

The best I can tell is that the serial_num counts up to the # of qty done.

The job_name will change with each recipe and the lot_name will change with each new Batch.Started.

I would like to show the job_name , lot name and the serial_num of the last record total_boards(24) or

--> Record #1
19/11/2004 14:48:31 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Batch.Started                            first_s_n(1), job_name(8000), layer_name(gnd18), stage_name(COPPER), lot_name(i5-8833)
19/11/2004 14:49:18 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Defect.Num                               lot_name(i5-8833), serial_num(1), defect_quantity(7)
19/11/2004 14:49:18 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Defect.Info                              lot_name(i5-8833), serial_num(1), cond(0), space(0), edge(0), ph(0), cs(5), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(3), tpv(0), arg(0), holv(0), dd(0), sssp(2), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0)
--> Record #2
19/11/2004 14:49:40 InSet              InSet-9060         is-6ddc              amartinez            NOTICE
  Defect.Num                               lot_name(i5-8833), serial_num(2), defect_quantity(1)
19/11/2004 14:49:40 InSet              InSet-9060         is-6ddc              amartinez            NOTICE
  Defect.Info                              lot_name(i5-8833), serial_num(2), cond(0), space(0), edge(0), ph(0), cs(0), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(1), tpv(0), arg(0), holv(0), dd(0), sssp(0), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0)
--> Last record of the batch
19/11/2004 14:58:16 InSet              InSet-9060         is-6ddc              amartinez            NOTICE     Batch.Ended                              last_s_n(24), job_name(8000), layer_name(gnd18), stage_name(COPPER), lot_name(i5-8833), total_boards(24)

Thanks,

PS

This project should be worth 1000 points ...

0
ozoCommented:
if those "-->" reliably separate records, how's this?

$/="\n-->";
while( <> ){
print "StartTime ",m"\b(\d\d/\d\d/\d{4}\s+\d\d:\d\d:\d\d)\b","
Job_Name ",/job_name\((.+?)\)/i,"
lot_name",/lot_name(\(.*?\))/i,"
defect_quantity",/\bdefect_quantity(\(.*?\))/i,"
Operator ",/(\S+)\s+NOTICE\s/,"
End Time ",m".*\b(\d\d/\d\d/\d{4}\s+\d\d:\d\d:\d\d)\b"s,"
Total up  the ",/\bDefect\.Info.*?\b(serial_num.*)/,"

";
}
0
jjcheap1Author Commented:
Hi,

Thanks,

The "--> Record #1" , Record #2 ... are cuts from the data file( my comments).  Since the  serial_num of a lot can be qite large , I just trimed it up to give what the first, middle and end of the batch looks like.  

So is this the correct format ??

$/="\"InSet;

I am sorry but I have not done much with Perl.  A little bit more that "Hello World"

Thanks
0
manav_mathurCommented:
$/ sets the record seperator.
Ozo believed that \n--> was actually present in the data. But you say that --> is what you have inserted.

How exactly do you seperate records?? Does 'one record per line' satisfy your data??

Manav
0
jjcheap1Author Commented:
Manav and Ozo,

Thanks for your assistance.

Wow,

I can't believe of the response time !

As long as the data can be cleaned up and ready to export to Excel and displayed on a web page  in the following format is all that is needed.

The <!-- are my comments

27/11/2004 03:20:18 InSet              InSet-9060         is-6ddc              htran                NOTICE     Batch.Started                            first_s_n(1), job_name(8338), layer_name(gnd6), stage_name(COPPER), lot_name(i5-l8431) <!-- start of the record
27/11/2004 03:21:05 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Num <!-- next record                              lot_name(i5-l8431), serial_num(1), defect_quantity(1)
27/11/2004 03:21:05 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Info                              lot_name(i5-l8431), serial_num(1), cond(0), space(0), edge(0), ph(0), cs(0), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(1), tpv(0), arg(0), holv(0), dd(0), sssp(0), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0) <!-- This would be the record for  Excel
27/11/2004 03:21:25 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Num <!-- next record                              lot_name(i5-l8431), serial_num(2), defect_quantity(9)
27/11/2004 03:21:25 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Info                              lot_name(i5-l8431), serial_num(2), cond(0), space(0), edge(0), ph(0), cs(1), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(7), tpv(0), arg(0), holv(0), dd(0), sssp(1), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0) <!-- This would be the record for  Excel
27/11/2004 03:21:45 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Num <!-- next record                              lot_name(i5-l8431), serial_num(3), defect_quantity(1)
27/11/2004 03:21:45 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Info                              lot_name(i5-l8431), serial_num(3), cond(0), space(0), edge(0), ph(0), cs(0), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(1), tpv(0), arg(0), holv(0), dd(0), sssp(0), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0) <!-- This would be the record for  Excel

<!--ETC...
<!-- The following is the last line
27/11/2004 03:24:54 InSpire              InSpire-9060         is-6ddc              htran                NOTICE     Batch.Ended                              last_s_n(12), job_name(8338), layer_name(gnd6), stage_name(COPPER), lot_name(i5-l8431), total_boards(12) <!-- last record


Example web page.
Web page output

StartTime 27/11/2004 03:20:18
Job_Name 8000
Tool = is-6ddc
lot_name(i5-8833)
defect_quantity XX =  would be the defect_quantity summed up for each serial_num and total up.
total_boards(12)
Operator htran
End Time 27/11/2004 03:24:54

I hope that everything is cleaned up regarding my comments.

Again Thanks for your assistance.

0
manav_mathurCommented:
I dont see jobname 8000 anywhere?? Is it 8338.......??

Manav
0
jjcheap1Author Commented:
Hi,

Yes it is.

Sorry for the typo.  Too much copy and paste.


Thanks
0
manav_mathurCommented:
is that you have alternate records for Defect.Num and Defect.Info??

Manav
0
jjcheap1Author Commented:
Hi Manav,

The vendor is not much help. Looking through the data, this is how I understand the log files.

<!-- My comments

27/11/2004 03:21:05 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Num                             lot_name(i5-l8431),<!--Lot Name   serial_num(1), <!-- Unit # that increases with each unit  defect_quantity(1)<!-- # of defects  on serial_num(1)

27/11/2004 03:21:05 InSet              InSet-9060         is-6ddc              htran                NOTICE     Defect.Info                              lot_name(i5-l8431), serial_num(1), <!--{{cond(0), space(0), edge(0), ph(0), cs(0), cut(0), shrt(0), miss(0), padv(0), ovsz(0), smt(0), clr(1), tpv(0), arg(0), holv(0), dd(0), sssp(0), minc(0), maxc(0), ppflt(0), pptrd(0), dbg(0)  }} <!--the defect_quantity is the total between the {{ }}

Thanks
0
manav_mathurCommented:
Ok. I removed the comments in your data file and copy pasted it in my system.
I see that
- every record starts with a time stamp
- the first record is a batch.start
- the last record is a batch.end
- in between, there are alternating records with defect.info and defect.num.

If that is correct, this script should work for you.

#!/usr/bin/perl
use strict;
use warnings;
my @curr_line;
my ($job_name,$start_time,$operator_name,$tool_name);
my ($lot_name,$end_time,$total_boards,$defect,$total_defect) ;
$total_defect = 0;
open(INFILE,"< lot_file.dat") or die("$! : COuldnt open input file") ;
while (<INFILE>) {
if (m/Batch\.Start/i) {
@curr_line = split (/ +/,$_);
$start_time = "$curr_line[0]" . " " . "$curr_line[1]" ;
$tool_name = $curr_line[4] ;
$operator_name = $curr_line[5] ;
($job_name=$curr_line[9]) =~ s/job_name\((.+?)\),/$1/i ;
($lot_name=$curr_line[12]) =~ s/lot_name\((.*)\)\n/$1/i ;
}
if (m/Batch\.End/) {
@curr_line = split (/ +/,$_);
$end_time = "$curr_line[0]" . " " . "$curr_line[1]" ;
($total_boards=$curr_line[13]) =~ s/total_boards\((.*?)\)\n/$1/i ;
}
if (m/Defect\.Num/) {
@curr_line = split (/ +/,$_);
($defect = $curr_line[10]) =~ s/defect_quantity\((.*?)\)\n/$1/si ;
$total_defect = $total_defect+$defect ;
}
}
print "job name : $job_name\nStart time: $start_time\ntool name : $tool_name\nLot name : $lot_name\nOperator Name : $operator_name\n";
print "end time : $end_time\nTotal boards : $total_boards\n" ;
print "total defects : $total_defect\n"



Manav
0
jjcheap1Author Commented:
Thanks,

It has been a crazy day.

Very Cool !!

Thanks for the hand feeding..

Any suggestions for me next time regarding adding comments etc...

Closing this out how do you get the points ?

Again thanks for your assistance

Scott



0
manav_mathurCommented:
Thanks,

It has been a crazy day.
 : Sure has been

Very Cool !!

Thanks for the hand feeding..

Any suggestions for me next time regarding adding comments etc...

Closing this out how do you get the points ?

 : just click on 'accept' next to the comment. Then you will need to grade an answer. if you find my answer satisfactory-> A, slightly off the mark -> B, way off -> C. YOu cant award a D. See the help pages for more.

Again thanks for your assistance
 : anytime

Manav
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Perl

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.