sunny82
asked on
How do I solve this Perl Spreadsheet Parseexcel problem
Hi all,
For the attached excel sheet, I am going through all the columns and printing out the contents into an array and then to a file. But in my excel sheet, the data rows are arranged like this
File1.txt /ms/user/s/sayantag/ 1 Header1 Header2 Header3 Header4 Header5
File1.txt /ms/user/s/sayantag/ 1 Data1 Data2 Data3 Data4 Data5
File1.txt /ms/user/s/sayantag/ 1 Data6 Data7 Data8 Data9 Data10
File1.txt /ms/user/s/sayantag/ 1 Data11 Data12 Data13 Data14 Data15
In the output array and hence into the file, the data gets populated one after another like Data1Data2Data3Data4Data5D ata6Data7D ata8Data9D ata10Data1 1Data12Dat a13Data14D ata15.
Where am I going wrong? Should I put the contents of an entire row of data into another array and then print it out into the file?
Pls help asap with the code
The code can be run with--
perl Process_MF.pl 1 3 5 1 4 2 2
For the attached excel sheet, I am going through all the columns and printing out the contents into an array and then to a file. But in my excel sheet, the data rows are arranged like this
File1.txt /ms/user/s/sayantag/ 1 Header1 Header2 Header3 Header4 Header5
File1.txt /ms/user/s/sayantag/ 1 Data1 Data2 Data3 Data4 Data5
File1.txt /ms/user/s/sayantag/ 1 Data6 Data7 Data8 Data9 Data10
File1.txt /ms/user/s/sayantag/ 1 Data11 Data12 Data13 Data14 Data15
In the output array and hence into the file, the data gets populated one after another like Data1Data2Data3Data4Data5D
Where am I going wrong? Should I put the contents of an entire row of data into another array and then print it out into the file?
Pls help asap with the code
The code can be run with--
perl Process_MF.pl 1 3 5 1 4 2 2
#!/usr/bin/perl5.10 -w
use warnings;
use Spreadsheet::ParseExcel;
use strict;
use IO::Scalar;
use Data::Dumper;
my @etl_val = @ARGV;
print "@etl_val" . "\n" ;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse("/ms/user/s/sayantag/Demo_MF2.xls");
my $worksheet;
my %order_of_flat_file;
my @data_arr;
my $data;
my $etl;
my $path;
my @fin_arr;
for $worksheet ( $workbook->worksheets() ) {
my ( $row_min, $row_max ) = $worksheet->row_range();
my ( $col_min, $col_max ) = $worksheet->col_range();
my $row = $row_min + 2;
while( $row <=$row_max ) {
print "Row before if statement is $row\n";
if ($worksheet->get_cell( $row, 2 )->value ne "ETL") {
print "Row after if statement is $row\n";
my $etl_cell = $worksheet->get_cell( $row, 2 );
if(defined($etl_cell)) {
$etl = $etl_cell->value;}
my $path_cell = $worksheet->get_cell( $row, 1 );
if(defined($path_cell)) {
$path = $path_cell->value;
$path =~ s/\s$//;
print "PATH: $path" . "\n";}
my $path_cell0 = $worksheet->get_cell( $row, 0 );
if(defined($path_cell0)) {
$path.= $path_cell0->value;
print $path_cell0->value . "\n";
print "PATH: $path" . "\n";
print "ETL: $etl" . "\n";
print "PATH: $path" . "\n";
}
for my $col ( $col_min+3 .. $col_max ) {
print "ROW : $row" ."\n";
print "COLUMN: $col" . "\n";
my $cell = $worksheet->get_cell( $row, $col );
if (defined($cell) ) {
my $cell_val = $cell->value;
print "Cell value is $cell_val\n";
push(@data_arr, split("\t", $cell_val));
# @data_arr = join ("\n", @data_arr);
# print "The first array element is $data_arr[0]\n";
print "Content of '", $path, "' will be:\n", @data_arr, "\n";
# foreach my $val(@data_arr) {$val=~ s/\t{1,}/\|/g;}
}
}
$row++;
}
else { print "The array looks like @data_arr\n";
open("FILEHANDLE", '>', $path) or die "Cannot open the file ", $path, ": $!";
print FILEHANDLE @data_arr ;
close ("FILEHANDLE");
@data_arr = ();
$row = $row+2;
print "Value of row before calling next is $row\n";
next;
}
}
}
#print "The array is @data_arr" . "\n";
Demo-MF2.xls
ASKER
IF I do ---
print FILEHANDLE join("\t", @data_arr); # print with tab between elemennts
print FILEHANDLE join(',', @data_arr); # separate elements with commas
the elements are getting printed as Data1 Data2 Data 3........Data15 in the file depending on if it is a space or a tab. But that is not what I need. If you see my attached excel spreadsheet, the data values are all in separate rows. Like Data1 to Data5 are in separate cells in a single row. Similarly Data6 to Data10 and Data11 to Data15. Like --
File1.txt /ms/user/s/sayantag/ 1 Header1 Header2 Header3 Header4 Header5
File1.txt /ms/user/s/sayantag/ 1 Data1 Data2 Data3 Data4 Data5
File1.txt /ms/user/s/sayantag/ 1 Data6 Data7 Data8 Data9 Data10
File1.txt /ms/user/s/sayantag/ 1 Data11 Data12 Data13 Data14 Data15
So in the file I want the data to be in the exact same way just separated by pipes.
So in my output File1.txt, I want the data to be in separate rows like in the excel sheet like this--
Data1|Data2|Data3|Data4|Da ta5
Data6|Data7|Data8|Data9|Da ta10
Data11|Data12|Data13|Data1 4|Data15
How do I do that? Thanks for your help, really appreciate it and eagerly waiting for your answer.
print FILEHANDLE join("\t", @data_arr); # print with tab between elemennts
print FILEHANDLE join(',', @data_arr); # separate elements with commas
the elements are getting printed as Data1 Data2 Data 3........Data15 in the file depending on if it is a space or a tab. But that is not what I need. If you see my attached excel spreadsheet, the data values are all in separate rows. Like Data1 to Data5 are in separate cells in a single row. Similarly Data6 to Data10 and Data11 to Data15. Like --
File1.txt /ms/user/s/sayantag/ 1 Header1 Header2 Header3 Header4 Header5
File1.txt /ms/user/s/sayantag/ 1 Data1 Data2 Data3 Data4 Data5
File1.txt /ms/user/s/sayantag/ 1 Data6 Data7 Data8 Data9 Data10
File1.txt /ms/user/s/sayantag/ 1 Data11 Data12 Data13 Data14 Data15
So in the file I want the data to be in the exact same way just separated by pipes.
So in my output File1.txt, I want the data to be in separate rows like in the excel sheet like this--
Data1|Data2|Data3|Data4|Da
Data6|Data7|Data8|Data9|Da
Data11|Data12|Data13|Data1
How do I do that? Thanks for your help, really appreciate it and eagerly waiting for your answer.
I've got to run now but I'll take a closer look at your code later tonight.
Generally, you need to output a line at a time. If you know there will always be 5 data items per line, then you can do the easy fix:
my @tmp = @data_arr;
while (@tmp) {
print FILEHANDLE join('|', splice(@tmp, 0, 5)), "\n";
}
Generally, you need to output a line at a time. If you know there will always be 5 data items per line, then you can do the easy fix:
my @tmp = @data_arr;
while (@tmp) {
print FILEHANDLE join('|', splice(@tmp, 0, 5)), "\n";
}
ASKER
I dont know if there will be 5 data items per line. There can be any no. of data items in a single excel spreadsheet row. Pls help me when you have time as I am really stuck and dont know what to do . As I said, In the output file File1.txt, I need the data rows in the same way as in the excel sheet. Like
Data1|Data2|Data3|Data4|Da ta5
Data6|Data7|Data8|Data9|Da ta10
Data11|Data12|Data13|Data1 4|Data15
At present my code goes through every row, finds out the etl and the path value and then calculates cell value for every column in that row and puts that value into the array. Then it does a $row++ to come to the next row and if it finds the header ETL, outputs the value into the file and empties the array and calls next to go to the next row iteration.
I am getting the output
Data1Data2Data3Data4Data5D ata6Data7D ata8Data9D ata10Data1 1Data12Dat a13Data14D ata15
while the output will be as above. Hope this will help.
Thanks.
Data1|Data2|Data3|Data4|Da
Data6|Data7|Data8|Data9|Da
Data11|Data12|Data13|Data1
At present my code goes through every row, finds out the etl and the path value and then calculates cell value for every column in that row and puts that value into the array. Then it does a $row++ to come to the next row and if it finds the header ETL, outputs the value into the file and empties the array and calls next to go to the next row iteration.
I am getting the output
Data1Data2Data3Data4Data5D
while the output will be as above. Hope this will help.
Thanks.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks a lot. It really worked. Really appreciate all your help.
ASKER
Hi,
I got the solution yesterday for this problem. But wanted to ask a related problem. Attached is the excel sheet I want to parse. It is working fine if all the data fields are there, then it puts a pipe between every data field and outputs it into the file. But if a data field is blank it should also put a pipe in its place.
For e.g.
if in my excel sheet,
File4.txt /ms/user/s/sayantag/ 4 Header13 Header14 Header15
File4.txt /ms/user/s/sayantag/ 4 Data26 Data27 Data30
File4.txt /ms/user/s/sayantag/ 4 Data28 Data29 Data31
The output for the code is Data26|Data27|Data30
Data28|Data29|Data31
which is correct.
But if the excel file has any blank data row like
File4.txt /ms/user/s/sayantag/ 4 Header13 Header14 Header15
File4.txt /ms/user/s/sayantag/ 4 Data26 Data30
File4.txt /ms/user/s/sayantag/ 4 Data28 Data29 Data31
The output in File4.txt should put a pipe in its place
i.e. the output should be Data26||Data30
Data28|Data29|Data30
Data28|Data29
How can I do that?
Pls help with the code asap.
Demo-MF4.xls
I got the solution yesterday for this problem. But wanted to ask a related problem. Attached is the excel sheet I want to parse. It is working fine if all the data fields are there, then it puts a pipe between every data field and outputs it into the file. But if a data field is blank it should also put a pipe in its place.
For e.g.
if in my excel sheet,
File4.txt /ms/user/s/sayantag/ 4 Header13 Header14 Header15
File4.txt /ms/user/s/sayantag/ 4 Data26 Data27 Data30
File4.txt /ms/user/s/sayantag/ 4 Data28 Data29 Data31
The output for the code is Data26|Data27|Data30
Data28|Data29|Data31
which is correct.
But if the excel file has any blank data row like
File4.txt /ms/user/s/sayantag/ 4 Header13 Header14 Header15
File4.txt /ms/user/s/sayantag/ 4 Data26 Data30
File4.txt /ms/user/s/sayantag/ 4 Data28 Data29 Data31
The output in File4.txt should put a pipe in its place
i.e. the output should be Data26||Data30
Data28|Data29|Data30
Data28|Data29
How can I do that?
Pls help with the code asap.
Demo-MF4.xls
Sorry I didn't get back sooner. My hard drive crashed yesterday, and I've been reinstalling everything from scratch!
I believe you need to remove line 60:
if (defined($cell) ) {
and the end of the block, line 68
}
What line 60 is saying is, Add this cell only if there is somethng in it.
I believe you need to remove line 60:
if (defined($cell) ) {
and the end of the block, line 68
}
What line 60 is saying is, Add this cell only if there is somethng in it.
If so, you need to do something to create "space" between the elements of the array.
print FILEHANDLE "@data_arr"; # print with single space between elements
print FILEHANDLE join("\t", @data_arr); # print with tab between elemennts
print FILEHANDLE join(',', @data_arr); # separate elements with commas