Link to home
Start Free TrialLog in
Avatar of Shailesh Shinde
Shailesh ShindeFlag for India

asked on

Excel to CSV conversion with specific columns

Hello All,

Got many references from web for excel (.xlsx or .xls) to csv conversion using perl script like below the one.

#!/usr/bin/env perl

use strict;
use warnings;
use Carp;
use Getopt::Long;
use Pod::Usage;
use File::Basename qw/fileparse/;
use File::Spec;
use Spreadsheet::ParseExcel;
use Spreadsheet::XLSX;

my %args = ();
my $help = undef;
GetOptions(
           \%args,
           'excel=s',
           'sheet=s',
           'man|help'=>\$help,
          ) or die pod2usage(1);

pod2usage(1) if $help;
pod2usage(-verbose=>2, exitstatus=>0, output=>\*STDERR) unless $args{excel} || $args{sheet};

if (_getSuffix($args{excel}) eq ".xls") {
    my $file = File::Spec->rel2abs($args{excel});

    if (-e $file) {
        print _XLS(file=>$file, sheet=>$args{sheet});
    } else {
        die "Error: can not find file";
    }
}
elsif (_getSuffix($args{excel}) eq ".xlsx") {
    my $file = File::Spec->rel2abs($args{excel});

    if (-e $file) {
        print _XLSX(file=>$file, sheet=>$args{sheet});
    } else {
        die "Error: Can not find file";
    }
}

sub _XLS {
    my %opts = (
                file   => undef,
                sheet  => undef,
                @_,
               );

    my $aggregated = ();
    my $parser     = Spreadsheet::ParseExcel->new();
    my $workbook   = $parser->parse($opts{file});

    if (!defined $workbook) {
        croak "Error: $parser->error()";
    }

    foreach my $worksheet ($workbook->worksheet($opts{sheet})) {
        my ($row_min, $row_max) = $worksheet->row_range();
        my ($col_min, $col_max) = $worksheet->col_range();

        foreach my $row ($row_min .. $row_max){
            foreach my $col ($col_min .. $col_max){
                my $cell = $worksheet->get_cell($row, $col);
                if ($cell) {
                    $aggregated .= $cell->value().',';
                } else {
                    $aggregated .= ',';
                }
            }
            $aggregated .= "\n";
        }
    }
    return $aggregated;
}

sub _XLSX {
    my %opts = (
                file   => undef,
                sheet  => undef,
                @_,
               );

    my $aggregated_x = ();
    my $excel = Spreadsheet::XLSX->new($opts{file});

    foreach my $sheet (@{ $excel->{Worksheet} }) {
        if ($sheet->{Name} eq $opts{sheet}) {
            $sheet->{MaxRow} ||= $sheet->{MinRow};

            foreach my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) {
                $sheet->{MaxCol} ||= $sheet->{MinCol};
                foreach my $col ($sheet->{MinCol} ..  $sheet->{MaxCol}) {
                    my $cell = $sheet->{Cells}->[$row]->[$col];
                    if ($cell) {
                        $aggregated_x .= $cell->{Val}.',';
                    }
                }
                $aggregated_x .= "\n";
            }
        }
    }
    return $aggregated_x;
}

sub _getSuffix {
    my $f = shift;
    my ($basename, $dirname, $ext) = fileparse($f, qr/\.[^\.]*$/);
    return $ext;
}


__END__

=head1 NAME

xls2csv - Converting XLS/XLSX file to CSV

=head1 SYNOPSIS

perl xls2csv --excel data.xls|.xlsx --sheet Sheet1

=head1 OPTIONS

 -e,  --excel     Given a XLS or XLSX file.         [Required]
 -s,  --sheet     Given a sheet name of the file.   [Required]
 -h,  --help      Show help messages.

=head1 DESCRIPTION

This program converts .xls and .xlsx file to csv.

=cut

Open in new window



However, I am looking for this conversion with specific columns(column B, Column F and Column G only) data into csv format using perl script.

Any reference or example of this would be really helpful.

Thanks & Best Regards,
Shail
Avatar of FishMonger
FishMonger
Flag of United States of America image

Instead of looping over all of the columns, just grab the contents of the specific cells you want.  I have not run any tests, but it should be something like this:

        foreach my $row ($row_min .. $row_max){
            my @cells = $worksheet->get_cell($row, 2),
                        $worksheet->get_cell($row, 6),
                        $worksheet->get_cell($row, 7);
                        
            # process array as needed
            # personnaly, I'd output it directly to the csv file
            # via Text::CSV_XS
        }

Open in new window

Avatar of Shailesh Shinde

ASKER

Hi,
Thanks for the reply!
I have to copy full column data (column B, Column F and Column G only)  instead of specific row.

Thanks,
Shail
ASKER CERTIFIED SOLUTION
Avatar of FishMonger
FishMonger
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
On an OT note, you have some code duplication which you might want to refactor to something like this.
my $file = File::Spec->rel2abs($args{excel});
if (! -e $file) { die "Error: can not find file"; }

my $suffix = _getSuffix($args{excel});

if ($suffix eq ".xls") {
    print _XLS(file=>$file, sheet=>$args{sheet});
}
elsif ($suffix eq ".xlsx") {
    print _XLSX(file=>$file, sheet=>$args{sheet});
}

Open in new window

Thanks! This approach seems to be easier to do.