Avatar of fabiano petrone
fabiano petroneFlag for Italy

asked on 

xlsx/xls -> csv conversion

Hello,

***UPDATE****
As I use OpenOffice on Windows 7 for manually making the work that follows, maybe that an OpenOffice Guru can find a better solution than a Perl script, so I've added also OpenOffice in the topics
***************

I'm searching for a good solution for converting Windows xlsx/xls files (Windows 1252 WinLatin 1) in csv (UTF-8) files with a script like:

perl script.pl<file.xlsx>file.csv

my environment is Windows 7 with ActiveState Perl 5.24.0

The script should also "cut-off" the first line & the first column of every Excel spreadsheet

Thanks a lot,

Fabiano
PerlProgrammingOffice Suites-OtherScripting Languages

Avatar of undefined
Last Comment
Bill Prew
Avatar of Bill Prew
Bill Prew

Just to be clear, you don't have Microsoft Office installed, only Apache OpenOffice?


»bp
Avatar of tel2
tel2
Flag of New Zealand image

Hi fabiano,

After you've answered Bill's question, here's a possible starting point for .xls files:

Have a look at this script on CPAN:
  http://search.cpan.org/~ken/xls2csv-1.06/script/xls2csv
Here's how it's run:
  xls2csv -x "1252spreadsheet.xls" -b WINDOWS-1252 -c "ut8csvfile.csv" -a UTF-8

However, this is just a starting point, because:
- It seems to handle .xls files only, not .xlsx.
- It would need some modification to cut off row 1 & column A.
If you don't get any better offers, you might like to try it to see if it works in all other respects, though, and then report back.

For xlsx files, maybe these posts from PerlMonks might help:
  http://perlmonks.org/?node=xlsx+csv

And the Spreadsheet::Read module seems to be compatible with .xls, .xlsx, etc.
  http://search.cpan.org/~hmbrand/Spreadsheet-Read-0.74/Read.pm

What sheet of the workbook should your script work on?  Just the first one?  What would that sheet be called?  "Sheet1"?
Avatar of fabiano petrone

ASKER

Hi, Bill
yes: for now only OOffice and the files are now "manually saved" in UTF8 csv format (with "tab" as field separator).
To tell the truth I can also go to Excel if no solutions are available: do you have a script for this platform?
to summarize, I need:
1) a command line script that has the xls file as an input and the csv file as an output, say something:
script <file.xls>file.csv
In fact I've a lot of files to process, and I want to put all the lines in an unique .bat
2) the script should crop the first line & the first column of every xls file
2) the content of the xls file should be saved in csv, UTF8 encoding & "tab" separator

Thanks a lot,
Fabiano
Avatar of fabiano petrone

ASKER

Hi, tel2
Thanks for the suggestions.
Regarding xlx2csv I've installed it on my active perl environment, but your sample give me a "bad parameters" error.
calling the .pl file alone I obtain the following output:

Usage: C:\Perl64\eg\alma\personale\test\xls2csv.pl filename sheet-colrow:colrow
[-rotate]
   eg: C:\Perl64\eg\alma\personale\test\xls2csv.pl filename.xls 1-A1:B12
       C:\Perl64\eg\alma\personale\test\xls2csv.pl filename.xls A1:M1 -rotate

Open in new window


so it seems that I must know the number of the rows and of the columns before launching the script, and that's no good.
I'll give a try also to your other suggestions: Thanks
Fabiano
Avatar of Bill Prew
Bill Prew

In Windows, with Microsoft Excel installed, this would be pretty easy to do in a VBscript (VBS file).  Since you have a number of files to process, I would say let the script process all the files in a given directory, this would be more efficient since a copy of Excel would not have to be started and stopped for each file, the same instance could be used.

I don't have one that would be exactly what you are looking for, but this is pretty close and could be a good starting point if we went this way.  Just to give you an idea of the coding.  It could be enhanced to make the other changes to the file you need inside the loop.

Const cExcelCSV = 6

strDir = "C:\Temp\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
objExcel.DisplayAlerts = False

For Each objFile In objFSO.GetFolder(strDir).Files
    strExt = objFSO.GetExtensionName(objFile.Path)
    If LCase(Left(strExt, 3)) = "xls" Then
        Set objWorkbook = objExcel.Workbooks.Open(objFile.Path, False, True)
        objWorkbook.SaveAs Replace(strPath, "." & strExt, ".csv"), cExcelCSV
        objWorkbook.Close False
        Set objWorkbook = Nothing
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing

Open in new window



»bp
Avatar of fabiano petrone

ASKER

Hi, Bill
Thanks a lot for your script. I've modified only the line:
strDir = "C:\Perl64\eg\alma\files"
Then I've launched the script and I've obtained the attached error (an 800A03EC error)
sorry for the italian.
I try to traslate:
row 15 char 9: impossible to open the file.
try one of the following options: etc.
Thanks,
Fabiano
error.jpg
Avatar of Bill Prew
Bill Prew

Sorry, missed a change as I re-purposed a prior script, adjust the following line:

       objWorkbook.SaveAs Replace(objFilePath, "." & strExt, ".csv"), cExcelCSV


»bp
Avatar of fabiano petrone

ASKER

hi,
just substituted the line but I've obtained the same error.
I enclose also a sample.xls so you can test your script
Thanks,
Fabiano
sample.xls
Avatar of Bill Prew
Bill Prew

Very sorry, seem to have lost a period there...

objWorkbook.SaveAs Replace(objFile.Path, "." & strExt, ".csv"), cExcelCSV

Assuming you are running this on Windows you might want to run from a command prompt window too, with a command line like:

cscript yourscript.vbs

That way you will see any errors or other logging we might add to the console.


»bp
Avatar of fabiano petrone

ASKER

Hi,
done, but the output is more or less the same.
I enclose a screenshot
bye,
Fabiano
console.jpg
Avatar of Bill Prew
Bill Prew

Did you add the period into line 15 per my last post, that fixed it here.


»bp
Avatar of Bill Prew
Bill Prew

Here is a version with a couple of displays of the input and output files it processes, run this if still getting error.

Const cExcelCSV = 6

strDir = "B:\EE\EE29066306\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
objExcel.DisplayAlerts = False

For Each objFile In objFSO.GetFolder(strDir).Files
    strExt = objFSO.GetExtensionName(objFile.Path)
    If LCase(Left(strExt, 3)) = "xls" Then
        Wscript.Echo "Reading file: [" & objFile.Path & "]"
        Set objWorkbook = objExcel.Workbooks.Open(objFile.Path, False, True)
        strPath = Replace(objFile.Path, "." & strExt, ".csv")
        Wscript.Echo "Writing file: [" & strPath & "]"
        objWorkbook.SaveAs strPath, cExcelCSV
        objWorkbook.Close False
        Set objWorkbook = Nothing
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing

Open in new window


»bp
Avatar of fabiano petrone

ASKER

Hi, Bill
now it's OK: but the separator should be the "tab" and not the "comma" (see my 2h ago comment).
Thanks,
Fabiano
Avatar of Bill Prew
Bill Prew

Yes, I wasn't proposing this as a full solution, just a sample of how it would look.

If you think this could work for you I will adjust further.


»bp
Avatar of fabiano petrone

ASKER

Hi,
yes, surely it could work for me: if you insert the "tab" separator instead of the "comma" it's perfectly suited to my needs
Thanks,
Fabiano
ASKER CERTIFIED SOLUTION
Avatar of Bill Prew
Bill Prew

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Avatar of fabiano petrone

ASKER

Hi, perfect Thanks :=)
Avatar of tel2
tel2
Flag of New Zealand image

Hi Bill,
Well done in your skill and persistence in solving this problem!
I don't know much VBA, but what part of your code does this:
   '"cut-off" the first line & the first column of every Excel spreadsheet'?
And what part does converts from Windows 1252 WinLatin 1 to UTF-8?
And does your code just process the first sheet of each workbook, or what?

Hi Fabiano,
> 'Regarding xlx2csv I've installed it on my active perl environment, but your sample give me a "bad parameters" error.'
Without seeing the exact command you issued, I can't help you.  I've just tested it myself, with the sample parameters I gave in my post, and it worked for me on Linux.  But it seems you have a simpler and more complete solution from Bill now, anyway.

tel2
SOLUTION
Avatar of Bill Prew
Bill Prew

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
Avatar of tel2
tel2
Flag of New Zealand image

Nice work, Bill!  Seems to work for me.  Give yourself a pay rise.

BTW, what's the main advantage of doing this in VBScript vs VBA?  Is it just to be able to have the program in a little plain text file without the need to embed it in a spreadsheet?
Would the code be much different to doing in VBA?

TRS
Avatar of Bill Prew
Bill Prew

Code would be very similar, yes.  For "batch" things like this I find VBS a better fit, where there is no need for a UI, etc.


»bp
Programming
Programming

Programming includes both the specifics of the language you’re using, like Visual Basic, .NET, Java and others, but also the best practices in user experience and interfaces and the management of projects, version control and development. Other programming topics are related to web and cloud development and system and hardware programming.

55K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo