xlsx/xls -> csv conversion

Hello,

***UPDATE****
As I use OpenOffice on Windows 7 for manually making the work that follows, maybe that an OpenOffice Guru can find a better solution than a Perl script, so I've added also OpenOffice in the topics
***************

I'm searching for a good solution for converting Windows xlsx/xls files (Windows 1252 WinLatin 1) in csv (UTF-8) files with a script like:

perl script.pl<file.xlsx>file.csv

my environment is Windows 7 with ActiveState Perl 5.24.0

The script should also "cut-off" the first line & the first column of every Excel spreadsheet

Thanks a lot,

Fabiano
fabiano petroneAsked:
Who is Participating?
 
Bill PrewConnect With a Mentor Commented:
This would be TAB delimited, see if this works for you.

Const xlCSV = 6
Const xlTextWindows = 20

strDir = "B:\EE\EE29066306\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
objExcel.DisplayAlerts = False

For Each objFile In objFSO.GetFolder(strDir).Files
    strExt = objFSO.GetExtensionName(objFile.Path)
    If LCase(Left(strExt, 3)) = "xls" Then
        Wscript.Echo "Reading file: [" & objFile.Path & "]"
        Set objWorkbook = objExcel.Workbooks.Open(objFile.Path, False, True)
        strPath = Replace(objFile.Path, "." & strExt, ".csv")
        Wscript.Echo "Writing file: [" & strPath & "]"
        objWorkbook.SaveAs strPath, xlTextWindows
        objWorkbook.Close False
        Set objWorkbook = Nothing
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing

Open in new window


»bp
0
 
Bill PrewCommented:
Just to be clear, you don't have Microsoft Office installed, only Apache OpenOffice?


»bp
0
 
tel2Commented:
Hi fabiano,

After you've answered Bill's question, here's a possible starting point for .xls files:

Have a look at this script on CPAN:
  http://search.cpan.org/~ken/xls2csv-1.06/script/xls2csv
Here's how it's run:
  xls2csv -x "1252spreadsheet.xls" -b WINDOWS-1252 -c "ut8csvfile.csv" -a UTF-8

However, this is just a starting point, because:
- It seems to handle .xls files only, not .xlsx.
- It would need some modification to cut off row 1 & column A.
If you don't get any better offers, you might like to try it to see if it works in all other respects, though, and then report back.

For xlsx files, maybe these posts from PerlMonks might help:
  http://perlmonks.org/?node=xlsx+csv

And the Spreadsheet::Read module seems to be compatible with .xls, .xlsx, etc.
  http://search.cpan.org/~hmbrand/Spreadsheet-Read-0.74/Read.pm

What sheet of the workbook should your script work on?  Just the first one?  What would that sheet be called?  "Sheet1"?
0
Upgrade your Question Security!

Your question, your audience. Choose who sees your identity—and your question—with question security.

 
fabiano petroneAuthor Commented:
Hi, Bill
yes: for now only OOffice and the files are now "manually saved" in UTF8 csv format (with "tab" as field separator).
To tell the truth I can also go to Excel if no solutions are available: do you have a script for this platform?
to summarize, I need:
1) a command line script that has the xls file as an input and the csv file as an output, say something:
script <file.xls>file.csv
In fact I've a lot of files to process, and I want to put all the lines in an unique .bat
2) the script should crop the first line & the first column of every xls file
2) the content of the xls file should be saved in csv, UTF8 encoding & "tab" separator

Thanks a lot,
Fabiano
0
 
fabiano petroneAuthor Commented:
Hi, tel2
Thanks for the suggestions.
Regarding xlx2csv I've installed it on my active perl environment, but your sample give me a "bad parameters" error.
calling the .pl file alone I obtain the following output:

Usage: C:\Perl64\eg\alma\personale\test\xls2csv.pl filename sheet-colrow:colrow
[-rotate]
   eg: C:\Perl64\eg\alma\personale\test\xls2csv.pl filename.xls 1-A1:B12
       C:\Perl64\eg\alma\personale\test\xls2csv.pl filename.xls A1:M1 -rotate

Open in new window


so it seems that I must know the number of the rows and of the columns before launching the script, and that's no good.
I'll give a try also to your other suggestions: Thanks
Fabiano
0
 
Bill PrewCommented:
In Windows, with Microsoft Excel installed, this would be pretty easy to do in a VBscript (VBS file).  Since you have a number of files to process, I would say let the script process all the files in a given directory, this would be more efficient since a copy of Excel would not have to be started and stopped for each file, the same instance could be used.

I don't have one that would be exactly what you are looking for, but this is pretty close and could be a good starting point if we went this way.  Just to give you an idea of the coding.  It could be enhanced to make the other changes to the file you need inside the loop.

Const cExcelCSV = 6

strDir = "C:\Temp\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
objExcel.DisplayAlerts = False

For Each objFile In objFSO.GetFolder(strDir).Files
    strExt = objFSO.GetExtensionName(objFile.Path)
    If LCase(Left(strExt, 3)) = "xls" Then
        Set objWorkbook = objExcel.Workbooks.Open(objFile.Path, False, True)
        objWorkbook.SaveAs Replace(strPath, "." & strExt, ".csv"), cExcelCSV
        objWorkbook.Close False
        Set objWorkbook = Nothing
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing

Open in new window



»bp
0
 
fabiano petroneAuthor Commented:
Hi, Bill
Thanks a lot for your script. I've modified only the line:
strDir = "C:\Perl64\eg\alma\files"
Then I've launched the script and I've obtained the attached error (an 800A03EC error)
sorry for the italian.
I try to traslate:
row 15 char 9: impossible to open the file.
try one of the following options: etc.
Thanks,
Fabiano
error.jpg
0
 
Bill PrewCommented:
Sorry, missed a change as I re-purposed a prior script, adjust the following line:

       objWorkbook.SaveAs Replace(objFilePath, "." & strExt, ".csv"), cExcelCSV


»bp
0
 
fabiano petroneAuthor Commented:
hi,
just substituted the line but I've obtained the same error.
I enclose also a sample.xls so you can test your script
Thanks,
Fabiano
sample.xls
0
 
Bill PrewCommented:
Very sorry, seem to have lost a period there...

objWorkbook.SaveAs Replace(objFile.Path, "." & strExt, ".csv"), cExcelCSV

Assuming you are running this on Windows you might want to run from a command prompt window too, with a command line like:

cscript yourscript.vbs

That way you will see any errors or other logging we might add to the console.


»bp
0
 
fabiano petroneAuthor Commented:
Hi,
done, but the output is more or less the same.
I enclose a screenshot
bye,
Fabiano
console.jpg
0
 
Bill PrewCommented:
Did you add the period into line 15 per my last post, that fixed it here.


»bp
0
 
Bill PrewCommented:
Here is a version with a couple of displays of the input and output files it processes, run this if still getting error.

Const cExcelCSV = 6

strDir = "B:\EE\EE29066306\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = False
objExcel.DisplayAlerts = False

For Each objFile In objFSO.GetFolder(strDir).Files
    strExt = objFSO.GetExtensionName(objFile.Path)
    If LCase(Left(strExt, 3)) = "xls" Then
        Wscript.Echo "Reading file: [" & objFile.Path & "]"
        Set objWorkbook = objExcel.Workbooks.Open(objFile.Path, False, True)
        strPath = Replace(objFile.Path, "." & strExt, ".csv")
        Wscript.Echo "Writing file: [" & strPath & "]"
        objWorkbook.SaveAs strPath, cExcelCSV
        objWorkbook.Close False
        Set objWorkbook = Nothing
    End If
Next

objExcel.Quit
Set objExcel = Nothing
Set objFSO = Nothing

Open in new window


»bp
0
 
fabiano petroneAuthor Commented:
Hi, Bill
now it's OK: but the separator should be the "tab" and not the "comma" (see my 2h ago comment).
Thanks,
Fabiano
0
 
Bill PrewCommented:
Yes, I wasn't proposing this as a full solution, just a sample of how it would look.

If you think this could work for you I will adjust further.


»bp
0
 
fabiano petroneAuthor Commented:
Hi,
yes, surely it could work for me: if you insert the "tab" separator instead of the "comma" it's perfectly suited to my needs
Thanks,
Fabiano
0
 
fabiano petroneAuthor Commented:
Hi, perfect Thanks :=)
0
 
tel2Commented:
Hi Bill,
Well done in your skill and persistence in solving this problem!
I don't know much VBA, but what part of your code does this:
   '"cut-off" the first line & the first column of every Excel spreadsheet'?
And what part does converts from Windows 1252 WinLatin 1 to UTF-8?
And does your code just process the first sheet of each workbook, or what?

Hi Fabiano,
> 'Regarding xlx2csv I've installed it on my active perl environment, but your sample give me a "bad parameters" error.'
Without seeing the exact command you issued, I can't help you.  I've just tested it myself, with the sample parameters I gave in my post, and it worked for me on Linux.  But it seems you have a simpler and more complete solution from Bill now, anyway.

tel2
0
 
Bill PrewCommented:
Here is a version that will remove the tor row and first column.

The code page conversion should have been happening by default, but just in case here is a small adjustment to make sure.

Const xlCSV = 6
Const xlTex = -4158
Const xlTextWindows = 20
Const msoEncodingUTF8 = 65001

strDir = "B:\EE\EE29066306\Files"

Set objFSO = CreateObject("Scripting.FileSystemObject")

With CreateObject("Excel.Application")
    .Visible = False
    .DisplayAlerts = False

    For Each objFile In objFSO.GetFolder(strDir).Files
        strExt = objFSO.GetExtensionName(objFile.Path)
        If LCase(Left(strExt, 3)) = "xls" Then
            Wscript.Echo "Reading file: [" & objFile.Path & "]"
            With .Workbooks.Open(objFile.Path, False, True)
                With .Sheets(1)
                    .Rows(1).Delete
                    .Columns(1).Delete
                End With
                strPath = Replace(objFile.Path, "." & strExt, ".csv")
                Wscript.Echo "Writing file: [" & strPath & "]"
                .WebOptions.Encoding = msoEncodingUTF8
                .SaveAs strPath, xlTextWindows
                .Close False
            End With
        End If
    Next

    .Quit
End With

Set objFSO = Nothing

Open in new window



»bp
0
 
tel2Commented:
Nice work, Bill!  Seems to work for me.  Give yourself a pay rise.

BTW, what's the main advantage of doing this in VBScript vs VBA?  Is it just to be able to have the program in a little plain text file without the need to embed it in a spreadsheet?
Would the code be much different to doing in VBA?

TRS
0
 
Bill PrewCommented:
Code would be very similar, yes.  For "batch" things like this I find VBS a better fit, where there is no need for a UI, etc.


»bp
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.