Link to home
Start Free TrialLog in
Avatar of gatorvip
gatorvipFlag for United States of America

asked on

Export XLS to CSV, without Excel?

I have an excel file, in XLS format, which is updated daily. The file is stored on a Windows 2003  server that does not have Office installed - I can install if there's no alternative, but I would rather not. This file needs to be exported into CSV automatically. For this, I would use VBScript, but unfortunately the following command raises an error:

Set xlApp = CreateObject("excel.application")

since Excel is not present on the server. The error is Microsoft VBScript runtime error: ActiveX component can't create object: 'excel.application'

1) Is there any reasonable alternative to this process?
2) Can I accomplish this in PowerShell without installing Office?
Avatar of stavros41
stavros41
Flag of United Kingdom of Great Britain and Northern Ireland image

Can you not just copy the file and rename it to .csv with the vbscript? Does it have more than one tab?
Avatar of gatorvip

ASKER

>>Can you not just copy the file and rename it to .csv with the vbscript?

I attached the VBScript code snippet below. Not sure if by "copy the file" you mean manually (the answer is no, as the idea is to automate this process) or automatically (in which case, that's pretty much what I'm doing, via SaveAs).

>>Does it have more than one tab?

No, single worksheet.
Set xlApp = CreateObject("excel.application")
Set FSO = CreateObject("Scripting.FileSystemObject")
...
Set xlWB = xlApp.Workbooks.Open(fil.Path)            
...
xlWB.SaveAs vOutPath & FSO.GetBaseName(fil) & ".csv", 6

Open in new window

This will rename the spreadsheet:
Set objFSO = CreateObject("Scripting.FileSystemObject")
objFSO.MoveFile "C:\excel.xls" , "C:\excel.csv"

Open in new window

That only renames the file, it doesn't actually convert to CSV.
Avatar of Chris Dent
You can potentially use something like this:

http://exceldatareader.codeplex.com/

PowerShell will quite happily load .NET libraries which just leaves converting the C# examples into PowerShell and seeing how well it does.

Want to give it a try?

Chris
I have done something similar before, but it was a lot of code. I would be tempted to create a c# console app and use stream reader/writer to create the csv.

I suspect that's what the library above does.

Chris
Yes, but that code will only read the excel file, you also need to write it back to a csv file.

I realise that. I'd use the library above, or some equivalent, to get the data out of Excel, make it into a nice collection of objects and pass the lot down to Export-CSV.

As far as I'm concerned getting the data out is the hard part, so if the library does that, writing it back to CSV will be far easier in PS than in C#. In my opinion, of course :)

Chris
Agreed
>>As far as I'm concerned getting the data out is the hard part

which is why I'm posting here...

And here's how it would be done using the library above.

Note that $PWD is Present Working Directory, so Excel.dll (downloaded above) and Book1.xls are both there in my example.

Want to give it a spin?

Chris
[Void]([Reflection.Assembly]::LoadFrom("$($PWD.Path)\Excel.dll"))

$Stream = New-Object IO.FileStream("$($PWD.Path)\Book1.xls", "Open", "Read")
$Reader = [Excel.ExcelReaderFactory]::CreateBinaryReader($Stream)

$Reader.IsFirstRowAsColumnNames = $True

$Reader.AsDataSet().Tables[0].Rows | Select-Object * `
    -Exclude RowError, RowState, Table, ItemArray, HasErrors | `
  Export-CSV "SomeFile.csv"

Open in new window

Chris> to clarify, I would need to install the 3rd party library that you linked earlier, correct? If that's the case, I would much rather install Excel/Office altogether, given that this is a production server.

You need to download it and save the Excel.dll file out, but you don't need to install anything at all. All it needs to do is be somewhere PS can get to it.

The first statement above, LoadFrom, loads the DLL into memory, making it's components accessible to the current PS session. Once PS is closed it's gone and it goes back to being nothing more than a file on your system. It doesn't leave anything any deeper than that.

CodePlex is, in my experience, quite safe, another of those run by MS. I've tested the DLL and had a quick look at the source code, there's nothing troubling inside.

It remains, of course, a third-party library. Trepidation is quite warranted.

Chris
Sorry for the delayed response, I haven't had a chance yet to run the script above.


>>$Reader.IsFirstRowAsColumnNames = $True
It seems there will be a singular empty row in the beginning of the excel file, followed by another row containing column names.
ASKER CERTIFIED SOLUTION
Avatar of Chris Dent
Chris Dent
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I haven't gotten around to implement this, but early testing shows that it works.