Link to home
Start Free TrialLog in
Avatar of Luis Diaz
Luis DiazFlag for Colombia

asked on

VB Script: how to convert xls files into csv files

Hello,

I have a xls files and I will like to convert this file into csv format with ";" delimiter without using objExcel = CreateObject("Excel.Application").

This is what I use with CreateObject("Excel.Application")


WorkingDir = "C:\API\mstt-ecube-import\out"
Extension = ".XLS"

Dim fso, myFolder, fileColl, aFile, FileName, SaveName
Dim objExcel,objWorkbook

Set fso = CreateObject("Scripting.FilesystemObject")
Set myFolder = fso.GetFolder(WorkingDir)
Set fileColl = myFolder.Files

Set objExcel = CreateObject("Excel.Application")

objExcel.Visible = False
objExcel.DisplayAlerts= False

For Each aFile In fileColl
ext = Right(aFile.Name,4)
If UCase(ext) = UCase(extension) Then
'open excel
FileName = Left(aFile,InStrRev(aFile,"."))
Set objWorkbook = objExcel.Workbooks.Open(aFile)
SaveName = FileName & "csv"
objWorkbook.SaveAs SaveName, 23
objWorkbook.Close
End If  
Next

Set objWorkbook = Nothing
Set objExcel = Nothing
Set fso = Nothing
Set myFolder = Nothing
Set fileColl = Nothing


Thank you for your help.

Regards,
Avatar of Kimputer
Kimputer

Without Set objExcel = CreateObject("Excel.Application") you have no way to interpret the Excel file.
Do you want an alternative, like OpenOffice? Or you want to do it VBScript only (binary manipulation, highly unlikely it can be done).
Avatar of Luis Diaz

ASKER

Hello,

The problem of using Set objExcel = CreateObject("Excel.Application") is that I cannot running the script with task sheduler and in particular with the option Run whether user is logged on or not, this is why I don't want to use objExcel. Do you have another alternative?

Thank you.
Strange, I would've expected it to work.

My next suggestion is using a console app from VS Express, add a reference to the MS Excel object, and completing the code with

Imports Excel = Microsoft.Office.Interop.Excel

Open in new window

and

     
Dim appXL As Excel.Application
      Dim wbXl As Excel.Workbook
      Dim shXL As Excel.Worksheet
      Dim raXL As Excel.Range

      appXL = CreateObject("Excel.Application")
      appXL.Visible = True
      wbXl = appXL.Workbooks.Add
      shXL = wbXl.ActiveSheet

Open in new window


There's still a createobject, but I hope it won't interfere with task scheduler.
Surprised you can't do this with scheduled task, will have to give it a try.  As has been said you need something to interpret the XLS binary file and then use Excel to Save it as, or to manually loop through the contents and create a text file ... as long as you have Excel on the relevent machine should be OK.

When you say scheduled task, is this a file that gets dropped from some automated process, or wanting a user to put a file in a location and have it converted etc?  Could the user do a process within Excel if so?
I run the script through windows task scheduler on a server.
I can run the script as soon as I am logged into the server. What I want is to be able to run the script with the option "user is logged on or not", otherwise I need to have an active session on the server to be able to run the script. I know that all my vbscript can be run with this option as long as I don't use  CreateObject("Excel.Application").
Thank you for your help.
I have not tried it, but I could see issues trying to run that script without being logged in since it tries to create a new instance of Excel to run.  You can always use ADO to connect to an Excel File (I have had to use this for code where Excel is not installed on the machine).  You will have to read the data and then write out line by line to a csv, but it should work.

Give this a try:

' ExpertExchange Question ID Q_28618427
' http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_28618427.html
' Expert: ltlbearand3 [http://www.experts-exchange.com/M_2469312.html]
'
Option Explicit

Dim objConn, rsData,strConnString, rsSheet, strSheetName, adSchemaTables
Dim fso, myFolder, fileColl, aFile, FileName, SaveName, WorkingDir, Extension
Dim objNewFile, i, strLine

Set objConn = CreateObject("ADODB.Connection")
Set rsData = CreateObject("ADODB.Recordset")
adSchemaTables = 20

WorkingDir = "C:\API\mstt-ecube-import\out"
Extension = "xls"

Set fso = CreateObject("Scripting.FilesystemObject")
Set myFolder = fso.GetFolder(WorkingDir)
Set fileColl = myFolder.Files

For Each aFile In fileColl
	If fso.GetExtensionName(aFile) = Extension then
		Set objNewFile = fso.CreateTextFile(WorkingDir & "\" & fso.GetBaseName(aFile) & ".csv")
	
		' Check if Connection is already Open
		strConnString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & aFile
		strConnString = strConnString & ";Extended Properties=""Excel 8.0;HDR=NO;"""
		
		objConn.Open strConnString
		
		' Get the Name of the Sheet
		Set rsSheet = objConn.OpenSchema(adSchemaTables)
				
		If Not rsSheet.EOF Or Not rsSheet.BOF Then
		    strSheetName = rsSheet.Fields("TABLE_NAME").Value
		End If

		' Execute the Query
		rsData.Open "Select * from [" & strSheetName & "]", objConn
		
		Do While Not rsData.EOF
			strLine = ""
			For i = 0 to rsData.Fields.Count - 1
				strLine = strLine & rsData.Fields(i).Value & ","
			Next
			strLine = Left(strLine, Len(strLine) - 1)
			objNewFile.WriteLine strLine
			rsData.MoveNext
		Loop
		
		objNewFile.Close
		objConn.Close
	End If
Next

Open in new window

@ltlbearand3

It seems that I can not access ADODB.Connection, I have an error message at line 30, could please let me know how should I proceed? My version of Excel is 2007.
Most computers accessing .xls files still have the Jet Driver installed.  However, some newer 64 bit machines that have Office 2007 or higher do not.  You may need to install the Jet Driver.  Here is a Microsoft article about verifying the files are installed and registered - http://support2.microsoft.com/default.aspx?scid=kb;en-us;239114.

First verify the .dll is on your system:
    1. Click Start, and then click Search.
    2. In the Search Results pane, click All files and folders under Search Companion.
    3. In the All or part of the file name box, type Msjet40.dll, and then click Search.
    4. In the list of files, right-click the Msjet40.dll file that is located in the Windows\System32 folder or in the Windows\System folder, and then click Properties.
    Note On 64-bit versions of Windows operating systems, the Msjet40.dll file is in the Windows\SysWOW64 folder.

You can also check the registry (Start >> Run >> Regedit) and look for HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Jet 4.0 on a 64 bit machine or HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Jet 4.0 on a 32 bit machine.

If this is not installed, please use the link above to install it.

Also, please verify that these are .xls and not .xlsx files as that changes the code.
Hello,

The msjet40.dll is already in my computer, at "C:\Windows\SysWOW64". And the file extension is "xls". What can be wrong?
Can you verify that it listed in the registry.  You may need to register the DLL and I can walk you through that if needed.
Hi, see if this connection string works for you:
            strConnString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & aFile & ";Extended Properties=""Excel 12.0;HDR=Yes;IMEX=1;"";"

Otherwise, you may need to install some components.  Try installing this first:
http://www.microsoft.com/download/en/details.aspx?id=13255
 
And if that doesn't work for you, try installing the following component:
http://www.microsoft.com/download/en/confirmation.aspx?id=23734

Regards,

Rob.
Hello experts,

Thank you for your comments, I will test it tomorrow.

Regards,
Hello Experts,

@Itibrands: could you please help me how to register the dll. Probably something that can block is that I have an office verion of 32 bit and msjet40.dll  placed on  Windows\SysWOW64

@Rob: I tried to install both components and still not working thank you anyway for the recommendations.
Bears code uses the JET engine,
Mine uses the JET engine. Did you try both after installing the components?
Sorry, fumble fingers today! Mine uses the ACE engine.
@LD16 - Lets try a few things in order.  Keep going down the list if it still does not work
1.  Change the line that sets the connection string (strConnString=) to
strConnString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & aFile & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1;"";"

Open in new window

.  Please post any error message you receive.

2.  This may differ depending on the version of windows you have installed (Please post that to assist us).  Since you want help registering the dll, I trust you verified that it was not in the registry.  To register the JET engine click Start >> Run Then put in these commands (Run one at a time).  You should get a message that the dll registered succesfully.  (Afterword, please verify the registry entry exists)
c:\Windows\SysWOW64\regsvr32.exe Msjet40.dll
c:\Windows\SysWOW64\regsvr32.exe msjetoledb40.dll
c:\Windows\SysWOW64\regsvr32.exe mswstr10.dll
c:\Windows\SysWOW64\regsvr32.exe msjter40.dll
c:\Windows\SysWOW64\regsvr32.exe msjtes40.dll
c:\Windows\SysWOW64\regsvr32.exe msjint40.dll

3.  Go into the installed program list (depending on settings it is found in Start >> Control Panell >> Programs & Features)
verify that the "Microsoft Office Access Database engine 2007" and "Microsoft Office Access Database engine 2010" are installed.

4.  Run my original code and post the error message you receive.

5.  Run it with Rob's change and post the error message you receive.


@Rob - my experience has been that the ACE engine does not work very well on the older Excel files (.xls).  That is why i used the Jet engine here.  Has your experience been different in that the ACE works well on both types?  (Just curious for my own benefit).
Crap!  Excuse my French but I just reliased why it didn't work!  I had used these connection strings in the past, and *knew* they worked, so when I was testing my code on my system, I was baffled as to why I was getting the same error message:
ADODB.Connection: Provider cannot be found. It may not be installed properly.


Sorry ltlbearand3, I can't remember exactly which file types I've used with which engines, but I *think* since Office 2007 I've always used the ACE engine, including for XLS.

Anyhow, the problem is because you're on a 64 bit system, with Office 2007 (and data access components) installed as 32 bit.

So, when executing the script, the default script engine is the 64 bit one, being executed from C:\Windows\System32\cscript.exe (or wscript.exe)
It therefore does not have access to the components in the 32 bit space.

So the fix is to explicitly run either
C:\Windows\SysWOW64\cscript.exe C:\Scripts\ConvertFile.vbs
or
C:\Windows\SysWOW64\wscript.exe C:\Scripts\ConvertFile.vbs

Try that out, with various connection strings, and see how you go.

Regards,

Rob.
Hello experts,

I will follow all the recommendations tomorrow.
Thank you for your help!
AARGH - it is sometimes the little things that trip us up.  Your right Rob - I was running my tests in Notepad ++ which I have pointing to the 32bit driver and did not test outside of it.  Sure enough - would not run.  Need to use the 32 bit scripting engine.

LD16 - try using that as your first test.
Hello,

Finally We got it, I was able to run the script with the following recommendation
C:\Windows\SysWOW64\wscript.exe C:\Scripts\ConvertFile.vbs

Now I additional problems:

1-I am not able to convert the file that I want because. This xls file is automatically generated from HTML and every time when I open the file I have the corrupt file message display by Excel (this is not related to the script generation) The file is like that.
User generated image
2-I test the script after saving the file and get ride of the corrupt message and it works perfectly. The thing is how I can make run the script even with the corrupt message?

3-The delimiter should be ";" and not "," Changing the strLine = strLine & rsData.Fields(i).Value & ";" is it enough?

Thank you again for your help.
Hmmm, so with problem 1, is the file extension XLS, or is it HTML?  Does the script actually give you an error?

For point 2, if you have Excel installed, we could script it to open the file and resave it, then continue with the script, but that would require using Excel.Application, which you didn't want to do.

For point 3, yes, that would be enough.

Rob.
1-The file is in a xls format however is extracted from a java application.
Here is the message that I got:
User generated image2-Could you please  provide the code that allows you to make the save with a Excel.Application?
The script works perfectly for the rest of pure xls files so I will apply it and in case of exception I use the Excel.Application save.
3-Is there a way to add a loop to your current version of the code, such as if error "External Table is not in the expected format, make a save and then continu with the conversion?

Thank you again and sorry about that.
1) Can I possibly get sample file that exhibits this behaviour to test with?  There may be some different switches required depending on how it needs to be opened.

2) Yes, we can do that, and

3) Yes, we can catch the line 30 error and only use the "Save As" when required.

Regards,

Rob.
I have seen this problem before, but forget how I got around it.  If you can post the sample file that will help (If I remember correctly the way some web developers create the excel file makes it not accessible via ADO.  I think the data is all in HTML and we have to read it using those objects).

For option #2, this will get you back to the problems that you had originally - it will error through the task scheduler.  Therefore, we have to find another solution.  I will try and run some tests tonight to see if we can tackle getting the Excel Object to work with Task Scheduler.  If we can make that work, we can go back to your original code and skip ADO.
As far as I have ever been able to figure out, the Excel.Application COM object must have an interactive session to work, so the user running the task must be logged on (can be locked though).

I didn't realise this was to be scheduled, so yes, that would cancel out being able to use Excel.Application.

Once we see a sample file, we'll work something out.

Rob.
Sorry I thought I have attached the file yesterday.

Please find attached the file.
test.xls
Hi, I'm not sure that's a very good sample.  Since we cannot use the Excel.Application object, we won't be able to simply force it to open, then resave it.  We will need to parse the HTML text.  However, from that sample, the only text that is in it is the following:
<html 
xmlns:o="urn:schemas-microsoft-com:office:office" 
xmlns:x="urn:schemas-microsoft-com:office:excel" 
xmlns="http://www.w3.org/TR/REC-html40">
   <HEAD>
      <meta http-equiv=Content-Type content="text/html; charset=UTF-8">
      <STYLE TYPE="text/css">
      <!--
 .st0 { background-color: #FFFFFF; color: #000000; font-family: Arial,SansSerif; font-size: small;}
 .st1 { background-color: #FFFFFF; color: #000000; text-align: left; font-family: Arial,SansSerif; font-size: small;}
      -->
      </STYLE>
   </HEAD>
   <BODY>
      <TABLE BORDER=0 CELLSPACING=0>
         <TR>
            <TD class="st0">&nbsp;</TD>
            <TD class="st1">&lt;Choose a report from the Current Report dropdown&gt;</TD>
         </TR>
      </TABLE>
   </BODY>
</HTML>

Open in new window


So there is no data in it.

If you could provide something that actually has some data so we can see the structure, we can see what we can do.

Rob.
Second example:

Please let me know if you need additional info.
test2.xls
OK, that gives me more detail.  So in such a file (if you change the extension to HTML you can view the data), are you only interested in the header row and data?  Or do you need the Title, Date, and Accounting System fields?

If you're only interested in the table of data (which would be easier), is the data in each report a consistent layout like this?

Rob.
To give you an idea, the code I have just knocked up to convert that table to a CSV is as follows.  Run this with that test and you will be able to open the output with Excel.

Regards,

Rob.

strInputFile = "C:\Temp\Scripts\Test2.xls"
strOutputFile = "C:\Temp\Scripts\Test2.csv"
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(strInputFile, 1, False)
strContents = Replace(objFile.ReadAll, vbCrLf, "")
objFile.Close
Set objOutput = objFSO.CreateTextFile(strOutputFile, True)
intRowPos = InStr(1, strContents, "<TR>", vbTextCompare)
While intRowPos > 0
	strContents = Mid(strContents, intRowPos)
	strTableRow = Left(strContents, InStr(1, strContents, "</TR>") + 4)
	If InStr(1, strTableRow, "COLSPAN", vbTextCompare) = 0 Then
		strLine = ""
		intCellPos = InStr(1, strTableRow, "<TD", vbTextCompare)
		While intCellPos > 0
			strTableRow = Mid(strTableRow, intCellPos)
			strTableRow = Mid(strTableRow, InStr(1, strTableRow, ">", vbTextCompare) + 1)
			strValue = Left(strTableRow, InStr(1, strTableRow, "<", vbTextCompare) - 1)
			If strLine = "" Then
				strLine = """" & strValue & """"
			Else
				strLine = strLine & ",""" & strValue & """"
			End If
			intCellPos = InStr(1, strTableRow, "<TD", vbTextCompare)
		Wend
		objOutput.WriteLine strLine
	End If
	strContents = Mid(strContents, Len(strTableRow))
	intRowPos = InStr(1, strContents, "<TR>", vbTextCompare)
Wend
objOutput.Close

Open in new window

Hello Rob,
I need all the report including headers from this report  however the conversion concerns exclusively this report.
SOLUTION
Avatar of RobSampson
RobSampson
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hello Rob,

I tested you first code and it works however it doesn't have the

WorkingDir = ""
Extension = "xls"

Concerning your second code I have an error message in line 29 char 18 "syntax error". Could you please help me to debug your second code?

Thank you  for your help
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial