Dale Fye
asked on
Export Excel worksheet as CSV with pipe "|" character as delimiter
I cannot find the previous question I asked about this. Someone provided a process by which I could change the delimiter for Excel "CSV" files from a comma to the pipe character. I need this because some of the text fields contain embedded commas, and I don't want to change those, but I need the data in a CSV format for uploading with the SQL Server Bulk Insert command.
At any rate, when I open the Excel spreadsheet manually and save the file as a CSV (also manually), I get the pipe delimiter. But if I open the spreadsheet via Access automation and save the worksheet:
osht.SaveAs Replace(oWbk.FullName, ".xlsx", ".csv"), xlCSV
I get a comma separated file instead of pipe separated. Can anyone provide any recommendations which will allow me to automate saving the selected sheet of the XL file as a pipe delimited file?
Dale
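One possibility worth testing (a sketch, not verified against this setup): the SaveAs method has an optional Local argument. When it is omitted, Excel saves against US (en-US) conventions, which may be why the automated save ignores the regional List Separator while the manual save honors it. Note also that when driving Excel from Access with late binding, the xlCSV constant is not defined and must be declared explicitly:

```vba
' Sketch, assuming the Windows List Separator has already been set to "|".
' With late binding from Access, xlCSV is undefined, so declare it:
Const xlCSV As Long = 6

' Local:=True tells Excel to save using the regional (Control Panel)
' settings rather than the default US conventions.
osht.SaveAs Replace(oWbk.FullName, ".xlsx", ".csv"), _
            FileFormat:=xlCSV, Local:=True
```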
Does the process automatically add quote marks around any field that contains a comma character?
Dale
What settings/options/steps are you using when you manually save the file from Excel to end up with a pipe-delimited file?
Have you tried recording a macro when you follow that process?
I get a comma separated file instead of pipe separated. Can anyone provide any recommendations which will allow me to automate saving the selected sheet of the XL file as a pipe delimited file?
We can probably change the system setting for List Separator under Regional settings.
See if that works after making the change?
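A quick way to confirm from code which separator Excel currently sees (assuming Excel is respecting the Regional settings):

```vba
' Prints the List Separator Excel will use, e.g. "," or "|"
Debug.Print Application.International(xlListSeparator)
```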
I think with a little tweak to this solution you should be able to insert a custom delimiter
When you export the table to Text, in the wizard select Advanced.
Change the delimiter to |
Then finish.
See attached example, please.
/gustav
FormsTable.txt
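On the Access side, those wizard steps can be automated once the export specification has been saved. The spec name below, "PipeExportSpec", is a placeholder; create your own spec via the Export Text Wizard (Advanced, set the delimiter to |, then Save As):

```vba
' Export a table to a pipe-delimited text file using a saved export
' specification. "PipeExportSpec", the table name, and the path are
' placeholders for this sketch.
DoCmd.TransferText acExportDelim, "PipeExportSpec", _
                   "FormsTable", "C:\Temp\FormsTable.txt", True
```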
ASKER CERTIFIED SOLUTION
ASKER
It was a recommendation similar to Ryan's: changing the List Separator in the Regional settings is what allows the export to work properly when I perform this task manually:
1. Open file in Excel
2. Click File => SaveAs
3. Select the CSV format from the save as dialog
4. Click OK.
But for some reason, when I open the file via VBA automation and perform a sheet.SaveAs operation, it does not use that List Separator setting (the pipe).
@Aikimark, when performing a true CSV export (with comma delimiters), it does. But this does not help, as the SQL Server Bulk Insert process does not recognize these quotes and still separates the fields based on the embedded commas. That is why I want to use the pipe as the separator, because there are no fields in the data that contain a pipe character.
@John, Yes, I could use that technique, but the files that this procedure will act upon may contain as many as 250,000 records, and I was hoping that I could get the sheet.SaveAs method to work properly with this delimiter.
@Gustav, the manual process works now, it is the automated process which is not working properly.
Personally, I wouldn't tamper with system settings to get the desired functionality. You can always forget something, and then everything just stops working. If the FSO approach crashes with many records, that's a different case; I will try to check.
My poor old G620 had a hard time handling 260,000 records, but it pulled it off just fine. So FSO works for 250K+ records.
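For reference, the FSO approach described here might look roughly like this (a sketch; the output path is a placeholder, and field values are assumed not to contain pipes, per the earlier comment):

```vba
' Write the used range of a worksheet to a pipe-delimited text file
' using the FileSystemObject (late bound). Assumes no "|" in the data.
Dim fso As Object, ts As Object
Dim r As Long, c As Long
Dim vals As Variant, parts() As String

Set fso = CreateObject("Scripting.FileSystemObject")
Set ts = fso.CreateTextFile("C:\Temp\Export.txt", True)
vals = osht.UsedRange.Value       ' one read is far faster than cell-by-cell

For r = 1 To UBound(vals, 1)
    ReDim parts(1 To UBound(vals, 2))
    For c = 1 To UBound(vals, 2)
        parts(c) = CStr(vals(r, c))
    Next c
    ts.WriteLine Join(parts, "|")
Next r
ts.Close
```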
ASKER
How long did that take, John? And how many columns did you use in your test?
20 columns, 259,400 rows. Time: a lot; it was not a quick operation. It would probably be faster if it was running from inside Excel.
To echo John's sentiment regarding the change of system settings - yes, it works, but it also replaces the delimiter when composing formulae, so instead of writing
IF(A2="Y","Hello","Bye")
you need this:
IF(A2="Y"|"Hello"|"Bye")
Feels wrong!
Which SQL Bulk Loader?
ASKER
@Sham, interesting. That is a big deal. But since the user will be uploading up to a dozen of these files at a time, long-running loops like John suggested will lock up their computer, potentially for hours. I need something that works better. Maybe there is a way to change that system setting briefly, while the code runs, and then set it back after it completes. More testing needed.
@aikimark, Bulk Insert
declare @SQL nvarchar(max)
set @SQL = 'Bulk Insert #AL FROM ''' + @FilePath + @Filename + ''' ' +
'with (FirstRow = 2, Datafiletype = ''char'', Fieldterminator = ''|'', Rowterminator = ''\n'')'
exec (@SQL)
where #AL is defined as a temp table with all of the fields defined as nvarchar( ). After importing into this temp table, the SP performs an Append query that handles data type translations between the nvarchar( ) fields in the temp table and the actual field types in the reporting database table.
Unfortunately, the Bulk Insert process of SQL Server identifies the fieldterminator (whether embedded inside quotes or not) as a field terminator, so it would break:
a, "abc,def", 1
into 4 columns
a, abc, def, 1
when performing the bulk insert.
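The idea mentioned above of changing the system setting briefly while the code runs could be sketched with the Windows SetLocaleInfo API. Treat this as untested: Excel does not always re-read the setting until it is restarted or notified of the change, and the declarations need PtrSafe under 64-bit Office.

```vba
' Temporarily set the Windows List Separator to "|", save the sheet,
' then restore the original value. LOCALE_SLIST is the list-separator
' setting; constants per the Win32 documentation.
Private Declare Function SetLocaleInfo Lib "kernel32" Alias "SetLocaleInfoA" _
    (ByVal Locale As Long, ByVal LCType As Long, ByVal lpLCData As String) As Long
Private Declare Function GetLocaleInfo Lib "kernel32" Alias "GetLocaleInfoA" _
    (ByVal Locale As Long, ByVal LCType As Long, _
     ByVal lpLCData As String, ByVal cchData As Long) As Long

Private Const LOCALE_USER_DEFAULT As Long = &H400
Private Const LOCALE_SLIST As Long = &HC

Public Sub SaveWithPipeSeparator(osht As Object, sPath As String)
    Dim sOld As String * 10, n As Long
    n = GetLocaleInfo(LOCALE_USER_DEFAULT, LOCALE_SLIST, sOld, Len(sOld))
    SetLocaleInfo LOCALE_USER_DEFAULT, LOCALE_SLIST, "|"
    On Error Resume Next                      ' make sure we always restore
    osht.SaveAs sPath, 6, Local:=True         ' 6 = xlCSV
    SetLocaleInfo LOCALE_USER_DEFAULT, LOCALE_SLIST, Left$(sOld, n - 1)
End Sub
```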
If speed is the concern, then you need to code the insert directly from Excel, avoiding the in-between step.
If you use the CSV file exported from Excel, what happens if you perform the bulk import and add the following parameters:
FORMAT = 'CSV'
FIELDQUOTE = '"'
ASKER
aikimark,
Have not had a chance to play with that yet. Will test sometime over the weekend.
SOLUTION
ASKER
aikimark,
re: your comment
SQL Server 2008 R2 will not accept those settings in the Bulk Insert command
FORMAT = 'CSV'
FIELDQUOTE = '"'
I'm going to have to go back to the drawing board, and see whether the client can get the original file, exported from their accounting software in a CSV file, and determine whether there is a way for them to strip commas from the fields during the export process. If not, I'm probably going to have to read the CSV files and parse out the commas embedded within text fields, and then remove the embedded quotes. Don't know how long this will take, but will give it a test later this week.
I also found that Excel was converting several text fields that look like numbers to numbers, and then reformatting them as scientific notation, which corrupted the data as well.
ASKER
Thanks for the suggestions, I ended up with a solution similar to each of these.
I opened the file as a text file using VBA, read each row of the text file as a string, parsed each string on the commas, and then reconstructed the string, with some logic that accounted for string segments that started or ended with a quote. For example,
"ABC,DEF"
would be broken into the array elements "ABC and DEF", so if an array element started with a quote, the code simply stripped the leading quote rather than appending a comma before it. Then, when the next element was read, the code would see that it ended with a quote, strip that trailing quote, and append the element to the previous one, followed by a comma.
But then I found some records where a field contained "ABC, DEF, GHI", so I had to add logic to account for multiple commas embedded within a single field and return ABC DEF GHI.
I'll be writing an article on this process shortly.
By replacing the embedded commas with spaces and then stripping double spaces, I was able to accomplish what I was looking for.
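The cleanup described above could be sketched like this (assuming quote characters appear only as field delimiters, never inside field text):

```vba
' Turn  a,"abc, def",1  into  a,abc def,1  by replacing commas inside
' quoted sections with spaces, dropping the quotes, and collapsing
' any resulting double spaces.
Public Function CleanCsvLine(ByVal sLine As String) As String
    Dim i As Long, ch As String, inQuotes As Boolean, sOut As String
    For i = 1 To Len(sLine)
        ch = Mid$(sLine, i, 1)
        Select Case ch
            Case """": inQuotes = Not inQuotes      ' drop the quote itself
            Case ",":  sOut = sOut & IIf(inQuotes, " ", ",")
            Case Else: sOut = sOut & ch
        End Select
    Next i
    Do While InStr(sOut, "  ") > 0                  ' collapse double spaces
        sOut = Replace(sOut, "  ", " ")
    Loop
    CleanCsvLine = sOut
End Function
```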