# Import of CSV file into SQL using OPENROWSET

I am using the following to import CSV files into a SQL table...

select * from OpenRowset('MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=C:\CSVDIR\; ','select * from sample.csv')

These CSV files are generated by a program and some are created but empty.

My problem:
The query above fails when I hit an empty CSV file, either with or without column.

I also want to count the number of records

select count(col1) from openrowset...

Can the geniuses help me with a solution?  I prefer some error handling.
Microsoft SQL ServerMicrosoft SQL Server 2005SQL

Last Comment
VicZ

8/22/2022 - Mon
EugeneZ

can you get data from csv file with data and columns?
VicZ

Use excel, load it with data of different values (number, dates, char, etc.) save it as a text delimeted CSV and use my query above to pull it into SQL. If the table (CSV) has ZERO data it will give an error...

OLE DB provider "MSDASQL" for linked server "(null)" returned message "[Microsoft][ODBC Text Driver] Query must have at least one destination field.".

I need a work around, even some error handling:

if table = nul OR count(col1) = 0
then next file, ignore (whatever)

I AM RAISNIG THE POINTS ON THIS ONE
Mark Wills

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
VicZ

Mark, first option looks good but I have applied something similar, the only problem, is that I am trying to stay away from #temp tables and linked servers. I am processing thousands of CSV files everyday, #temp tables and linked servers looped in the thousands bogs my system down.

I used variable tables on a similar solution using DBF (Fox) files and it works fine because the variables reside in memory and drop after every loop. The CSV driver for SQL is failing me when there is an empty value (table, row or column), I run a check script that points out the empties, but it takes me time to fix these everyday.

I am looking at your second option, I like the command shell, let me get back to you.
Mark Wills

Oh, once the linked server is created you dont have to create it every time, it will be able to access all the csv's, so you could loop through each csv and put each one into the temp table (just add table name). But know what you mean about temp tables. Let me know if you want some looping type logic in there, just did the one file specifically as an exple for the openrowset select.

VicZ

Option 2 is working well with the file size.

What other attributes can you throw there?

VicZ

Option 3 is also good to catch a yey or ney. Good work!

I want to give you the points. Just throw some more attributes for that command shell option,

I have a loop. If I need other help on that I will open another question.
Mark Wills

What other attributes do you want ?

at the DOS prompt do a help for

You will notice that I use a "" to delimit the different attibutes. adding   %~ti  for date/time is a popular one as well...

    %~I         - expands %I removing any surrounding quotes (")
%~fI        - expands %I to a fully qualified path name
%~dI        - expands %I to a drive letter only
%~pI        - expands %I to a path only
%~nI        - expands %I to a file name only
%~xI        - expands %I to a file extension only
%~sI        - expanded path contains short names only
%~aI        - expands %I to file attributes of file
%~tI        - expands %I to date/time of file
%~zI        - expands %I to size of file
%~$PATH:I - searches the directories listed in the PATH environment variable and expands %I to the fully qualified name of the first one found. If the environment variable name is not defined or the file is not found by the search, then this modifier expands to the empty string The modifiers can be combined to get compound results: %~dpI - expands %I to a drive letter and path only %~nxI - expands %I to a file name and extension only %~fsI - expands %I to a full path name with short names only %~dp$PATH:I - searches the directories listed in the PATH
environment variable for %I and expands to the
drive letter and path of the first one found.
%~ftzaI     - expands %I to a DIR like output line

VicZ

AWESOME. You earned it mark. Thank you fro your help.

(Now I just have to figure out how to list subdirectories on that cmdshell script)
VicZ

All three options work great, they just have different levels of approach.
VicZ

Hey Mark I am having a problem using option 3 with an IF condition, can you help me with an example?

Mark Wills

Well, the try kinda replaces the IF, so not sure what you want to do with the if - can you describe it a bit more ?
VicZ

I want to cycle thru all my CSV files and list the name with row number total or zeros...

set @count = 0
set @filename char
while count < than 100
begin try
exec('select @filename as filename, count(col1) as totalrows from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=C:\EE\; '',''select * from any.csv'')')
end try
begin catch
select @filename + ' = No rows'
end catch

set @count = @count +1
end

...or something like that, I need to display filename, number of rows or nulls (empty csv, not even with columns)

Mark Wills

OK, so how about combining parts of methods 2 and 3 (and note how it is now recursing subdirectores with the /R qualifier)

declare @tmp table (id int identity, results varchar(255),filename varchar(255),filesize varchar(100), processed datetime)
insert @tmp (results)
exec master..xp_cmdshell 'for /R "c:\ee" %i in (*.csv) do @echo %~fi""%~zi""%~ti'

update @tmp set filename = substring(results,1,charindex('""',results)-1), filesize = substring(results,charindex('""',results)+2,10)

declare @filename varchar(255)
set @filename = (select top 1 filename from @tmp where processed is NULL  order by id)

while isnull(@filename ,'') > ' .'
begin
update @tmp set processed = getdate() where filename = @filename
begin try
exec('select '''+@filename+''' as filename,count(*) as rowcounter from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=C:\ee; '',''select * from '+@filename+''')')
end try
begin catch
select @filename as filename, 0 as rowcounter
end catch
set @filename = isnull((select top 1 filename from @tmp where processed is NULL order by id),NULL)
end
GO

Mark Wills

Sorry about that, was playing with the filedate so the above is not quite right...

Also, thought it might be best to capture the reasons why something failed the "try" block, so... have a look below :

(and please ignore the code highlighting - the colours dont always make sense for SQL)...


declare @tmp table (id int identity, results varchar(255),filename varchar(255),filesize int, processed datetime)
insert @tmp (results)
exec master..xp_cmdshell 'for /R "c:\ee" %i in (*.csv) do @echo %~fi""%~zi'

update @tmp set filename = substring(results,1,charindex('""',results)-1), filesize = substring(results,charindex('""',results)+2,10)

declare @filename varchar(255)
declare @tmp_cnt table (id int identity, filename varchar(255), rowcounter bigint, errors nvarchar(2048))

set @filename = (select top 1 filename from @tmp where processed is NULL and filesize > 0 order by id)

while isnull(@filename ,'') > ' .'
begin
update @tmp set processed = getdate() where filename = @filename
begin try
insert @tmp_cnt (filename , rowcounter)
exec('select '''+@filename+''' as filename,count(*) as rowcounter from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=C:\ee; '',''select * from '+@filename+''')')
end try
begin catch
insert @tmp_cnt (filename, errors) values (@filename, error_message())
end catch
set @filename = isnull((select top 1 filename from @tmp where processed is NULL and filesize > 0 order by id),NULL)
end

select t.filename, t.filesize, t.processed, c.rowcounter, isnull(c.errors,'') as errors
from @tmp t
left outer join @tmp_cnt c on t.filename = c.filename

go

VicZ

Very cool solution, I'd like to throw you some more points, but I do not know how no that the solution is closed.

Anyway, believe or not, I am putting all this code in another loop to cycle thru different directories that I already captured in a table. so the "c:\ee" becomes the main directory and other subdirectories have the csv's.

Thank you Mark.
Mark Wills

If you fell so compelled, you could use the "Request Attention" button and a Moderator can help you out.

But, there is really no need, enjoyed helping you. Your thanks and knowing it works for you is reward enough :)
VicZ