Solved

Data missing when reading CSV using ADO.NET

Posted on 2008-10-10
10
1,339 Views
Last Modified: 2013-11-07
Hi Experts,

Below is my function that returns a data table containing the contents of a CSV file. As this CSV file isnt actually record based the field types are mixed. I have used IMEX=1 but data is still missing.

Can anyone tell me off the top of their head what I am doing wrong?

Thanks

Tom
Private Function GetDataADO(ByVal Path As String) As DataTable

	Dim Filename As String = IO.Path.GetFileName(Path)

	Dim FilePath As String = IO.Path.GetDirectoryName(Path)

	Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties= ""Text;HDR=No;IMEX=1;"" ; Data Source=" & FilePath

	Dim Connection As New OleDb.OleDbConnection(ConnectionString)

	Connection.Open()

	Dim Command As New OleDb.OleDbCommand

	Command.CommandType = CommandType.Text

	Command.Connection = Connection

	Command.CommandText = "SELECT * FROM  [" & Filename & "]"

	Dim DataTable As New DataTable

	Dim DataAdapter As New OleDb.OleDbDataAdapter(Command)

	DataAdapter.Fill(DataTable)

	Connection.Close()

	Connection.Dispose()

	Return DataTable

End Function

Open in new window

0
Comment
Question by:TommyTwoPints
  • 5
  • 3
  • 2
10 Comments
 
LVL 1

Expert Comment

by:AWestEng
ID: 22694508
I'm not sure why your code dosen't work but here is another example how to read a CSV file, it maybe help you.
Public Function ReadCSVFile(ByVal strDataSource As String, _

           ByVal delimiter As String, _

           ByVal IncludeFieldName As Boolean, _

           Optional ByVal StartAtLine As Integer = 0, _

           Optional ByVal StopAtLine As Integer = 0) _

           As DataTable
 

    '*************************************************************************************

    ' strDataSource     : Filepath to the file thats going to be read

    '                   : Value: Sting

    ' StartAtLine       : The row that the reader is going to start at in the file

    '                   : Value:  1 or 2

    ' ShowData          : If a dataGrid exist can the file  be display in this 

    '                   : Value:  True or False

    ' delimiter         : The delimiter for the file reader

    '                   : Value: ";" or ","

    ' IncludeFieldName  : The first row are fieldnames include this or nor

    '                   : Value: True or False

    ' StopAtLine        : The row that the reader is going to end at if StopAtLine <> 0

    '*************************************************************************************
 

    Dim intColCntr As Integer 'Counter for number of columns in the text file. 

    Dim intRowCntr As Integer 'Counter row to be added. 
 

    'stopatline is 0 then read all rows 

    If StopAtLine = 0 Then

        StopAtLine = Int32.MaxValue

    End If
 

    Dim arrSQLCols() As String

    Dim myDataTable As New DataTable

    Dim myDataCol As New DataColumn

    Dim myDataRow As DataRow

    Dim strLine As String

    Dim sr As New StreamReader(strDataSource)

    Dim loopctr As Integer = 0
 

    Dim fi As New FileInfo(strDataSource)

    Dim filelength As Long = fi.Length

    Dim bytesread As Long = 0
 

    Try

        While sr.Peek <> -1

            strLine = sr.ReadLine
 

            'Search for qualifier character. 

            If strLine.Substring(0, 1) = """" Or strLine.Substring(0, 1) = "'" Then

                strLine = Strings.Replace(strLine, strLine.Substring(0, 1), "", 1, -1)

            End If
 

            Dim arrColName() As String = strLine.Split(delimiter)

            If loopctr = 0 Then

                'Check to include first row as field name. 

                If IncludeFieldName = True Then

                    'Add columns to the table. 

                    For intColCntr = 0 To arrColName.Length - 1

                        myDataCol = New DataColumn(arrColName(intColCntr))

                        myDataCol.AutoIncrement = True

                        myDataCol.DataType = Type.GetType("System.String")

                        myDataTable.Columns.Add(myDataCol)

                        arrSQLCols = arrColName

                    Next

                Else

                    For intColCntr = 0 To arrColName.Length - 1

                        myDataCol = New DataColumn("Field" & " " & intColCntr)

                        myDataCol.AutoIncrement = True

                        myDataCol.DataType = Type.GetType("System.String")

                        myDataTable.Columns.Add(myDataCol)

                    Next

                End If

            Else

                If loopctr >= StartAtLine AndAlso loopctr <= StopAtLine Then

                    'Add rows to the table. 

                    Dim arrRow() As String = strLine.Split(delimiter)

                    myDataRow = myDataTable.NewRow

                    For intRowCntr = 0 To arrColName.Length - 1

                        myDataRow(intRowCntr) = arrRow(intRowCntr)

                    Next

                    myDataTable.Rows.Add(myDataRow)

                End If

            End If

            loopctr += 1

        End While

        sr.Close()
 
 

        Return myDataTable
 

    Catch ex As Exception

        Throw

    Finally

        myDataTable = Nothing

        myDataCol = Nothing

        System.GC.Collect()

    End Try

End Function

Open in new window

0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22694518
or maybe this
Public Function ReadCSVFileToDataTable(ByVal FilePath As String, ByVal ColumnNames() As String) As DataTable

    Dim myDataTable As New DataTable

    Dim r As New RegularExpressions.Regex(ControlChars.CrLf)

    Dim myDataCol As DataColumn
 

    If String.IsNullOrEmpty(FilePath) Then Return Nothing
 

    'Create the columns

    For i As Integer = 0 To ColumnNames.Length - 1

        myDataCol = New DataColumn(ColumnNames(i), GetType(String))

        myDataTable.Columns.Add(myDataCol)

    Next
 

    Dim lines() As String
 

    Try

        '// Create a new reader

        Dim sr As New StreamReader(FilePath, Encoding.Default, True)
 

        '// Read file

        Dim data As String = sr.ReadToEnd
 

        '// Close file

        sr.Close()
 

        '// Split up the data

        lines = r.Split(data)
 

    Catch ex As Exception

        Throw

    End Try
 

    Dim values() As String

    Dim row As DataRow
 

    For Each line As String In lines
 

        values = line.Split(";"c)

        row = myDataTable.NewRow
 

        Try

            If Not line = String.Empty Then

                For i As Integer = 0 To myDataTable.Columns.Count - 1

                    row(i) = values(i).ToString

                Next

                myDataTable.Rows.Add(row)

            End If

        Catch ex As Exception

            Throw

        End Try

    Next
 

    Return myDataTable

End Function

Open in new window

0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22695675
Here you got a function for ADO to that I found
Private Sub cmdLoad_Click()

Dim Cn1 As ADODB.Connection

Dim Rs1 As ADODB.Recordset

Dim iSQLStr As String

Dim field_num As Integer
 

    Set Cn1 = New ADODB.Connection

    Cn1.ConnectionString = _

        "Driver={Microsoft Text Driver (*.txt; *.csv)};" & _

        "DefaultDir=" & txtDir.Text

    Cn1.Open
 

    lstResults.Visible = False

    DoEvents
 

    iSQLStr = "Select * FROM " & txtFile.Text & _

        " ORDER BY " & txtField.Text

    field_num = CInt(txtField.Text) - 1

    Set Rs1 = Cn1.Execute(iSQLStr)

    lstResults.Clear

    While Not Rs1.EOF

        If IsNull(Rs1.Fields(field_num).Value) Then

            lstResults.AddItem "<null>"

        Else

            lstResults.AddItem Rs1.Fields(field_num).Value

        End If

        Rs1.MoveNext

    Wend
 

    lstResults.Visible = True

End Sub

Open in new window

0
 
LVL 23

Expert Comment

by:Tiggerito
ID: 22696310
I have these notes related to this problem. Doing the registry change may fix things for you...

The problem is that the driver uses the first 8 rows to determine the type of a column.

Use the OLE/JET provided instead of the ODBC driver (even though both have the same problem)
 
Make sure the connection string has///
Extended Properties="Excel 8.0;IMEX=1;"

The IMEX=1 is the important bit for this bug fix. It makes it default to text over numbers

Then, alter the registry:

 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
or on my machine it was
 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel

set TypeGuessRows to 0 . this makes it check all rows for a type.

Note: setting these values in the connection string does not work so they have to be set in this way.

0
 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22700451
AWestEng: The first 2 examples read text files from a stream; I already have a system like this but it uses REGEX. Performance is an issue on large files when done this way so I thought I would try using a DB connection. Your third example seems to use Microsoft® ActiveX® Data Objects. I think this is a com reference so i am reluctant to use it but I will give it a try.
Tiggerito: I dont want my application to change a users registry at runtime as this can only be done if the user account has rights to do so. Also when you try to use 'Excel 8.0' instead of 'Text' when reading a CSV an error occurs (External table is not in the expected format.).
 Thankyou both for your suggestions
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22700558
AWestEng: The third sample you provided is using an ODBC connection that I have already tested in ADO.NET; It also doesnt read all the data correctly.
Maybe I'll just stick with my Regex method and sacrafice performance for accuracy.
0
 
LVL 23

Expert Comment

by:Tiggerito
ID: 22700628
I found this article an excelent source on writing a CSV parser.

http://www.boyet.com/Articles/CsvParser.html

With a little work you can create a consumer that parses directly into a DataTable.

This way would probably be the fastest and most reliable solution.
0
 
LVL 1

Accepted Solution

by:
AWestEng earned 500 total points
ID: 22700956
Here you have a really fast CSV reader.
http://www.codeproject.com/KB/database/CsvReader.aspx
If you have problmes with the C# code just let me know .
C# to VB.net translater
http://www.developerfusion.com/tools/convert/csharp-to-vb/
 
0
 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22739164
Thankyou both for your help. I have decided to go with AWestEng's solution.
0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22739272
Thx for the points :)
0

Featured Post

Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

Join & Write a Comment

INTRODUCTION: While tying your database objects into builds and your enterprise source control system takes a third-party product (like Visual Studio Database Edition or Red-Gate's SQL Source Control), you can achieve some protection using a sing…
If you need to start windows update installation remotely or as a scheduled task you will find this very helpful.
It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring. It works on many operating systems, in many languages.
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now