Solved

Data missing when reading CSV using ADO.NET

Posted on 2008-10-10
10
1,353 Views
Last Modified: 2013-11-07
Hi Experts,

Below is my function that returns a data table containing the contents of a CSV file. As this CSV file isnt actually record based the field types are mixed. I have used IMEX=1 but data is still missing.

Can anyone tell me off the top of their head what I am doing wrong?

Thanks

Tom
Private Function GetDataADO(ByVal Path As String) As DataTable
	Dim Filename As String = IO.Path.GetFileName(Path)
	Dim FilePath As String = IO.Path.GetDirectoryName(Path)
	Dim ConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties= ""Text;HDR=No;IMEX=1;"" ; Data Source=" & FilePath
	Dim Connection As New OleDb.OleDbConnection(ConnectionString)
	Connection.Open()
	Dim Command As New OleDb.OleDbCommand
	Command.CommandType = CommandType.Text
	Command.Connection = Connection
	Command.CommandText = "SELECT * FROM  [" & Filename & "]"
	Dim DataTable As New DataTable
	Dim DataAdapter As New OleDb.OleDbDataAdapter(Command)
	DataAdapter.Fill(DataTable)
	Connection.Close()
	Connection.Dispose()
	Return DataTable
End Function

Open in new window

0
Comment
Question by:TommyTwoPints
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 3
  • 2
10 Comments
 
LVL 1

Expert Comment

by:AWestEng
ID: 22694508
I'm not sure why your code dosen't work but here is another example how to read a CSV file, it maybe help you.
Public Function ReadCSVFile(ByVal strDataSource As String, _
           ByVal delimiter As String, _
           ByVal IncludeFieldName As Boolean, _
           Optional ByVal StartAtLine As Integer = 0, _
           Optional ByVal StopAtLine As Integer = 0) _
           As DataTable
 
    '*************************************************************************************
    ' strDataSource     : Filepath to the file thats going to be read
    '                   : Value: Sting
    ' StartAtLine       : The row that the reader is going to start at in the file
    '                   : Value:  1 or 2
    ' ShowData          : If a dataGrid exist can the file  be display in this 
    '                   : Value:  True or False
    ' delimiter         : The delimiter for the file reader
    '                   : Value: ";" or ","
    ' IncludeFieldName  : The first row are fieldnames include this or nor
    '                   : Value: True or False
    ' StopAtLine        : The row that the reader is going to end at if StopAtLine <> 0
    '*************************************************************************************
 
    Dim intColCntr As Integer 'Counter for number of columns in the text file. 
    Dim intRowCntr As Integer 'Counter row to be added. 
 
    'stopatline is 0 then read all rows 
    If StopAtLine = 0 Then
        StopAtLine = Int32.MaxValue
    End If
 
    Dim arrSQLCols() As String
    Dim myDataTable As New DataTable
    Dim myDataCol As New DataColumn
    Dim myDataRow As DataRow
    Dim strLine As String
    Dim sr As New StreamReader(strDataSource)
    Dim loopctr As Integer = 0
 
    Dim fi As New FileInfo(strDataSource)
    Dim filelength As Long = fi.Length
    Dim bytesread As Long = 0
 
    Try
        While sr.Peek <> -1
            strLine = sr.ReadLine
 
            'Search for qualifier character. 
            If strLine.Substring(0, 1) = """" Or strLine.Substring(0, 1) = "'" Then
                strLine = Strings.Replace(strLine, strLine.Substring(0, 1), "", 1, -1)
            End If
 
            Dim arrColName() As String = strLine.Split(delimiter)
            If loopctr = 0 Then
                'Check to include first row as field name. 
                If IncludeFieldName = True Then
                    'Add columns to the table. 
                    For intColCntr = 0 To arrColName.Length - 1
                        myDataCol = New DataColumn(arrColName(intColCntr))
                        myDataCol.AutoIncrement = True
                        myDataCol.DataType = Type.GetType("System.String")
                        myDataTable.Columns.Add(myDataCol)
                        arrSQLCols = arrColName
                    Next
                Else
                    For intColCntr = 0 To arrColName.Length - 1
                        myDataCol = New DataColumn("Field" & " " & intColCntr)
                        myDataCol.AutoIncrement = True
                        myDataCol.DataType = Type.GetType("System.String")
                        myDataTable.Columns.Add(myDataCol)
                    Next
                End If
            Else
                If loopctr >= StartAtLine AndAlso loopctr <= StopAtLine Then
                    'Add rows to the table. 
                    Dim arrRow() As String = strLine.Split(delimiter)
                    myDataRow = myDataTable.NewRow
                    For intRowCntr = 0 To arrColName.Length - 1
                        myDataRow(intRowCntr) = arrRow(intRowCntr)
                    Next
                    myDataTable.Rows.Add(myDataRow)
                End If
            End If
            loopctr += 1
        End While
        sr.Close()
 
 
        Return myDataTable
 
    Catch ex As Exception
        Throw
    Finally
        myDataTable = Nothing
        myDataCol = Nothing
        System.GC.Collect()
    End Try
End Function

Open in new window

0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22694518
or maybe this
Public Function ReadCSVFileToDataTable(ByVal FilePath As String, ByVal ColumnNames() As String) As DataTable
    Dim myDataTable As New DataTable
    Dim r As New RegularExpressions.Regex(ControlChars.CrLf)
    Dim myDataCol As DataColumn
 
    If String.IsNullOrEmpty(FilePath) Then Return Nothing
 
    'Create the columns
    For i As Integer = 0 To ColumnNames.Length - 1
        myDataCol = New DataColumn(ColumnNames(i), GetType(String))
        myDataTable.Columns.Add(myDataCol)
    Next
 
    Dim lines() As String
 
    Try
        '// Create a new reader
        Dim sr As New StreamReader(FilePath, Encoding.Default, True)
 
        '// Read file
        Dim data As String = sr.ReadToEnd
 
        '// Close file
        sr.Close()
 
        '// Split up the data
        lines = r.Split(data)
 
    Catch ex As Exception
        Throw
    End Try
 
    Dim values() As String
    Dim row As DataRow
 
    For Each line As String In lines
 
        values = line.Split(";"c)
        row = myDataTable.NewRow
 
        Try
            If Not line = String.Empty Then
                For i As Integer = 0 To myDataTable.Columns.Count - 1
                    row(i) = values(i).ToString
                Next
                myDataTable.Rows.Add(row)
            End If
        Catch ex As Exception
            Throw
        End Try
    Next
 
    Return myDataTable
End Function

Open in new window

0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22695675
Here you got a function for ADO to that I found
Private Sub cmdLoad_Click()
Dim Cn1 As ADODB.Connection
Dim Rs1 As ADODB.Recordset
Dim iSQLStr As String
Dim field_num As Integer
 
    Set Cn1 = New ADODB.Connection
    Cn1.ConnectionString = _
        "Driver={Microsoft Text Driver (*.txt; *.csv)};" & _
        "DefaultDir=" & txtDir.Text
    Cn1.Open
 
    lstResults.Visible = False
    DoEvents
 
    iSQLStr = "Select * FROM " & txtFile.Text & _
        " ORDER BY " & txtField.Text
    field_num = CInt(txtField.Text) - 1
    Set Rs1 = Cn1.Execute(iSQLStr)
    lstResults.Clear
    While Not Rs1.EOF
        If IsNull(Rs1.Fields(field_num).Value) Then
            lstResults.AddItem "<null>"
        Else
            lstResults.AddItem Rs1.Fields(field_num).Value
        End If
        Rs1.MoveNext
    Wend
 
    lstResults.Visible = True
End Sub

Open in new window

0
Raise the IQ of Your IT Alerts

From IT major incidents to manufacturing line slowdowns, every business process generates insights that need to reach the people required to take action. You need a platform that integrates with your business tools to create fully enabled DevOps toolchains.

You need xMatters.

 
LVL 23

Expert Comment

by:Tony McCreath
ID: 22696310
I have these notes related to this problem. Doing the registry change may fix things for you...

The problem is that the driver uses the first 8 rows to determine the type of a column.

Use the OLE/JET provided instead of the ODBC driver (even though both have the same problem)
 
Make sure the connection string has///
Extended Properties="Excel 8.0;IMEX=1;"

The IMEX=1 is the important bit for this bug fix. It makes it default to text over numbers

Then, alter the registry:

 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.5\Engines\Excel
or on my machine it was
 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel

set TypeGuessRows to 0 . this makes it check all rows for a type.

Note: setting these values in the connection string does not work so they have to be set in this way.

0
 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22700451
AWestEng: The first 2 examples read text files from a stream; I already have a system like this but it uses REGEX. Performance is an issue on large files when done this way so I thought I would try using a DB connection. Your third example seems to use Microsoft® ActiveX® Data Objects. I think this is a com reference so i am reluctant to use it but I will give it a try.
Tiggerito: I dont want my application to change a users registry at runtime as this can only be done if the user account has rights to do so. Also when you try to use 'Excel 8.0' instead of 'Text' when reading a CSV an error occurs (External table is not in the expected format.).
 Thankyou both for your suggestions
0
 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22700558
AWestEng: The third sample you provided is using an ODBC connection that I have already tested in ADO.NET; It also doesnt read all the data correctly.
Maybe I'll just stick with my Regex method and sacrafice performance for accuracy.
0
 
LVL 23

Expert Comment

by:Tony McCreath
ID: 22700628
I found this article an excelent source on writing a CSV parser.

http://www.boyet.com/Articles/CsvParser.html

With a little work you can create a consumer that parses directly into a DataTable.

This way would probably be the fastest and most reliable solution.
0
 
LVL 1

Accepted Solution

by:
AWestEng earned 500 total points
ID: 22700956
Here you have a really fast CSV reader.
http://www.codeproject.com/KB/database/CsvReader.aspx
If you have problmes with the C# code just let me know .
C# to VB.net translater
http://www.developerfusion.com/tools/convert/csharp-to-vb/
 
0
 
LVL 3

Author Comment

by:TommyTwoPints
ID: 22739164
Thankyou both for your help. I have decided to go with AWestEng's solution.
0
 
LVL 1

Expert Comment

by:AWestEng
ID: 22739272
Thx for the points :)
0

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Today I had a very interesting conundrum that had to get solved quickly. Needless to say, it wasn't resolved quickly because when we needed it we were very rushed, but as soon as the conference call was over and I took a step back I saw the correct …
Calculating holidays and working days is a function that is often needed yet it is not one found within the Framework. This article presents one approach to building a working-day calculator for use in .NET.
This video Micro Tutorial shows how to password-protect PDF files with free software. Many software products can do this, such as Adobe Acrobat (but not Adobe Reader), Nuance PaperPort, and Nuance Power PDF, but they are not free products. This vide…
If you're a developer or IT admin, you’re probably tasked with managing multiple websites, servers, applications, and levels of security on a daily basis. While this can be extremely time consuming, it can also be frustrating when systems aren't wor…

691 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question