Solved

Read CSV file to array

Posted on 2004-03-29
10
88,801 Views
Last Modified: 2011-08-18
I know how to read a file.
But how to split the content? Beside being separated by comma, the actual data also contains comma. What should I do?
I need to store the contents into a multi-d array too.
0
Comment
Question by:yan_er04
10 Comments
 
LVL 48

Assisted Solution

by:Mikal613
Mikal613 earned 50 total points
ID: 10705438
Dim file_name As String
Dim fnum As Integer
Dim whole_file As String
Dim lines As Variant
Dim num_rows As Long
Dim the_array() As Variant
Dim R As Long
Dim C As Long

    file_name = App.Path
    If Right$(file_name, 1) <> "\" Then file_name = _
        file_name & "\"
    file_name = file_name & "test.csv"

    ' Load the file.
    fnum = FreeFile
    Open file_name For Input As fnum
    whole_file = Input$(LOF(fnum), #fnum)
    Close fnum

    ' Break the file into lines.
    lines = Split(whole_file, vbCrLf)

    ' Dimension the array.
    num_rows = UBound(lines)
    ReDim the_array(num_rows)

    ' Copy the data into the array.
    For R = 0 To num_rows
        the_array(R) = Split(lines(R), ",")
    Next R

    ' Prove we have the data loaded.
    For R = 0 To UBound(the_array)
        For C = 0 To UBound(the_array(R))
            Debug.Print the_array(R)(C) & "/";
        Next C
        Debug.Print
    Next R
    Debug.Print "===END==="
0
 
LVL 7

Expert Comment

by:dante469
ID: 10705632
The file is comma delimited and the data contains commas....

Comma delimitted is not a good option for you in that you will have to write a great deal of logic to determin true delimeters from data :(...

If possible could you delimit with different character...  Say a tilde ~ ...

That is the easiest answer and would simple use code similar to that provided by Mikal613 :)...

Good luck,
Dante
0
 
LVL 3

Expert Comment

by:gillit
ID: 10707418
// semi-pseudo-code to change your delimiter
bool inString = false;
for(int x=0;x<strFile.Length;x++)
  if(strFile[x]=='\"')  //Character that indicates that the comma is not a delimiter, normally double quote
    inString = !inString;
  else if(!inString && strFile[x]==',')
    strFile[x]='~'; //Some character that is not used in your file.

// Add code to split by new delimiter, in my example ~
0
 
LVL 12

Expert Comment

by:farsight
ID: 10710508
Give us a sample line that illustrates this.
> The file is comma delimited and the data contains commas....
Also illustrate any other special characters in the data.  What if it contains a single-quote?  A double-quote?   etc.

A simple regular expression will parse it easily (if it's parseable).

Also note that much of Mikal613's code for this example is VB6-style, instead of VB.NET style.
You may prefer to translate it to use the newer methods.
0
 
LVL 12

Expert Comment

by:farsight
ID: 10710669
(I haven't tried this one, but it sounds promising.)
Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
http://www.hotblue.com/article0000.aspx?a=0006

--- --- ---

from:  http://www.3leaf.com/resources/articles/regex.aspx#Examples  [in C#]

Get Values of a CSV (updated by Arnold Bailey)
,(?=(?:[^"]*"[^"]*")*(?![^"]*"))
In conjunction with this code:

Regex r = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string s = "\"a\",b,\"c, d, e\",,f";
string[] sAry = r.Split(s);
for(int i=0;i < sAry.Length;i++)
{
   Console.WriteLine(sAry[i]);
}

--- --- ---

from:  http://radio.weblogs.com/0117167/2003/02/18.html#a132  [in C#]

 CSV Regex
I haven't found a Regex that will extract just the values from a comma separated value file, but thanks to Joe Gulisano, I do have one that will find all the "real" commas:

,(?=([^"]*"[^"]*")*(?![^"]*"))
So getting the values is just:

Regex r = new Regex(",(?=([^\"]*"[^"]*")*(?![^"]*"))");string s = "\"a",b,"c, d, e",,f";int start = 0;foreach (Match m in r.Matches(s)){  Console.WriteLine(s.Substring(start, m.Index - start));  start = m.Index + 1;}Console.WriteLine(s.Substring(start, s.Length - start));
Outputs:

"a"b"c, d, e"
f
0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 

Author Comment

by:yan_er04
ID: 10711016
I can't change the format of the file.

A sample data will be
$100,"$2,350",a,15/04/1998

I am using VB.NET.

Is there no other way beside regular expression?

Well, I have tried regular expression before but... let's just say I am not an expert in it. =D

0
 

Author Comment

by:yan_er04
ID: 10711034
I notice all the examples are using 1-d array.
How to change that to 2-d?
I seems to have problems adding data to a 2-d array.
0
 
LVL 12

Accepted Solution

by:
farsight earned 150 total points
ID: 10713019
Here's some VB.NET based on one of those C# examples.
ParseCSV is a test driver.
ParseLine is the routine you'd actually want to use.

    Private Shared Sub ParseCSV()
        Dim oneLine As String = _
        """quoted"",nonquoted,""comma, """"inner-quoted-comma"""", chamelean"",,after-empty"

        Dim fieldValues As String() = ParseLine(oneLine)

        For i As Integer = 0 To fieldValues.Length - 1
            Console.WriteLine(i & "  [" & fieldValues(i) & "]")
        Next
    End Sub

    Private Shared Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated fields.

        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = _
                New System.Text.RegularExpressions.Regex(pattern)

        Return r.Split(oneLine)
    End Function

Correct output:
0  ["quoted"]
1  [nonquoted]
2  ["comma, ""inner-quoted-comma"", chamelean"]
3  []
4  [after-empty]

---

Here's some simple multidimensional array code.
http://www.bhupeshs.com/vbMultiarray.htm

Here's another post about multidimensional arrays -- already answered.
http://www.experts-exchange.com/Programming/Programming_Languages/Visual_Basic/Q_20744300.html

Here's some good stuff, especially near the bottom.
http://www.dotnet247.com/247reference/msgs/23/116904.aspx

Here's a sample program for two-dimensional arrays.
Near the bottom, there's a link to a very good PowerPoint presentation, too.  Chapter8.ppt
http://www.okc.cc.ok.us/aphilipp/WebCT/VB/09_VB_CourseContent.htm

0
 

Expert Comment

by:alex_developer
ID: 11603448
Odbc supports reading CSV:

string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection connection = new OdbcConnection(ConnectionString);
connection.Open();
OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection);
OdbcDataReader reader = command.ExecuteReader();
while (reader.Read())
{
      // loop through every field
      for (int x = 0; x < reader.FieldCount; x++)
      {
                // add code for puting fields into array....
            Console.WriteLine(reader.GetDataTypeName(x) + " - " + reader.GetValue(x));
      }
}

                  // Close the connection
                  connection.Close();
0
 
LVL 1

Expert Comment

by:bswiftly
ID: 13870822
         This code takes a comma delimited xml flie and puts it into a table. The regex is slightly more efficient I think..

            Dim regEx As Regex
            Dim cMatch As Text.RegularExpressions.MatchCollection
            Dim m As Text.RegularExpressions.Match

            cMatch = regEx.Matches(line, "(?<=(?:^|,\s*)"")(?:[^""]|"""")*|(?<=(?:^|,))(?:\s*)(?=(?:,|$))")

            For Each m In cMatch
               i += 1
               dc = New DataColumn
               dc.ColumnName = IIf(m.Value.Length = 0, colPrefix & i + 1, m.Value)
               dt.Columns.Add(dc)
            Next
0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

A basic question.. “What is the Garbage Collector?” The usual answer given back: “Garbage collector is a background thread run by the CLR for freeing up the memory space used by the objects which are no longer used by the program.” I wondered …
Welcome my friends to the second instalment and follow-up to our Minify and Concatenate Your Scripts and Stylesheets (http://www.experts-exchange.com/Programming/Languages/.NET/ASP.NET/A_4334-Minify-and-Concatenate-Your-Scripts-and-Stylesheets.html)…
This video gives you a great overview about bandwidth monitoring with SNMP and WMI with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're looking for how to monitor bandwidth using netflow or packet s…
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now