Solved

Read CSV file to array

Posted on 2004-03-29
10
88,991 Views
Last Modified: 2011-08-18
I know how to read a file.
But how to split the content? Beside being separated by comma, the actual data also contains comma. What should I do?
I need to store the contents into a multi-d array too.
0
Comment
Question by:yan_er04
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
10 Comments
 
LVL 48

Assisted Solution

by:Mikal613
Mikal613 earned 50 total points
ID: 10705438
Dim file_name As String
Dim fnum As Integer
Dim whole_file As String
Dim lines As Variant
Dim num_rows As Long
Dim the_array() As Variant
Dim R As Long
Dim C As Long

    file_name = App.Path
    If Right$(file_name, 1) <> "\" Then file_name = _
        file_name & "\"
    file_name = file_name & "test.csv"

    ' Load the file.
    fnum = FreeFile
    Open file_name For Input As fnum
    whole_file = Input$(LOF(fnum), #fnum)
    Close fnum

    ' Break the file into lines.
    lines = Split(whole_file, vbCrLf)

    ' Dimension the array.
    num_rows = UBound(lines)
    ReDim the_array(num_rows)

    ' Copy the data into the array.
    For R = 0 To num_rows
        the_array(R) = Split(lines(R), ",")
    Next R

    ' Prove we have the data loaded.
    For R = 0 To UBound(the_array)
        For C = 0 To UBound(the_array(R))
            Debug.Print the_array(R)(C) & "/";
        Next C
        Debug.Print
    Next R
    Debug.Print "===END==="
0
 
LVL 7

Expert Comment

by:dante469
ID: 10705632
The file is comma delimited and the data contains commas....

Comma delimitted is not a good option for you in that you will have to write a great deal of logic to determin true delimeters from data :(...

If possible could you delimit with different character...  Say a tilde ~ ...

That is the easiest answer and would simple use code similar to that provided by Mikal613 :)...

Good luck,
Dante
0
 
LVL 3

Expert Comment

by:gillit
ID: 10707418
// semi-pseudo-code to change your delimiter
bool inString = false;
for(int x=0;x<strFile.Length;x++)
  if(strFile[x]=='\"')  //Character that indicates that the comma is not a delimiter, normally double quote
    inString = !inString;
  else if(!inString && strFile[x]==',')
    strFile[x]='~'; //Some character that is not used in your file.

// Add code to split by new delimiter, in my example ~
0
How to Create Failover DNS Record Sets in Route 53

Route 53 has the ability to easily configure DNS record sets specifically for failover scenarios. These failover record sets can be configured to failover to full-blown deployments in other regions or to a static HTML page that informs your customers of the issue.

 
LVL 12

Expert Comment

by:farsight
ID: 10710508
Give us a sample line that illustrates this.
> The file is comma delimited and the data contains commas....
Also illustrate any other special characters in the data.  What if it contains a single-quote?  A double-quote?   etc.

A simple regular expression will parse it easily (if it's parseable).

Also note that much of Mikal613's code for this example is VB6-style, instead of VB.NET style.
You may prefer to translate it to use the newer methods.
0
 
LVL 12

Expert Comment

by:farsight
ID: 10710669
(I haven't tried this one, but it sounds promising.)
Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
http://www.hotblue.com/article0000.aspx?a=0006

--- --- ---

from:  http://www.3leaf.com/resources/articles/regex.aspx#Examples  [in C#]

Get Values of a CSV (updated by Arnold Bailey)
,(?=(?:[^"]*"[^"]*")*(?![^"]*"))
In conjunction with this code:

Regex r = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string s = "\"a\",b,\"c, d, e\",,f";
string[] sAry = r.Split(s);
for(int i=0;i < sAry.Length;i++)
{
   Console.WriteLine(sAry[i]);
}

--- --- ---

from:  http://radio.weblogs.com/0117167/2003/02/18.html#a132  [in C#]

 CSV Regex
I haven't found a Regex that will extract just the values from a comma separated value file, but thanks to Joe Gulisano, I do have one that will find all the "real" commas:

,(?=([^"]*"[^"]*")*(?![^"]*"))
So getting the values is just:

Regex r = new Regex(",(?=([^\"]*"[^"]*")*(?![^"]*"))");string s = "\"a",b,"c, d, e",,f";int start = 0;foreach (Match m in r.Matches(s)){  Console.WriteLine(s.Substring(start, m.Index - start));  start = m.Index + 1;}Console.WriteLine(s.Substring(start, s.Length - start));
Outputs:

"a"b"c, d, e"
f
0
 

Author Comment

by:yan_er04
ID: 10711016
I can't change the format of the file.

A sample data will be
$100,"$2,350",a,15/04/1998

I am using VB.NET.

Is there no other way beside regular expression?

Well, I have tried regular expression before but... let's just say I am not an expert in it. =D

0
 

Author Comment

by:yan_er04
ID: 10711034
I notice all the examples are using 1-d array.
How to change that to 2-d?
I seems to have problems adding data to a 2-d array.
0
 
LVL 12

Accepted Solution

by:
farsight earned 150 total points
ID: 10713019
Here's some VB.NET based on one of those C# examples.
ParseCSV is a test driver.
ParseLine is the routine you'd actually want to use.

    Private Shared Sub ParseCSV()
        Dim oneLine As String = _
        """quoted"",nonquoted,""comma, """"inner-quoted-comma"""", chamelean"",,after-empty"

        Dim fieldValues As String() = ParseLine(oneLine)

        For i As Integer = 0 To fieldValues.Length - 1
            Console.WriteLine(i & "  [" & fieldValues(i) & "]")
        Next
    End Sub

    Private Shared Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated fields.

        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = _
                New System.Text.RegularExpressions.Regex(pattern)

        Return r.Split(oneLine)
    End Function

Correct output:
0  ["quoted"]
1  [nonquoted]
2  ["comma, ""inner-quoted-comma"", chamelean"]
3  []
4  [after-empty]

---

Here's some simple multidimensional array code.
http://www.bhupeshs.com/vbMultiarray.htm

Here's another post about multidimensional arrays -- already answered.
http://www.experts-exchange.com/Programming/Programming_Languages/Visual_Basic/Q_20744300.html

Here's some good stuff, especially near the bottom.
http://www.dotnet247.com/247reference/msgs/23/116904.aspx

Here's a sample program for two-dimensional arrays.
Near the bottom, there's a link to a very good PowerPoint presentation, too.  Chapter8.ppt
http://www.okc.cc.ok.us/aphilipp/WebCT/VB/09_VB_CourseContent.htm

0
 

Expert Comment

by:alex_developer
ID: 11603448
Odbc supports reading CSV:

string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection connection = new OdbcConnection(ConnectionString);
connection.Open();
OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection);
OdbcDataReader reader = command.ExecuteReader();
while (reader.Read())
{
      // loop through every field
      for (int x = 0; x < reader.FieldCount; x++)
      {
                // add code for puting fields into array....
            Console.WriteLine(reader.GetDataTypeName(x) + " - " + reader.GetValue(x));
      }
}

                  // Close the connection
                  connection.Close();
0
 
LVL 1

Expert Comment

by:bswiftly
ID: 13870822
         This code takes a comma delimited xml flie and puts it into a table. The regex is slightly more efficient I think..

            Dim regEx As Regex
            Dim cMatch As Text.RegularExpressions.MatchCollection
            Dim m As Text.RegularExpressions.Match

            cMatch = regEx.Matches(line, "(?<=(?:^|,\s*)"")(?:[^""]|"""")*|(?<=(?:^|,))(?:\s*)(?=(?:,|$))")

            For Each m In cMatch
               i += 1
               dc = New DataColumn
               dc.ColumnName = IIf(m.Value.Length = 0, colPrefix & i + 1, m.Value)
               dt.Columns.Add(dc)
            Next
0

Featured Post

How to Create Failover DNS Record Sets in Route 53

Route 53 has the ability to easily configure DNS record sets specifically for failover scenarios. These failover record sets can be configured to failover to full-blown deployments in other regions or to a static HTML page that informs your customers of the issue.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Summary Displaying images in RichTextBox is a common requirement with limited solutions available. Pasting through clipboard or embedding into RTF content only support static images.  This article describes how to insert Windows control objects int…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
NetCrunch network monitor is a highly extensive platform for network monitoring and alert generation. In this video you'll see a live demo of NetCrunch with most notable features explained in a walk-through manner. You'll also get to know the philos…
Add bar graphs to Access queries using Unicode block characters. Graphs appear on every record in the color you want. Give life to numbers. Hopes this gives you ideas on visualizing your data in new ways ~ Create a calculated field in a query: …

630 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question