Solved

Read CSV file to array

Posted on 2004-03-29
10
88,934 Views
Last Modified: 2011-08-18
I know how to read a file.
But how to split the content? Beside being separated by comma, the actual data also contains comma. What should I do?
I need to store the contents into a multi-d array too.
0
Comment
Question by:yan_er04
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
10 Comments
 
LVL 48

Assisted Solution

by:Mikal613
Mikal613 earned 50 total points
ID: 10705438
Dim file_name As String
Dim fnum As Integer
Dim whole_file As String
Dim lines As Variant
Dim num_rows As Long
Dim the_array() As Variant
Dim R As Long
Dim C As Long

    file_name = App.Path
    If Right$(file_name, 1) <> "\" Then file_name = _
        file_name & "\"
    file_name = file_name & "test.csv"

    ' Load the file.
    fnum = FreeFile
    Open file_name For Input As fnum
    whole_file = Input$(LOF(fnum), #fnum)
    Close fnum

    ' Break the file into lines.
    lines = Split(whole_file, vbCrLf)

    ' Dimension the array.
    num_rows = UBound(lines)
    ReDim the_array(num_rows)

    ' Copy the data into the array.
    For R = 0 To num_rows
        the_array(R) = Split(lines(R), ",")
    Next R

    ' Prove we have the data loaded.
    For R = 0 To UBound(the_array)
        For C = 0 To UBound(the_array(R))
            Debug.Print the_array(R)(C) & "/";
        Next C
        Debug.Print
    Next R
    Debug.Print "===END==="
0
 
LVL 7

Expert Comment

by:dante469
ID: 10705632
The file is comma delimited and the data contains commas....

Comma delimitted is not a good option for you in that you will have to write a great deal of logic to determin true delimeters from data :(...

If possible could you delimit with different character...  Say a tilde ~ ...

That is the easiest answer and would simple use code similar to that provided by Mikal613 :)...

Good luck,
Dante
0
 
LVL 3

Expert Comment

by:gillit
ID: 10707418
// semi-pseudo-code to change your delimiter
bool inString = false;
for(int x=0;x<strFile.Length;x++)
  if(strFile[x]=='\"')  //Character that indicates that the comma is not a delimiter, normally double quote
    inString = !inString;
  else if(!inString && strFile[x]==',')
    strFile[x]='~'; //Some character that is not used in your file.

// Add code to split by new delimiter, in my example ~
0
Salesforce Made Easy to Use

On-screen guidance at the moment of need enables you & your employees to focus on the core, you can now boost your adoption rates swiftly and simply with one easy tool.

 
LVL 12

Expert Comment

by:farsight
ID: 10710508
Give us a sample line that illustrates this.
> The file is comma delimited and the data contains commas....
Also illustrate any other special characters in the data.  What if it contains a single-quote?  A double-quote?   etc.

A simple regular expression will parse it easily (if it's parseable).

Also note that much of Mikal613's code for this example is VB6-style, instead of VB.NET style.
You may prefer to translate it to use the newer methods.
0
 
LVL 12

Expert Comment

by:farsight
ID: 10710669
(I haven't tried this one, but it sounds promising.)
Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
http://www.hotblue.com/article0000.aspx?a=0006

--- --- ---

from:  http://www.3leaf.com/resources/articles/regex.aspx#Examples  [in C#]

Get Values of a CSV (updated by Arnold Bailey)
,(?=(?:[^"]*"[^"]*")*(?![^"]*"))
In conjunction with this code:

Regex r = new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");
string s = "\"a\",b,\"c, d, e\",,f";
string[] sAry = r.Split(s);
for(int i=0;i < sAry.Length;i++)
{
   Console.WriteLine(sAry[i]);
}

--- --- ---

from:  http://radio.weblogs.com/0117167/2003/02/18.html#a132  [in C#]

 CSV Regex
I haven't found a Regex that will extract just the values from a comma separated value file, but thanks to Joe Gulisano, I do have one that will find all the "real" commas:

,(?=([^"]*"[^"]*")*(?![^"]*"))
So getting the values is just:

Regex r = new Regex(",(?=([^\"]*"[^"]*")*(?![^"]*"))");string s = "\"a",b,"c, d, e",,f";int start = 0;foreach (Match m in r.Matches(s)){  Console.WriteLine(s.Substring(start, m.Index - start));  start = m.Index + 1;}Console.WriteLine(s.Substring(start, s.Length - start));
Outputs:

"a"b"c, d, e"
f
0
 

Author Comment

by:yan_er04
ID: 10711016
I can't change the format of the file.

A sample data will be
$100,"$2,350",a,15/04/1998

I am using VB.NET.

Is there no other way beside regular expression?

Well, I have tried regular expression before but... let's just say I am not an expert in it. =D

0
 

Author Comment

by:yan_er04
ID: 10711034
I notice all the examples are using 1-d array.
How to change that to 2-d?
I seems to have problems adding data to a 2-d array.
0
 
LVL 12

Accepted Solution

by:
farsight earned 150 total points
ID: 10713019
Here's some VB.NET based on one of those C# examples.
ParseCSV is a test driver.
ParseLine is the routine you'd actually want to use.

    Private Shared Sub ParseCSV()
        Dim oneLine As String = _
        """quoted"",nonquoted,""comma, """"inner-quoted-comma"""", chamelean"",,after-empty"

        Dim fieldValues As String() = ParseLine(oneLine)

        For i As Integer = 0 To fieldValues.Length - 1
            Console.WriteLine(i & "  [" & fieldValues(i) & "]")
        Next
    End Sub

    Private Shared Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated fields.

        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = _
                New System.Text.RegularExpressions.Regex(pattern)

        Return r.Split(oneLine)
    End Function

Correct output:
0  ["quoted"]
1  [nonquoted]
2  ["comma, ""inner-quoted-comma"", chamelean"]
3  []
4  [after-empty]

---

Here's some simple multidimensional array code.
http://www.bhupeshs.com/vbMultiarray.htm

Here's another post about multidimensional arrays -- already answered.
http://www.experts-exchange.com/Programming/Programming_Languages/Visual_Basic/Q_20744300.html

Here's some good stuff, especially near the bottom.
http://www.dotnet247.com/247reference/msgs/23/116904.aspx

Here's a sample program for two-dimensional arrays.
Near the bottom, there's a link to a very good PowerPoint presentation, too.  Chapter8.ppt
http://www.okc.cc.ok.us/aphilipp/WebCT/VB/09_VB_CourseContent.htm

0
 

Expert Comment

by:alex_developer
ID: 11603448
Odbc supports reading CSV:

string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection connection = new OdbcConnection(ConnectionString);
connection.Open();
OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection);
OdbcDataReader reader = command.ExecuteReader();
while (reader.Read())
{
      // loop through every field
      for (int x = 0; x < reader.FieldCount; x++)
      {
                // add code for puting fields into array....
            Console.WriteLine(reader.GetDataTypeName(x) + " - " + reader.GetValue(x));
      }
}

                  // Close the connection
                  connection.Close();
0
 
LVL 1

Expert Comment

by:bswiftly
ID: 13870822
         This code takes a comma delimited xml flie and puts it into a table. The regex is slightly more efficient I think..

            Dim regEx As Regex
            Dim cMatch As Text.RegularExpressions.MatchCollection
            Dim m As Text.RegularExpressions.Match

            cMatch = regEx.Matches(line, "(?<=(?:^|,\s*)"")(?:[^""]|"""")*|(?<=(?:^|,))(?:\s*)(?=(?:,|$))")

            For Each m In cMatch
               i += 1
               dc = New DataColumn
               dc.ColumnName = IIf(m.Value.Length = 0, colPrefix & i + 1, m.Value)
               dt.Columns.Add(dc)
            Next
0

Featured Post

On Demand Webinar - Networking for the Cloud Era

This webinar discusses:
-Common barriers companies experience when moving to the cloud
-How SD-WAN changes the way we look at networks
-Best practices customers should employ moving forward with cloud migration
-What happens behind the scenes of SteelConnect’s one-click button

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Recently while returning home from work my wife (another .NET developer) was murmuring something. On further poking she said that she has been assigned a task where she has to serialize and deserialize objects and she is afraid of serialization. Wha…
Welcome my friends to the second instalment and follow-up to our Minify and Concatenate Your Scripts and Stylesheets (http://www.experts-exchange.com/Programming/Languages/.NET/ASP.NET/A_4334-Minify-and-Concatenate-Your-Scripts-and-Stylesheets.html)…
I've attached the XLSM Excel spreadsheet I used in the video and also text files containing the macros used below. https://filedb.experts-exchange.com/incoming/2017/03_w12/1151775/Permutations.txt https://filedb.experts-exchange.com/incoming/201…

732 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question