yan_er04
asked on
Read CSV file to array
I know how to read a file.
But how to split the content? Beside being separated by comma, the actual data also contains comma. What should I do?
I need to store the contents into a multi-d array too.
But how to split the content? Beside being separated by comma, the actual data also contains comma. What should I do?
I need to store the contents into a multi-d array too.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
// semi-pseudo-code to change your delimiter
bool inString = false;
for(int x=0;x<strFile.Length;x++)
if(strFile[x]=='\"') //Character that indicates that the comma is not a delimiter, normally double quote
inString = !inString;
else if(!inString && strFile[x]==',')
strFile[x]='~'; //Some character that is not used in your file.
// Add code to split by new delimiter, in my example ~
bool inString = false;
for(int x=0;x<strFile.Length;x++)
if(strFile[x]=='\"') //Character that indicates that the comma is not a delimiter, normally double quote
inString = !inString;
else if(!inString && strFile[x]==',')
strFile[x]='~'; //Some character that is not used in your file.
// Add code to split by new delimiter, in my example ~
Give us a sample line that illustrates this.
> The file is comma delimited and the data contains commas....
Also illustrate any other special characters in the data. What if it contains a single-quote? A double-quote? etc.
A simple regular expression will parse it easily (if it's parseable).
Also note that much of Mikal613's code for this example is VB6-style, instead of VB.NET style.
You may prefer to translate it to use the newer methods.
> The file is comma delimited and the data contains commas....
Also illustrate any other special characters in the data. What if it contains a single-quote? A double-quote? etc.
A simple regular expression will parse it easily (if it's parseable).
Also note that much of Mikal613's code for this example is VB6-style, instead of VB.NET style.
You may prefer to translate it to use the newer methods.
(I haven't tried this one, but it sounds promising.)
Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
http://www.hotblue.com/article0000.aspx?a=0006
--- --- ---
from: http://www.3leaf.com/resources/articles/regex.aspx#Examples [in C#]
Get Values of a CSV (updated by Arnold Bailey)
,(?=(?:[^"]*"[^"]*")*(?![^ "]*"))
In conjunction with this code:
Regex r = new Regex(",(?=(?:[^\"]*\"[^\" ]*\")*(?![ ^\"]*\"))" );
string s = "\"a\",b,\"c, d, e\",,f";
string[] sAry = r.Split(s);
for(int i=0;i < sAry.Length;i++)
{
Console.WriteLine(sAry[i]) ;
}
--- --- ---
from: http://radio.weblogs.com/0117167/2003/02/18.html#a132 [in C#]
CSV Regex
I haven't found a Regex that will extract just the values from a comma separated value file, but thanks to Joe Gulisano, I do have one that will find all the "real" commas:
,(?=([^"]*"[^"]*")*(?![^"] *"))
So getting the values is just:
Regex r = new Regex(",(?=([^\"]*"[^"]*") *(?![^"]*" ))");strin g s = "\"a",b,"c, d, e",,f";int start = 0;foreach (Match m in r.Matches(s)){ Console.WriteLine(s.Substr ing(start, m.Index - start)); start = m.Index + 1;}Console.WriteLine(s.Sub string(sta rt, s.Length - start));
Outputs:
"a"b"c, d, e"
f
Parsing Comma-Separated Values or CSV files into a DataTable using Regular Expressions
http://www.hotblue.com/article0000.aspx?a=0006
--- --- ---
from: http://www.3leaf.com/resources/articles/regex.aspx#Examples [in C#]
Get Values of a CSV (updated by Arnold Bailey)
,(?=(?:[^"]*"[^"]*")*(?![^
In conjunction with this code:
Regex r = new Regex(",(?=(?:[^\"]*\"[^\"
string s = "\"a\",b,\"c, d, e\",,f";
string[] sAry = r.Split(s);
for(int i=0;i < sAry.Length;i++)
{
Console.WriteLine(sAry[i])
}
--- --- ---
from: http://radio.weblogs.com/0117167/2003/02/18.html#a132 [in C#]
CSV Regex
I haven't found a Regex that will extract just the values from a comma separated value file, but thanks to Joe Gulisano, I do have one that will find all the "real" commas:
,(?=([^"]*"[^"]*")*(?![^"]
So getting the values is just:
Regex r = new Regex(",(?=([^\"]*"[^"]*")
Outputs:
"a"b"c, d, e"
f
ASKER
I can't change the format of the file.
A sample data will be
$100,"$2,350",a,15/04/1998
I am using VB.NET.
Is there no other way beside regular expression?
Well, I have tried regular expression before but... let's just say I am not an expert in it. =D
A sample data will be
$100,"$2,350",a,15/04/1998
I am using VB.NET.
Is there no other way beside regular expression?
Well, I have tried regular expression before but... let's just say I am not an expert in it. =D
ASKER
I notice all the examples are using 1-d array.
How to change that to 2-d?
I seems to have problems adding data to a 2-d array.
How to change that to 2-d?
I seems to have problems adding data to a 2-d array.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Odbc supports reading CSV:
string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection connection = new OdbcConnection(ConnectionS tring);
connection.Open();
OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection);
OdbcDataReader reader = command.ExecuteReader();
while (reader.Read())
{
// loop through every field
for (int x = 0; x < reader.FieldCount; x++)
{
// add code for puting fields into array....
Console.WriteLine(reader.G etDataType Name(x) + " - " + reader.GetValue(x));
}
}
// Close the connection
connection.Close();
string ConnectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\";
OdbcConnection connection = new OdbcConnection(ConnectionS
connection.Open();
OdbcCommand command = new OdbcCommand("Select * FROM Subscribers.csv", connection);
OdbcDataReader reader = command.ExecuteReader();
while (reader.Read())
{
// loop through every field
for (int x = 0; x < reader.FieldCount; x++)
{
// add code for puting fields into array....
Console.WriteLine(reader.G
}
}
// Close the connection
connection.Close();
This code takes a comma delimited xml flie and puts it into a table. The regex is slightly more efficient I think..
Dim regEx As Regex
Dim cMatch As Text.RegularExpressions.Ma tchCollect ion
Dim m As Text.RegularExpressions.Ma tch
cMatch = regEx.Matches(line, "(?<=(?:^|,\s*)"")(?:[^""] |"""")*|(? <=(?:^|,)) (?:\s*)(?= (?:,|$))")
For Each m In cMatch
i += 1
dc = New DataColumn
dc.ColumnName = IIf(m.Value.Length = 0, colPrefix & i + 1, m.Value)
dt.Columns.Add(dc)
Next
Dim regEx As Regex
Dim cMatch As Text.RegularExpressions.Ma
Dim m As Text.RegularExpressions.Ma
cMatch = regEx.Matches(line, "(?<=(?:^|,\s*)"")(?:[^""]
For Each m In cMatch
i += 1
dc = New DataColumn
dc.ColumnName = IIf(m.Value.Length = 0, colPrefix & i + 1, m.Value)
dt.Columns.Add(dc)
Next
Comma delimitted is not a good option for you in that you will have to write a great deal of logic to determin true delimeters from data :(...
If possible could you delimit with different character... Say a tilde ~ ...
That is the easiest answer and would simple use code similar to that provided by Mikal613 :)...
Good luck,
Dante