endrec
asked on
Parsing txt/cvs values with a comma delimiter, but exclude items in quotes like "Company, Inc."
I am having an issue with parsing a csv file with a comma delimiter. There are fields like “Name, Inc.” (they have the quotes in the actual file) which I do not want broken up with the line/file is parsed. How would you do that?
The streamreader is all set up, here is my split statement.
Dim strDelimiter As String = ","
Dim splitout = Split(strLine, strDelimiter)
"Company, Inc.",value2,value3,"value ,should,be valid"
The array values I want would be:
"Company, Inc"
value2
value3"value,should,bevali d"
The array values I current get are:
"Company,
Inc"value2
value3
"value
should
bevalid"
The streamreader is all set up, here is my split statement.
Dim strDelimiter As String = ","
Dim splitout = Split(strLine, strDelimiter)
"Company, Inc.",value2,value3,"value
The array values I want would be:
"Company, Inc"
value2
value3"value,should,bevali
The array values I current get are:
"Company,
Inc"value2
value3
"value
should
bevalid"
Given your example input string, the following code works....but it's probably not the best to use on a large csv file.
Dim indxStart, indxEnd As Integer
Dim strTemp As String
Dim strReplace As String = "**"
indxStart = strLine.IndexOf("""")
indxEnd = strLine.IndexOf("""", indxStart + 1)
While indxEnd > 0
strTemp = strLine.Substring(indxStar t + 1, indxEnd - indxStart - 1).Replace(",", strReplace)
strLine = strLine.Remove(indxStart + 1, indxEnd - indxStart - 1)
strLine = strLine.Insert(indxStart + 1, strTemp)
indxStart = strLine.IndexOf("""", indxEnd + strReplace.Length)
indxEnd = strLine.IndexOf("""", indxStart + 1)
End While
Dim strDelimiter As String = ","
Dim splitout As Array = Split(strLine, strDelimiter)
For i As Integer = 0 To splitout.Length - 1
splitout(i) = CStr(splitout(i)).Replace( strReplace , ",")
Next
Dim indxStart, indxEnd As Integer
Dim strTemp As String
Dim strReplace As String = "**"
indxStart = strLine.IndexOf("""")
indxEnd = strLine.IndexOf("""", indxStart + 1)
While indxEnd > 0
strTemp = strLine.Substring(indxStar
strLine = strLine.Remove(indxStart + 1, indxEnd - indxStart - 1)
strLine = strLine.Insert(indxStart + 1, strTemp)
indxStart = strLine.IndexOf("""", indxEnd + strReplace.Length)
indxEnd = strLine.IndexOf("""", indxStart + 1)
End While
Dim strDelimiter As String = ","
Dim splitout As Array = Split(strLine, strDelimiter)
For i As Integer = 0 To splitout.Length - 1
splitout(i) = CStr(splitout(i)).Replace(
Next
If you control the input file (i.e. you are generating it from somewhere else), the easiest method is to use something other than a comma as the delimiter when you output the file. | is a good choice.
Then, your split becomes simple again.
But Bob probably knows of some cool 2.0 function that handles just this case for you....and that I've overlooked.
Then, your split becomes simple again.
But Bob probably knows of some cool 2.0 function that handles just this case for you....and that I've overlooked.
ASKER
I have both versions 1.0 (1.1) and 2.0 of .NET
I do not have control over the input file, it must use commas to seperate the fields coming into the sub/function.
Fields can have values like "this with spaces and periods and other commas,,, and this should still only count as one field/part of the array since it is inside of quotes."
I do not have control over the input file, it must use commas to seperate the fields coming into the sub/function.
Fields can have values like "this with spaces and periods and other commas,,, and this should still only count as one field/part of the array since it is inside of quotes."
Have a look at this thread
https://www.experts-exchange.com/questions/21489283/Reading-in-Quoted-CSV-files.html
Roger
https://www.experts-exchange.com/questions/21489283/Reading-in-Quoted-CSV-files.html
Roger
If you have 2005, then check this one out:
https://www.experts-exchange.com/questions/21858837/CSV-import-to-DB-through-VB-NET.html
Bob
https://www.experts-exchange.com/questions/21858837/CSV-import-to-DB-through-VB-NET.html
Bob
ASKER
None of these solutions are working.
ASKER
I increased the point value.
In what way are "none of these solutions" not working? What does each of them produce and how does it differ from what you want?
Unless/until we know that, we're unlikely to be able to help further.
Roger
Unless/until we know that, we're unlikely to be able to help further.
Roger
ASKER
This is what the actual input line is:
Soft Agreement 1-1-01,,,License For Software,Outbound,Ok,Ok – 100%,,1-2-01,, Four,Six, Donaldw, Software, Company Inc.,1,Contact Ray,No,68 Item T+30,5300
The function seems to ignore blank values (not something I would like) and seperate some commas into their own index in the array.
This is the output of the array that the function creates from the line above.
Soft Agreement 1-1-01
,
License For Software
Ok
Ok – 100%
1-2-01
Four
Six
Donaldw
Software
Company Inc.
1
Contact Ray
No
68 Item T+30
5300
The function i'm using:
Private Function ParseQuotedCSV2(ByVal strInput As String) As ArrayList
Dim a As New ArrayList
Dim s As String() = strInput.Split(",")
Dim i As Integer
For i = s.Length - 1 To 1 Step -1
If s(i).TrimEnd.LastIndexOf(C hr(34)) = s(i).TrimEnd.Length - 1 Then
If s(i).TrimStart.IndexOf(Chr (34)) <> 0 Then
s(i - 1) = s(i - 1) & "," & s(i)
s(i) = "|||" 'This is an 'ignore' marker which needs to be set to something that would never appear in the input string
End If
End If
Next
For i = 0 To s.Length - 1
If s(i) <> "|||" Then 'See comment above
If s(i).TrimStart.IndexOf(Chr (34)) = 0 Then
s(i) = s(i).Remove(s(i).IndexOf(C hr(34)), 1)
s(i) = s(i).Remove(s(i).LastIndex Of(Chr(34) ), 1)
End If
a.Add(s(i).ToString)
End If
Next
Return a
End Function
Soft Agreement 1-1-01,,,License For Software,Outbound,Ok,Ok – 100%,,1-2-01,, Four,Six, Donaldw, Software, Company Inc.,1,Contact Ray,No,68 Item T+30,5300
The function seems to ignore blank values (not something I would like) and seperate some commas into their own index in the array.
This is the output of the array that the function creates from the line above.
Soft Agreement 1-1-01
,
License For Software
Ok
Ok – 100%
1-2-01
Four
Six
Donaldw
Software
Company Inc.
1
Contact Ray
No
68 Item T+30
5300
The function i'm using:
Private Function ParseQuotedCSV2(ByVal strInput As String) As ArrayList
Dim a As New ArrayList
Dim s As String() = strInput.Split(",")
Dim i As Integer
For i = s.Length - 1 To 1 Step -1
If s(i).TrimEnd.LastIndexOf(C
If s(i).TrimStart.IndexOf(Chr
s(i - 1) = s(i - 1) & "," & s(i)
s(i) = "|||" 'This is an 'ignore' marker which needs to be set to something that would never appear in the input string
End If
End If
Next
For i = 0 To s.Length - 1
If s(i) <> "|||" Then 'See comment above
If s(i).TrimStart.IndexOf(Chr
s(i) = s(i).Remove(s(i).IndexOf(C
s(i) = s(i).Remove(s(i).LastIndex
End If
a.Add(s(i).ToString)
End If
Next
Return a
End Function
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Bob