Solved

Parse csv with double quotes in vb.net

Posted on 2012-04-13
13
2,010 Views
Last Modified: 2012-04-13
How can I extend the function below to escape double quotes found in field data, eg:
"some data", "some more data with a size eg 8" where the double quotes need escaping", "some more data"

    
Private Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated fields.
        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        Return r.Split(oneLine)
    End Function

Open in new window

0
Comment
Question by:ncw
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
13 Comments
 
LVL 17

Expert Comment

by:nepaluz
ID: 37841813
Change your line 8 above to
Return r.Split(oneLine).Replace("""", String.Empty)

Open in new window

(also see article here.)
0
 
LVL 1

Author Comment

by:ncw
ID: 37841838
Vb IDE says: Replace is not a member of 'System.Array'
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37841855
Hmmm!
    Private Function ParseLine(ByVal oneLine As String) As String()
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        Dim Result = r.Split(oneLine)
        Array.ForEach(Result, Function(x) x.Replace("""", String.Empty))
        Return Result
    End Function

Open in new window

0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 5

Expert Comment

by:hafeezmca
ID: 37841857
Hi, you can do that without a function too.
As your concern is with " just pass it to the replace function like below:

dim str as string
str=yourfield.replace(""","")
0
 
LVL 83

Expert Comment

by:CodeCruiser
ID: 37842065
Try this

Dim strLine as string = "your text here"
strLine = strLine.Replace(chr(34) & "," & chr(34), "|")

return strLine.Split("|")
0
 
LVL 1

Author Comment

by:ncw
ID: 37842096
@nepaluz:

if the line of data is:

""9","955.351.401.51","","LHF DISC 350MM FOR 18" WHEELS CAYE","PORSCHE","75.00","0.0000""

then after Result = r.Split(oneLine)
Result(0) gives:

""9","955.351.401.51","","LHF DISC 350MM FOR 18" WHEELS CAYE""

So the split has already  split in the wrong place as Result(0) should be ""9"" therefore Array.ForEach(Result, Function(x) x.Replace("""", String.Empty)) is too late.
0
 
LVL 17

Accepted Solution

by:
nepaluz earned 500 total points
ID: 37842175
You have some strange file structure.
OK, I take it that the line is enclosed in double quotes, so this will handle that.
    Private Function ParseLine(ByVal oneLine As String) As String()
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)
        Return r.Split(oneLine.Replace(vbTab, ","))
    End Function

Open in new window

0
 
LVL 1

Author Comment

by:ncw
ID: 37842185
0
 
LVL 1

Author Comment

by:ncw
ID: 37842302
@nepaluz: Not quite sure what you're doing with

oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)

You seem to be using a tab as a temporary deliminator. Not sure what pattern you are replacing as there are probably some double quotes being used as escape characters. BUT IT WORKS!!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37842844
What that does is replacing delimiters with tabs and removing all double quotes remaining in the string after that (and of-course replace the tabs with commas when returning!).

On a side note, the proof of any pudding is in the eating, that the code suggestion works is the proof that it is correct. Having said that, I'd use the FileIO.TextFieldParser to parse any delimited file.
0
 
LVL 1

Author Closing Comment

by:ncw
ID: 37842909
Thanks very much you've been very helpful!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37843032
You are welcome!
Having had a little mull over the code suggestion, I realised that my work-aroundactually renders the Regex redundant! Consider this:
Private Function ParseLine(ByVal oneLine As String) As String()
    oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)
    Return oneLine.Split(vbTab)
End Function

Open in new window

Would not shed any tears for regex though, sure there is adequate of it pre-loaded in the Split method!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37843051
Of-course, the above will only hold its own IF there are no fields with embeded commas. (ooops ... may have spoken too soon!)
0

Featured Post

Salesforce Made Easy to Use

On-screen guidance at the moment of need enables you & your employees to focus on the core, you can now boost your adoption rates swiftly and simply with one easy tool.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
VB.Net Data Class 1 41
reading an excel file vb.net 2 43
sql server connection string in config file 4 32
Need Help With GDI+ in VB.Net 8 29
Do you hate spam? I do, and I am willing to bet you do as well. I often wonder, though, "if people hate spam so much, why do they still post their email addresses on the web?" I'm not talking about a plain-text posting here. I am referring to the fa…
This article shows how to deploy dynamic backgrounds to computers depending on the aspect ratio of display
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

742 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question