Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Parse csv with double quotes in vb.net

Posted on 2012-04-13
13
Medium Priority
?
2,230 Views
Last Modified: 2012-04-13
How can I extend the function below to escape double quotes found in field data, eg:
"some data", "some more data with a size eg 8" where the double quotes need escaping", "some more data"

    
Private Function ParseLine(ByVal oneLine As String) As String()
        ' Returns an array containing the values of the comma-separated fields.
        ' This pattern actually recognizes the correct commas.
        ' The Regex.Split() command later gets text between the commas.
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        Return r.Split(oneLine)
    End Function

Open in new window

0
Comment
Question by:ncw
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
13 Comments
 
LVL 17

Expert Comment

by:nepaluz
ID: 37841813
Change your line 8 above to
Return r.Split(oneLine).Replace("""", String.Empty)

Open in new window

(also see article here.)
0
 
LVL 1

Author Comment

by:ncw
ID: 37841838
Vb IDE says: Replace is not a member of 'System.Array'
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37841855
Hmmm!
    Private Function ParseLine(ByVal oneLine As String) As String()
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        Dim Result = r.Split(oneLine)
        Array.ForEach(Result, Function(x) x.Replace("""", String.Empty))
        Return Result
    End Function

Open in new window

0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 5

Expert Comment

by:hafeezmca
ID: 37841857
Hi, you can do that without a function too.
As your concern is with " just pass it to the replace function like below:

dim str as string
str=yourfield.replace(""","")
0
 
LVL 83

Expert Comment

by:CodeCruiser
ID: 37842065
Try this

Dim strLine as string = "your text here"
strLine = strLine.Replace(chr(34) & "," & chr(34), "|")

return strLine.Split("|")
0
 
LVL 1

Author Comment

by:ncw
ID: 37842096
@nepaluz:

if the line of data is:

""9","955.351.401.51","","LHF DISC 350MM FOR 18" WHEELS CAYE","PORSCHE","75.00","0.0000""

then after Result = r.Split(oneLine)
Result(0) gives:

""9","955.351.401.51","","LHF DISC 350MM FOR 18" WHEELS CAYE""

So the split has already  split in the wrong place as Result(0) should be ""9"" therefore Array.ForEach(Result, Function(x) x.Replace("""", String.Empty)) is too late.
0
 
LVL 17

Accepted Solution

by:
nepaluz earned 2000 total points
ID: 37842175
You have some strange file structure.
OK, I take it that the line is enclosed in double quotes, so this will handle that.
    Private Function ParseLine(ByVal oneLine As String) As String()
        Dim pattern As String = ",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))"
        Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex(pattern)
        oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)
        Return r.Split(oneLine.Replace(vbTab, ","))
    End Function

Open in new window

0
 
LVL 1

Author Comment

by:ncw
ID: 37842302
@nepaluz: Not quite sure what you're doing with

oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)

You seem to be using a tab as a temporary deliminator. Not sure what pattern you are replacing as there are probably some double quotes being used as escape characters. BUT IT WORKS!!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37842844
What that does is replacing delimiters with tabs and removing all double quotes remaining in the string after that (and of-course replace the tabs with commas when returning!).

On a side note, the proof of any pudding is in the eating, that the code suggestion works is the proof that it is correct. Having said that, I'd use the FileIO.TextFieldParser to parse any delimited file.
0
 
LVL 1

Author Closing Comment

by:ncw
ID: 37842909
Thanks very much you've been very helpful!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37843032
You are welcome!
Having had a little mull over the code suggestion, I realised that my work-aroundactually renders the Regex redundant! Consider this:
Private Function ParseLine(ByVal oneLine As String) As String()
    oneLine = oneLine.Replace("""" & "," & """", vbTab).Replace("""", String.Empty)
    Return oneLine.Split(vbTab)
End Function

Open in new window

Would not shed any tears for regex though, sure there is adequate of it pre-loaded in the Split method!
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 37843051
Of-course, the above will only hold its own IF there are no fields with embeded commas. (ooops ... may have spoken too soon!)
0

Featured Post

Prepare for your VMware VCP6-DCV exam.

Josh Coen and Jason Langer have prepared the latest edition of VCP study guide. Both authors have been working in the IT field for more than a decade, and both hold VMware certifications. This 163-page guide covers all 10 of the exam blueprint sections.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It’s quite interesting for me as I worked with Excel using vb.net for some time. Here are some topics which I know want to share with others whom this might help. First of all if you are working with Excel then you need to Download the Following …
It was really hard time for me to get the understanding of Delegates in C#. I went through many websites and articles but I found them very clumsy. After going through those sites, I noted down the points in a easy way so here I am sharing that unde…
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

618 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question