We help IT Professionals succeed at work.

VB.Net - Import CSV File using Split

Jimbo99999
Jimbo99999 asked
on
2,518 Views
Last Modified: 2014-05-30
Good Day Experts!

I am using Split in my Import of CSV file to seperate the string of data by the commas.  My problem is that within the "cell" of data is a comma.  

How do I replace the comma with a space before I do the Split?

Thanks,
jimbo99999
Comment
Watch Question

Daniel KlineSr. SharePoint Developer

Commented:
"CSV" is "comma separated values"  There is no way to differentiate which commas are cell borders and which a content.  Is it possible to save the file with a different delimiter "~", tilde, is usually pretty safe.

Author

Commented:
Ok, here is what I found out.  I am using the following line:

For Each sLine As String In stream.ReadToEnd.Split(Environment.NewLine)

I was trying to then do a Split on a comma.

When I look at sLine in the Immediate window,  it appears there are is a a dobule quote around the cell data that contains a comma...here is a sample:

,Maria Korakidou,"10-12 Nikodemou Milona, Paphos",P.o.x.60040,

So, I guess I cannot use Split. I have to parse apart "manually" somehow.  Or maybe replace the  comma if found between the double quotes then do a Split.

What do you all think how I can approach this?

Thanks,
jimbo99999
Daniel KlineSr. SharePoint Developer

Commented:
Take the File back into Excel and save it as a "Text Tab delimited ("*.txt")".

Now, split on a tab character and you should be on your way.
CERTIFIED EXPERT
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Fernando SotoRetired
CERTIFIED EXPERT
Distinguished Expert 2017

Commented:
Hi jimbo99999;

Visual Basic has a class called TextFieldRarser which can handle the comma within quote. You configure the parser as needed and then read the file line by line. Each line is returned as an array of cells. Please check out the link above.

Author

Commented:
Good Day Experts

Goran:

I am unfamiliar with everything including and after .Cast.  The "function" term is causing an error.  

Fernando Soto:

I have tried the TextFieldParser and it looked reall promising.  But I found another issue in the data.  I have this scenario when I interrogate the sLine string:

,8839094,"Wood Gallery Oversized Picture Frame, 25 x 25", Espresso stain",8.90

The ItemDescription is ,"Wood Gallery Oversized Picture Frame, 25 x 25", Espresso stain",    But the TextFieldParser is grabbing  "Wood Gallery Oversized Picture Frame, 25 x 25" as the ItemDescription and Espresso stain" is getting treated as the next field which is ItemWeight.  

Do you see anything else I can do?  This is feeling like the Customer sending the data needs to get it cleaned up.

Thanks,
jimbo99999
Daniel KlineSr. SharePoint Developer

Commented:
You could still use Excel to clean it up yourself using my earlier suggestion of saving it as a tab delimitted file.

Author

Commented:
I totally forgot to go back to that.  I tried opening and resaving and then was distracted with all the other User needs.  

I was menaing to ask do I have to open it and resave manually or is there a way to do it programatically?

Thansk,
jimbo9999
Retired
CERTIFIED EXPERT
Distinguished Expert 2017
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Hello:

It is not fixed width fields.  

I am open formatting modification suggestions as I am just about out of options here.  I just tried saving seperated by Tab and it seprated like 95% of the lines and then there is groups of 5-10 that still have comma seperating the fields!

What is your suggestion?

I am concerned with the issue of the dbl quote within the quoted string:

 ,"Wood Gallery Oversized Picture Frame, 25 x 25", Espresso stain",    

Thanks,
jimbo99999
Fernando SotoRetired
CERTIFIED EXPERT
Distinguished Expert 2017

Commented:
Hi jimbo99999;

Being open to modifying the file if you escape the double quote inside a quoted field like so :

 ,"Wood Gallery Oversized Picture Frame, 25 x 25"", Espresso stain",

Then the TextFieldParser will correctly interpret the double quote and insert a single quote in the field so that in your program the text will be :

 ,"Wood Gallery Oversized Picture Frame, 25 x 25", Espresso stain",

Author

Commented:
Hello:

I apologize but I am not quite sure what you are meaning by "escaping" the the double quote".

Does that mean to just put another double quote right after the already present double quote?

Thanks,
jimbo99999
Fernando SotoRetired
CERTIFIED EXPERT
Distinguished Expert 2017

Commented:
Yes, as stated,  ,"Wood Gallery Oversized Picture Frame, 25 x 25"", Espresso stain", note the double quote after the second 25 it has "" and NOT ". That double quote is taken literally and gives it no special meaning.

Author

Commented:
Thanks...trying right now.
CERTIFIED EXPERT

Commented:
It seems that you are using framework 2, that doesn't have LINQ methods...

Here is a version of the code that uses a plain loop instead:

Dim matches = Regex.Matches(sLine, "\s*""(?'v'[^""]*)""\s*|\s*(?'v'[^,]+)\s*")
Dim values As New List(Of String)
For Each m As Match In matches
  values.Add(m.Groups("v").Value)
Next

Open in new window

Author

Commented:
Good Day Experts!

Ok, with the TextFileParser I came across 52 records that would not load...the rest of the 21,000 were ok and imported into my SQL Table.

I then saved the file as Tab delimited and came across 63 records that would not load...the rest of of the 21,000 were ok and imported into my SQL Table.

Goran: I will try your method next.

Thanks,
jimbo99999
Fernando SotoRetired
CERTIFIED EXPERT
Distinguished Expert 2017

Commented:
Do you know what the issue was with the lines that did not load?
Daniel KlineSr. SharePoint Developer
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
On the CSV attempt, it was like there were special characters making the row height is Excel really large on some of them...never seen this before.  


On the Tab delimited ones, the records still had commas seperating the "cell Values". So for some reason when I opened the CSV and resaved with a Tab delimiter those did not get converted.

In my code here is how I am checking hte end of the line:

Using file As New IO.FileStream(Path, FileMode.Open, FileAccess.Read)
           Using streamDup As New IO.StreamReader(file)
                     For Each sLine As String In streamDup.ReadToEnd.Split(Environment.NewLine)
                                Dim trimChars As Char() = {Chr(10), Chr(13)}
                                sLine = sLine.Trim(trimChars)
                      Next
            End Using
End Using

Then again that just is taking the Char off the end.  Is that ok or should I be doing it a different way?

Thanks,
jimbo99999
Daniel KlineSr. SharePoint Developer

Commented:
The CSV special character were probably Control Characters (non printing).  They indicate careless entry.  For the Tab Delimited I'm confused.  I can't recreate the issue.

Is there sensitive information in the file, or can you upload a sample of it.  I'd like to see what the data looks like.  I've done these conversions many times and it should be very easy.  If you can get a good sampling of the error rows, that would be helpful.

Author

Commented:
Hello:

It is a bit sensitive. I am pressing on and trying again here.  I did notice that when I open the CSV file to save as Tab delimited text file another issue has popped up that I did not notice.  2.0016E+15 is getting saved to the database instead of 2001601005905940.  I have tried opening Excel and setting the columns to Text before opening the file but that does not work.  

If I just run the CSV file without opening to re-save as Tab delimeted and use TextFilePArser to split on  comma 2001601005905940 gets saved to the database.  

Is it possible to save as Tab delimited Text without opening CSV?

Thanks,
jimbo99999
Daniel KlineSr. SharePoint Developer

Commented:
So, if I understand you, you are opening Excel, importing the data using the Data Menu.  Try Setting the column type as number with 0 decimal points.  It works for me.  You can see the text in the screen shot.  When you view your "txt" file does it look like a string of numeric digits or in scientific notation.

If it's a numeric digit string, make sure you are casting to a string in TextFieldParser.  It could be that VB in interpreting the value as numeric.
DecZeroExample.png
CERTIFIED EXPERT

Commented:
You said that you would try my method. Any progress on that?

Author

Commented:
Good Day Everyone

Goran:
 I have tried your suggestion but am getting an error. The values in values.Add is causing this error:

"Object reference not set to an instance of an object"

Do I need to add something else to make this work?

Daniel:
I am opening the CSV just to save it as tab delimited text file.  This is causing the 2001601005905940 to be saved 2.0016E+15.  So, when I open the Tab delimited text file, 2.0016E+15 is in there not 2001601005905940. Then when I read the file 2.0016E+15 is retrieved and subsequently saved to the db.  If I never open the CSV file and use the following code the data is saved properly to the db:

                                        Using rdr As New StringReader(sLine)
                                            Using parser As New TextFieldParser(rdr)
                                                parser.TextFieldType = FieldType.Delimited
                                                parser.Delimiters = New String() {","}
                                                parser.HasFieldsEnclosedInQuotes = True
                                                sRecords = parser.ReadFields
                                            End Using
                                        End Using

Thanks,
jimbo99999
CERTIFIED EXPERT

Commented:
If the variable 'values' causes that error, then there is something missing from the code. The variable is assigned a list when it is declared.

The only other way that I managed to repeat the expection is it the variable 'sLine' is null.
Daniel KlineSr. SharePoint Developer

Commented:
Hi jimbo99999,

It is indeed strange that excel is working differently for you.  I need to chew on this one and see if there is another approach.
Daniel KlineSr. SharePoint Developer

Commented:
In the interim check out this approach to stripping special characters out of the original file.  It may be that embedded control codes are raising hell with excel and your code.  http://www.freevbcode.com/ShowCode.asp?ID=31

Author

Commented:
Thanks for the efforts.  I am back to this project for the week.  Reports are that the Customer will not be approached to do any kind of data cleanup prior to sending it to us.  So, I need to find the best way to handle this.  

I will keep trying all the approaches everyone has offered.

Thanks for your continued assistance,
jimbo99999

Author

Commented:
Good Day
While I have not found a programmatic solution to this issue, I am going to go ahead and close .  All three of you have provided excellent information that I was not aware  of and will keep it in my knowledge base.

Thanks again,jimbo99999

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.