Link to home
Start Free TrialLog in
Avatar of newbie27
newbie27Flag for United Kingdom of Great Britain and Northern Ireland

asked on

COMMA SEPARATED CSV FILE - VB.net

Hello Experts
I am trying to process the CSV file and storing the data into the database.
One of the column has got a string which could consist of "comma" character

Ex: Hello, How are you?

I am splitting up using ","  
Please can someone advise how can I go about getting away with it?
Thanks
 


If (flvCSV.FileName.Contains(".csv")) Then
                SaveFile()
                myReader = File.OpenText(Server.MapPath("../../bin/temp/") & "/" & flvCSV.FileName)
                myReader.ReadLine()
                While myReader.Peek <> -1
                    myRow = tblTable.NewRow
                    fieldValues = myReader.ReadLine().Split(",")
                    myRow.Item("VoucherNo") = fieldValues(0).ToString()
EndWhile....

Open in new window

Avatar of Tom Beck
Tom Beck
Flag of United States of America image

I'm speaking from a recent experience here.

Do you have any control over the CSV file? Can you change how it's generated in the first place? If so, have it use a tilde or pipe symbol for a delimiter. Or, generate it with qualifiers around each field (quotes).
Avatar of newbie27

ASKER

Hi,

Thanks for your input.

Apparetnly I have no control over this CSV file, one of the client is sending the data to me and I even asked them if they can use something else in place of "comma" in the message.

Please advise

Thanks
Avatar of gamarrojgq
gamarrojgq

Hi,

It is hard to avoid this since comma is your delimter char, unless you know exactly which field and in what part of that field is the extra comma, you should better use another delmiter char.

If you have control over the creation of the csv file, then you can apply some of this basic rules

http://en.wikipedia.org/wiki/Comma-separated_values

And with a file formatted correctly, you can use ADO.NET to read the CSV File
If you are forced to work with the CSV file as is, import it into Microsoft Excel or another spreadsheet program. This will make it easy to spot rows that are misaligned with the header because of comma inclusions. Correct the erroneous rows and export out as a new CSV using a different delimiter.
Hi

I know the column which possible could consist one ore more comma's in the Mesage column, however there is no such pattern for it, this is completely dynamic and depends on the customers interests.

I am being forced to use the same csv file as they cannot do any changes on their end.

Please advise how can I still be able to use the "message" column and take the full data as one column.

Thanks
Assuming the "message" column is the only column that could possible include commas in the field data.

The "message" column will be the same column number in each row. So, write a sub routine that reads one row at a time, search across the row using IndexOf(",") for each comma it encounters until it is at the "message" field. Save the remainder of the row to a string. Read through that string using LastIndexOf(",") and chop off the fields that come after the "message" field. Now you have the string with only the message field. Make sense?
ok, if you are sure that there is only one column that can have extra commas, then you can try the following

Lets say that you CSV File have 5 columns, and the number 3 have the extra comma

Column1        Column2          Column3                          Column4      Column5
1          ,   some text,       Hello, How are you?           ,  23.43    ,   another text
2          ,   text too,        Text without extra comma      ,  445.5    ,   some more text

When you Split Line 1 you will have this
fieldValues(0) = 1
fieldValues(1) = some text
fieldValues(2) = Hello
fieldValues(3) = How are you?
fieldValues(4) = 23.43
fieldValues(5) = another text

Then when you SPLIT the line, you can check for fieldValues.Length value and if it is more  than 4 (since Count is Zero Base),
then you know you have to join the column 3 with column 4, and column 5 will be actually column 4 and column 6 will be  Column 5

So your code should be
 myRow.Item("VoucherNo") = fieldValues(0).ToString()
 myRow.Item("fieldname") = fieldValues(1).ToString()
 myRow.Item("fieldname") = fieldValues(2).ToString() & "," & fieldValues(3).ToString()
 myRow.Item("fieldname") = fieldValues(4).ToString()
 myRow.Item("fieldname") = fieldValues(5).ToString()


But, if fieldValues.Length is equal to 5, then you know that there is no extra comma in the field and you can assign the values normally

When you Split Line 2 you will have this
fieldValues(0) = 2
fieldValues(1) = text too
fieldValues(2) = Text without extra comma
fieldValues(3) = 445.5
fieldValues(4) = some more text

Then your code should be
 myRow.Item("VoucherNo") = fieldValues(0).ToString()
 myRow.Item("fieldname") = fieldValues(1).ToString()
 myRow.Item("fieldname") = fieldValues(2).ToString()
 myRow.Item("fieldname") = fieldValues(3).ToString()
 myRow.Item("fieldname") = fieldValues(4).ToString()


If the Field can have more that 1 extra comma, you use the same principle, if the fieldValues.Length is bigger than the normal number of columns,
you just join the column that can have the extra comma with the others until you reach the difference bettewn fieldValues.Length and the norma number of columns

See the attaced code for reference.
If (flvCSV.FileName.Contains(".csv")) Then
                SaveFile()

		Dim intNormalColumnCount as Int32 = 5 'Put here the REAL NORMAL COLUMN COUNT
		Dim intDifference As Int32
		Dim intColumnWithExtraCommas As Int16 = 2 'Put here the REAL COLUMN NUMBER THAT CAN HAVE EXTRA COMMAS

                myReader = File.OpenText(Server.MapPath("../../bin/temp/") & "/" & flvCSV.FileName)
                myReader.ReadLine()
                While myReader.Peek <> -1

                    myRow = tblTable.NewRow
                    fieldValues = myReader.ReadLine().Split(",")

		    If fieldValues.Length > (intNormalColumnCount-1) Then 'Since Length is Zero Based

			    intDifference = fieldValues.Length - (intNormalColumnCount-1)

	                    myRow.Item("VoucherNo") = fieldValues(0).ToString()
	                    myRow.Item("FieldName") = fieldValues(1).ToString()

			    'Since this is the column that can have 1 o more extra commas concatenate the exta columns
			    For intCol = intColumnWithExtraCommas To intColumnWithExtraCommas + intDifference
	                        myRow.Item("FieldName") &= fieldValues(intCol).ToString()
                            Next

	                    myRow.Item("FieldName") = fieldValues(fieldValues.Length-2).ToString() 'Since after the split this will be always one before last
	                    myRow.Item("FieldName") = fieldValues(fieldValues.Length-1).ToString() 'Since after the split this will be always the last one

		    Else
			    'Assign the values normally since no extra comma exist
	                    myRow.Item("VoucherNo") = fieldValues(0).ToString()
	                    myRow.Item("FieldName") = fieldValues(1).ToString()
	                    myRow.Item("FieldName") = fieldValues(2).ToString()
	                    myRow.Item("FieldName") = fieldValues(3).ToString()
	                    myRow.Item("FieldName") = fieldValues(4).ToString()
		    End If

EndWhile....

Open in new window

Hi TommyBoy

Thanks for your advise.

I think before I actually start reading through I would need to replace the "commas" from the dedication message?

Field(29) is the fixed column for "DedicationMessage" column and its only this field I have to check.

Please can you advise how can I make sure to replace "commas" to a pipe etc, process and whilst storing in the database it revert back to the comma character

Thank you for the help

Please the attached code snippet am using.

Thanks
Dim myRow As DataRow
            Dim OrderID As Integer
            msg = ""
            Dim fieldValues As String()

            If (flvCSV.FileName.Contains(".csv")) Then
                SaveFile()
                myReader = File.OpenText(Server.MapPath("../../bin/temp/") & "/" & flvCSV.FileName)
                myReader.ReadLine()
                While myReader.Peek <> -1
                    myRow = tblTable.NewRow
                    fieldValues = myReader.ReadLine().Split(",")
                    myRow.Item("VoucherNo") = fieldValues(0).ToString()
                    
                    myRow.Item("DedicationMessage") = fieldValues(29).ToString()
                    myRow.Item("CreatedBy") = fieldValues(30).ToString()
                    
                    tblTable.Rows.Add(myRow)
                End While
                myReader.Close()
                For j As Integer = 0 To tblTable.Rows.Count - 1
                   
                                Dim drRow As DataRow
                                drRow = tblTable.Rows(j)
                                GetStaticPatterns(GiftCodeID, drRow)
                                'getDynamicPatterns(GiftCodeID, drRow)
                                AddFTMatches(FT_ID)
                            End If
                        End If
                        Products = ""
                     End If
                Next
                lblMsg.Text = msg
            End If
        Catch ex As Exception
            lblMsg.Text = msg & " Error:" & ex.ToString()
            myReader.Close()
        End Try

Open in new window

I'm guessing that you missed the code provided by someone else at #35259821. That idea is better than mine. You should go with that. There's no need in either case to change the commas to another delimiter. If you still need me to expand on my idea, just let me know.
Hi gamarrojgq:

Thansk for your example code, much appreciated.

 >> would the statement below is dependant on the number of "comma's" available in the string?

If the message is big and if there are more than 1 comma, then it should do for all of those commas?

Ex "Hell, How, Are, You, Allan"

Please advise

Thanks
myRow.Item("FieldName") = fieldValues(fieldValues.Length-2).ToString() 'Since after the split this will be always one before last
	                    myRow.Item("FieldName") = fieldValues(fieldValues.Length-1).ToString() 'Since after the split this will be always the last one

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of gamarrojgq
gamarrojgq

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I did not notice gamarrojgq's comment when I pasted my comment and now I realized that he used the same idea before me :-)