Solved

editing CSV file in VB. Visual Studio 2010 used.

Posted on 2010-08-18
14
781 Views
Last Modified: 2013-11-27
I have a CSV file that looks like this:
a,b,c,d,e,f
1,2,3,4,5,6

Can anyone help me figure out a way to eliminate Nth(lets say 4th) column from every line in VB?

I found a way to parse it online which looks like this:

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("c:\test.csv")
            MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
            'set the delimiter to any value
            MyReader.Delimiters = New String() {","}

            Dim currentRow As String()

            'Loop through all of the fields in the file.
            'If any lines are corrupt, report an error and continue parsing.
            While Not MyReader.EndOfData
                Try
                    currentRow = MyReader.ReadFields()
                    'ouput the second value of each line to show that the file has been parsed.
                    currentRow(4)

                    MsgBox(currentRow(4))
                                   Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                    MsgBox("Line " & ex.Message & _
                    " is invalid. Skipping")
                End Try

            End While

        End Using

    End Sub

--------------------------

Works pretty well, however, I need to not show this value, but DELETE it.

thank you,
0
Comment
Question by:learn_it
14 Comments
 
LVL 4

Expert Comment

by:Athar Syed
ID: 33471358
An long but easier way would be to populate a dataset object and then delete the column fron the file and then resave the data back to csv.
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 33471449
Declare your string builder BEFORE the using statement
        Dim sBuilder As New System.Text.StringBuilder

Then in the try ..catch build up your string
        Try
            currentRow = MyReader.ReadFields()
            sBuilder.Append(currentRow(1) & "," & currentRow(2) & "," & currentRow(3) & "," & currentRow(5) & vbCrLf)
        Catch ex As Exception

        End Try

As suggested, delete the old file then resave it. I would prefer saving the new file to a different name!

        System.IO.File.Delete("C:\test.csv")
        System.IO.File.WriteAllText("C:\test.csv", sBuilder.ToString)
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 33471492
Here's another approach:
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim FileName As String = "c:\some\file.txt"

        Dim lines() As String = System.IO.File.ReadAllLines(FileName)

        Dim values As New List(Of String)

        For i As Integer = 0 To lines.GetUpperBound(0)

            values.Clear()

            values.AddRange(lines(i).Split(","))

            values.RemoveAt(3) ' Rmove 4th column

            lines(i) = String.Join(",", values.ToArray)

        Next

        System.IO.File.WriteAllLines(FileName, lines)

    End Sub

Open in new window

0
 

Author Comment

by:learn_it
ID: 33471600
The real worksheet has 60 columns, therefore, manually typing them all out in the script would be a LOT of work. I will try Idle_mind's case and see how it works tomorrow morning...
0
 

Author Comment

by:learn_it
ID: 33483873
Idle mind,
the code work great.
My last question would be: what would be a proper programming way to delete 3 columns instead of one. i tried to run 3 consecutive FOR statemenet deleting one column at a time because it wasn't deleting properly after going through the first three lines of numbers. however, that didn't help the issue either.
I need to delete columns 100, 88, and 85 to make format look the same.

I have attached 2 actual files that I am trying to match so that you everyone could understand exactly what I am talking about.

I need the columns of TEST.CSV to match the columns of BEEZLEY.CSV in order for newly generated CSVs to import properly in my design program.

thank you
test.csv
BEEZLEY-STEWART-ENTRY.csv
0
 

Author Comment

by:learn_it
ID: 33483900
This is what I have. I start from the last column in order to avoid shifting columns to the right BEFORE deleting other columnt to the right if that makes sense.

 Dim FileName As String = "c:\test.csv"
        Dim lines() As String = System.IO.File.ReadAllLines(FileName)
        Dim values As New List(Of String)
        For i As Integer = 0 To lines.GetUpperBound(0)
            values.Clear()
            values.AddRange(lines(i).Split(","))
            values.RemoveAt(100) ' Rmove 101st column
            lines(i) = String.Join(",", values.ToArray)
        Next

        For i As Integer = 0 To lines.GetUpperBound(0)
            values.Clear()
            values.AddRange(lines(i).Split(","))
            values.RemoveAt(77) ' Rmove 78th column
            lines(i) = String.Join(",", values.ToArray)
        Next

        For i As Integer = 0 To lines.GetUpperBound(0)
            values.Clear()
            values.AddRange(lines(i).Split(","))
            values.RemoveAt(85) ' Rmove 86th column
            lines(i) = String.Join(",", values.ToArray)
        Next

        System.IO.File.WriteAllLines(FileName, lines)
0
 
LVL 85

Accepted Solution

by:
Mike Tomlinson earned 500 total points
ID: 33484426
"I need to delete columns 100, 88, and 85 to make format look the same."

You can do it all in the same For Loop:
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim FileName As String = "c:\test.csv"

        Dim lines() As String = System.IO.File.ReadAllLines(FileName)

        Dim values As New List(Of String)

        For i As Integer = 0 To lines.GetUpperBound(0)

            values.Clear()

            values.AddRange(lines(i).Split(","))



            ' remove all unwanted columns in backwards order 

            ' so you don't have to adjust the column numbers as you go

            values.RemoveAt(99) ' Remove 100th column

            values.RemoveAt(87) ' Remove 88th column

            values.RemoveAt(84) ' Remove 85th column



            lines(i) = String.Join(",", values.ToArray)

        Next

        System.IO.File.WriteAllLines(FileName, lines)

    End Sub

Open in new window

0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 

Author Comment

by:learn_it
ID: 33487476
ok, I have figured out where my problem is.
Some of the fields have values like "Februare, 2010"
The script counts those commas as delimiters as well which is wrong.
Can we somehow 'skip' these particular cases. Otherwise, i was thinking about deleting these 3 columns via script in excel, I guess.
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 33487490
Are the commas for these values actually INSIDE quotes?
0
 

Author Comment

by:learn_it
ID: 33487587
yes
0
 

Author Comment

by:learn_it
ID: 33487663
you can look at test.csv
if you scroll to the right enough, you will see a value of FEBRUARY, 2011 in the cell(in excel)
in notepad it will look like this:
"something","something","something","February, 2011","sdfsdf"

0
 

Author Comment

by:learn_it
ID: 33487719
I am looking at the csv file in notepad again and it looks like there is no easy way to split it in text mode vs excel mode.

1)some EMPTY columns are defined are "","", the otthers are just ,,

2)I thought about using "," for a delimiter value, but cells with dates do not have "" so it won't count right either

3) If we go excel route and delete column CV, and the other two, then I am wondering how I could provision it for Excel 2007 AND 2003.
0
 
LVL 85

Expert Comment

by:Mike Tomlinson
ID: 33487994
Well...you can try going BACK to the TextFieldParser() approach and setting the HasFieldsEnclosedInQuotes() property to True:
http://msdn.microsoft.com/en-us/library/4y78s7xa(VS.80).aspx

    "Denotes whether fields are enclosed in quotation marks when parsing a delimited file."

Not sure it it will bomb since some of the fields do NOT have quotes!

Try it out and let us know how it goes:
    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

        Dim FileName As String = "c:\some\file.txt"

        Dim tfp As New Microsoft.VisualBasic.FileIO.TextFieldParser(FileName)

        tfp.TextFieldType = FileIO.FieldType.Delimited

        tfp.Delimiters = New String() {","}

        tfp.HasFieldsEnclosedInQuotes = True



        Dim lines As New List(Of String)

        Dim values As New List(Of String)

        While Not tfp.EndOfData

            values.Clear()

            values.AddRange(tfp.ReadFields)



            values.RemoveAt(99) ' Remove 100th column

            values.RemoveAt(87) ' Remove 88th column

            values.RemoveAt(84) ' Remove 85th column



            lines.Add(String.Join(",", values.ToArray))

        End While

        System.IO.File.WriteAllLines(FileName, lines.ToArray)

    End Sub

Open in new window

0
 

Author Comment

by:learn_it
ID: 33493462
Ended up dealing with it via Excel in invisible mode. delimiters don't matter then. Everything works just as expected.

Thanks to all for help and ideas.
---------------------------------------------------------

Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click

        Dim strFileName, objExcel, objWorkbook
        strFileName = "c:\test.csv"

        objExcel = CreateObject("Excel.Application")
        objExcel.DisplayAlerts = False
        objWorkbook = objExcel.Workbooks.Open(strFileName)
        objExcel.Visible = False

        objExcel.ActiveSheet.Columns("CW").Delete()
        objExcel.ActiveSheet.Columns("CJ").Delete()
        objExcel.ActiveSheet.Columns("CH").Delete()
        objExcel.ActiveWorkbook.Save()
        objExcel.Quit()
        MsgBox("File has been successfully processed!")


    End Sub
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Vbscript 8 79
Custom auto number 6 59
Run a .bat file wich is inside a .vbs file 8 24
Update Access FrontEnd by Version # 9 23
With most software applications trying to cater to multiple user needs nowadays, the focus is to make them as configurable as possible. For e.g., when creating Silverlight applications which will connect to WCF services, the service end point usuall…
Introduction During my participation as a VBScript contributor at Experts Exchange, one of the most common questions I come across is this: "I have a script that runs against only one computer. How can I make it run against a list of computers in …
This Micro Tutorial hows how you can integrate  Mac OSX to a Windows Active Directory Domain. Apple has made it easy to allow users to bind their macs to a windows domain with relative ease. The following video show how to bind OSX Mavericks to …
This is Part 3 in a 3-part series on Experts Exchange to discuss error handling in VBA code written for Excel. Part 1 of this series discussed basic error handling code using VBA. http://www.experts-exchange.com/videos/1478/Excel-Error-Handlin…

910 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now