VBScipt to remove lines with zero value

I have a .csv file.  In column 14 of the file many of the lines have a value of "0".  I need to script to look at each line and if the value in column 14 on that line is 0, delete it then write the remaining lines to a new file.
vpnsol123Asked:
Who is Participating?
 
Patrick MatthewsConnect With a Mentor Commented:
vpnsol123,

This one was trickier than I thought :)

This seems to be working now--the RegExp was not quite right before.




Dim fso, tsIn, tsOut, TheLine, test, LineArr, RegX, OutputCols, OutputCol, NewLine, Mats

OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)

Set RegX = New RegExp
With RegX
    .Pattern = "(,|^)(""[^,]+)(,)([^""]+"")(,|$)"
    .Global = False
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\test.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    test = TheLine
    Do
        test = RegX.Replace(test, "$1$2 $4$5")
        Set Mats = RegX.Execute(test)
        If Mats.Count = 0 Then Exit Do
    Loop
    LineArr = Split(test, ",")
    If CDbl(LineArr(12)) <> 0 Then
        NewLine = ""
        For OutputCol = LBound(OutputCols) To UBound(OutputCols)
            NewLine = NewLine & "," & LineArr(OutputCols(OutputCol))
        Next
        tsOut.WriteLine Mid(NewLine, 2)
    End If
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegX = Nothing

MsgBox "Done"





Patrick
0
 
Mike TomlinsonMiddle School Assistant TeacherCommented:
"....if the value in column 14 on that line is 0, delete it then write the remaining lines to a new file."

Could please clarify...

Do you want the original file to remain UNTOUCHED or not?

The new file should only have lines where Column 14 is NOT zero?

Also, do any of your values have a comma INSIDE the quotation marks?...or can we safely Split() on comma?
0
 
vpnsol123Author Commented:
It is preferable that the originial file remain in tact and a new file be created for the non-zero value lines.

Also, yes we can safely split on comma.

Thanks
0
Cloud Class® Course: CompTIA Healthcare IT Tech

This course will help prep you to earn the CompTIA Healthcare IT Technician certification showing that you have the knowledge and skills needed to succeed in installing, managing, and troubleshooting IT systems in medical and clinical settings.

 
Patrick MatthewsCommented:
Dim fso, tsIn, tsOut, TheLine, LineArr

Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    LineArr = Split(TheLine, ",")
    If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing

MsgBox "Done"
0
 
Patrick MatthewsCommented:
If we had to be wary of fields that included commas, assuming doublequote is the text qualifier...



Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX

Set RegX = New RegExp
RegX.Pattern = """[^""]*"""
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = RegX.Replace(TheLine, "")
    LineArr = Split(Test, ",")
    If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing

MsgBox "Done"
0
 
Mike TomlinsonMiddle School Assistant TeacherCommented:
Looks like matthewspatrick has already given you a couple of great answers...  =)
0
 
Patrick MatthewsCommented:
Need a small tweak to the "have to be wary of internal commas" version, as the Global property for RegExp
is False by default:




Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX

Set RegX = New RegExp
With RegX
    .Pattern = """[^""]*"""
    .Global = True
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = RegX.Replace(TheLine, "")
    LineArr = Split(Test, ",")
    If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing

MsgBox "Done"
0
 
vpnsol123Author Commented:
This is perfect.  Now to complicate it.  I am finding that the program that I am importing this into is a bit complicated.  The above logic works perfectly.  Is there an ability to have the VBScript output only fields pecified in the script.  In other words, if I only want to output columns 5,7,9,11,13 and 15 into the output file, how would I do that?

Thank you again.
0
 
Patrick MatthewsCommented:
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine

OutputCols = Array(5, 7, 9, 11, 13, 15)

Set RegX = New RegExp
With RegX
    .Pattern = """[^""]*"""
    .Global = True
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = RegX.Replace(TheLine, "")
    LineArr = Split(Test, ",")
    If LineArr(13) <> 0 Then
        NewLine = ""
        For Each OutputCol In OutputCols
            NewLine = NewLine & "," & LineArr(OutputCol)
        Next
        tsOut.WriteLine Mid(NewLine, 2)
    End If
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing

MsgBox "Done"
0
 
vpnsol123Author Commented:
This works very well.  The only problem that I am having now is that column 31 contains a comma in it.  It appeared as though the logic was there to split that cell but that column is actually messing up my output.  Any thoughts?
0
 
vpnsol123Author Commented:
Ok...I see the problem.  The script in its current form is searching for the comma within a column and then blanking out that column.   What I need it to do it to split the column into 2 and then to output those values or to just replace the comma with a space.
0
 
Patrick MatthewsCommented:
Try this revised code.  It tries to replace internal commas with spaces.




Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats

OutputCols = Array(5, 7, 9, 11, 13, 15)

Set RegX = New RegExp
With RegX
    .Pattern = """[^,]*,.*"""
    .Global = False
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = TheLine
    Do
        Test = RegX.Replace(Test, " ")
        Set Mats = RegX.Execute(Test)
        If Mats.Count = 0 Then Exit Do
    Loop
    LineArr = Split(Test, ",")
    If LineArr(13) <> 0 Then
        NewLine = ""
        For Each OutputCol In OutputCols
            NewLine = NewLine & "," & LineArr(OutputCol)
        Next
        tsOut.WriteLine Mid(NewLine, 2)
    End If
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing

MsgBox "Done"
0
 
vpnsol123Author Commented:
That seems to do the trick with the comma's but it also seems to have broken the part of the code that deletes the lines with a value of 0 in column 14.

Thanks
0
 
vpnsol123Author Commented:
I am getting subscript out of rage:'OutputCol' when I run this.
My output columns look like this:Array(7, 8, 12, 13, 15, 25, 31, 33)
0
 
Patrick MatthewsCommented:
vpnsol123,

It would be useful to see some sample data and/or a sample file.  EE now allows you to directly upload files
to your question.

Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible
to fully and permanently delete it.  Therefore, be very careful about posting proprietary, confidential, or
other sensitive information.  If necessary, use "fake" and/or obfuscated data in your sample.

Please note that at present EE restricts uploads to certain file types.  If your file type does not match
those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run
by people connected to EE.

Patrick
0
 
vpnsol123Author Commented:
Below is a single sample record.  I can not upload a file.  It is patient sensitive so I creasted a sample..  You will see that the data is separated by commas.  The comma between the words Release and final is the one that I am trying to remove.

Outpatient Technical,N,Y,3127,11/06/2009 17:41:00,9,DG09-75588,Smith John,,,mim4681961,11/05/2009 04:24:00,00000,ANP31100068,1,,,N,N,11/06/2009 17:42:07:246,11/06/2009 17:42:15,Sent,1,Outpatient,3019681166,,352603735,Medical Center,CYTO THIN PREP SCREEN 88142/31100068,,"Release,Final",mim55071,"Jones, John. (021675)"
0
 
Patrick MatthewsCommented:
vpnsol123,

OK, there were a few syntax errors there, but this seems to work.  Some notes:

1) The lower bound for all arrays in VBScript is zero.  Thus, the 33rd column is referred to as 32
2) On that basis, I changed the reference to the 13th column to 12 for testing the <>0 condition



Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats

OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)

Set RegX = New RegExp
With RegX
    .Pattern = """[^,]*,.*"""
    .Global = False
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\test.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = TheLine
    Do
        Test = RegX.Replace(Test, " ")
        Set Mats = RegX.Execute(Test)
        If Mats.Count = 0 Then Exit Do
    Loop
    LineArr = Split(Test, ",")
    If CDbl(LineArr(12)) <> 0 Then
        NewLine = ""
        For OutputCol = LBound(OutputCols) To UBound(OutputCols)
            NewLine = NewLine & "," & LineArr(OutputCol)
        Next
        tsOut.WriteLine Mid(NewLine, 2)
    End If
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing

MsgBox "Done"


Patrick
0
 
vpnsol123Author Commented:
I see what you changed here.  

For OutputCol = LBound(OutputCols) To UBound(OutputCols)

But in this it is taking columns 0 through 6

Is that because 0 is the LBound and 6 is the UBound?
0
 
Patrick MatthewsCommented:
Sorry, dumb error on my part.




Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats

OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)

Set RegX = New RegExp
With RegX
    .Pattern = """[^,]*,.*"""
    .Global = False
End With
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\test.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    Test = TheLine
    Do
        Test = RegX.Replace(Test, " ")
        Set Mats = RegX.Execute(Test)
        If Mats.Count = 0 Then Exit Do
    Loop
    LineArr = Split(Test, ",")
    If CDbl(LineArr(12)) <> 0 Then
        NewLine = ""
        For OutputCol = LBound(OutputCols) To UBound(OutputCols)
            NewLine = NewLine & "," & LineArr(OutputCols(OutputCol))
        Next
        tsOut.WriteLine Mid(NewLine, 2)
    End If
Loop

tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing

MsgBox "Done"
0
 
vpnsol123Author Commented:
Sorry to be a pain.  Same Subscript  Out Of Range Error.
0
 
Patrick MatthewsCommented:
vpnsol123,

Any feedback?

Patrick
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.