vpnsol123
asked on
VBScipt to remove lines with zero value
I have a .csv file. In column 14 of the file many of the lines have a value of "0". I need to script to look at each line and if the value in column 14 on that line is 0, delete it then write the remaining lines to a new file.
ASKER
It is preferable that the originial file remain in tact and a new file be created for the non-zero value lines.
Also, yes we can safely split on comma.
Thanks
Also, yes we can safely split on comma.
Thanks
Dim fso, tsIn, tsOut, TheLine, LineArr
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
LineArr = Split(TheLine, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
MsgBox "Done"
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
LineArr = Split(TheLine, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
MsgBox "Done"
If we had to be wary of fields that included commas, assuming doublequote is the text qualifier...
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX
Set RegX = New RegExp
RegX.Pattern = """[^""]*"""
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX
Set RegX = New RegExp
RegX.Pattern = """[^""]*"""
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
Looks like matthewspatrick has already given you a couple of great answers... =)
Need a small tweak to the "have to be wary of internal commas" version, as the Global property for RegExp
is False by default:
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX
Set RegX = New RegExp
With RegX
.Pattern = """[^""]*"""
.Global = True
End With
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
is False by default:
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX
Set RegX = New RegExp
With RegX
.Pattern = """[^""]*"""
.Global = True
End With
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then tsOut.WriteLine TheLine
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
ASKER
This is perfect. Now to complicate it. I am finding that the program that I am importing this into is a bit complicated. The above logic works perfectly. Is there an ability to have the VBScript output only fields pecified in the script. In other words, if I only want to output columns 5,7,9,11,13 and 15 into the output file, how would I do that?
Thank you again.
Thank you again.
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine
OutputCols = Array(5, 7, 9, 11, 13, 15)
Set RegX = New RegExp
With RegX
.Pattern = """[^""]*"""
.Global = True
End With
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then
NewLine = ""
For Each OutputCol In OutputCols
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
OutputCols = Array(5, 7, 9, 11, 13, 15)
Set RegX = New RegExp
With RegX
.Pattern = """[^""]*"""
.Global = True
End With
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = RegX.Replace(TheLine, "")
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then
NewLine = ""
For Each OutputCol In OutputCols
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set RegExp = Nothing
MsgBox "Done"
ASKER
This works very well. The only problem that I am having now is that column 31 contains a comma in it. It appeared as though the logic was there to split that cell but that column is actually messing up my output. Any thoughts?
ASKER
Ok...I see the problem. The script in its current form is searching for the comma within a column and then blanking out that column. What I need it to do it to split the column into 2 and then to output those values or to just replace the comma with a space.
Try this revised code. It tries to replace internal commas with spaces.
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(5, 7, 9, 11, 13, 15)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then
NewLine = ""
For Each OutputCol In OutputCols
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(5, 7, 9, 11, 13, 15)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\the file.csv")
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If LineArr(13) <> 0 Then
NewLine = ""
For Each OutputCol In OutputCols
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
ASKER
That seems to do the trick with the comma's but it also seems to have broken the part of the code that deletes the lines with a value of 0 in column 14.
Thanks
Thanks
ASKER
I am getting subscript out of rage:'OutputCol' when I run this.
My output columns look like this:Array(7, 8, 12, 13, 15, 25, 31, 33)
My output columns look like this:Array(7, 8, 12, 13, 15, 25, 31, 33)
vpnsol123,
It would be useful to see some sample data and/or a sample file. EE now allows you to directly upload files
to your question.
Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible
to fully and permanently delete it. Therefore, be very careful about posting proprietary, confidential, or
other sensitive information. If necessary, use "fake" and/or obfuscated data in your sample.
Please note that at present EE restricts uploads to certain file types. If your file type does not match
those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run
by people connected to EE.
Patrick
It would be useful to see some sample data and/or a sample file. EE now allows you to directly upload files
to your question.
Please be advised that once you upload a file, it can be publicly accessed, and that it may not be possible
to fully and permanently delete it. Therefore, be very careful about posting proprietary, confidential, or
other sensitive information. If necessary, use "fake" and/or obfuscated data in your sample.
Please note that at present EE restricts uploads to certain file types. If your file type does not match
those in the list, you can use http://www.ee-stuff.com instead, which is not officially an EE site, but is run
by people connected to EE.
Patrick
ASKER
Below is a single sample record. I can not upload a file. It is patient sensitive so I creasted a sample.. You will see that the data is separated by commas. The comma between the words Release and final is the one that I am trying to remove.
Outpatient Technical,N,Y,3127,11/06/2 009 17:41:00,9,DG09-75588,Smit h John,,,mim4681961,11/05/20 09 04:24:00,00000,ANP31100068 ,1,,,N,N,1 1/06/2009 17:42:07:246,11/06/2009 17:42:15,Sent,1,Outpatient ,301968116 6,,3526037 35,Medical Center,CYTO THIN PREP SCREEN 88142/31100068,,"Release,F inal",mim5 5071,"Jone s, John. (021675)"
Outpatient Technical,N,Y,3127,11/06/2
vpnsol123,
OK, there were a few syntax errors there, but this seems to work. Some notes:
1) The lower bound for all arrays in VBScript is zero. Thus, the 33rd column is referred to as 32
2) On that basis, I changed the reference to the 13th column to 12 for testing the <>0 condition
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\test. csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If CDbl(LineArr(12)) <> 0 Then
NewLine = ""
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
Patrick
OK, there were a few syntax errors there, but this seems to work. Some notes:
1) The lower bound for all arrays in VBScript is zero. Thus, the 33rd column is referred to as 32
2) On that basis, I changed the reference to the 13th column to 12 for testing the <>0 condition
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\test.
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If CDbl(LineArr(12)) <> 0 Then
NewLine = ""
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
NewLine = NewLine & "," & LineArr(OutputCol)
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
Patrick
ASKER
I see what you changed here.
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
But in this it is taking columns 0 through 6
Is that because 0 is the LBound and 6 is the UBound?
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
But in this it is taking columns 0 through 6
Is that because 0 is the LBound and 6 is the UBound?
Sorry, dumb error on my part.
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi leSystemOb ject")
Set tsIn = fso.OpenTextFile("c:\test. csv")
Set tsOut = fso.CreateTextFile("c:\new file.csv", True)
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If CDbl(LineArr(12)) <> 0 Then
NewLine = ""
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
NewLine = NewLine & "," & LineArr(OutputCols(OutputC ol))
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
Dim fso, tsIn, tsOut, TheLine, Test, LineArr, RegX, OutputCols, OutCol, NewLine, Mats
OutputCols = Array(7, 8, 12, 13, 15, 25, 31, 32)
Set RegX = New RegExp
With RegX
.Pattern = """[^,]*,.*"""
.Global = False
End With
Set fso = CreateObject("Scripting.Fi
Set tsIn = fso.OpenTextFile("c:\test.
Set tsOut = fso.CreateTextFile("c:\new
Do Until tsIn.AtEndOfStream
TheLine = tsIn.ReadLine
Test = TheLine
Do
Test = RegX.Replace(Test, " ")
Set Mats = RegX.Execute(Test)
If Mats.Count = 0 Then Exit Do
Loop
LineArr = Split(Test, ",")
If CDbl(LineArr(12)) <> 0 Then
NewLine = ""
For OutputCol = LBound(OutputCols) To UBound(OutputCols)
NewLine = NewLine & "," & LineArr(OutputCols(OutputC
Next
tsOut.WriteLine Mid(NewLine, 2)
End If
Loop
tsIn.Close
Set tsIn = Nothing
tsOut.Close
Set tsOut = Nothing
Set fso = Nothing
Set Mats = Nothing
Set RegExp = Nothing
MsgBox "Done"
ASKER
Sorry to be a pain. Same Subscript Out Of Range Error.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
vpnsol123,
Any feedback?
Patrick
Any feedback?
Patrick
Could please clarify...
Do you want the original file to remain UNTOUCHED or not?
The new file should only have lines where Column 14 is NOT zero?
Also, do any of your values have a comma INSIDE the quotation marks?...or can we safely Split() on comma?