# Remove string in VB Script

Hello,

I have a test.csv file with the following information:

A1;B1;C1
1111;ID345;475758
4566;Z_ID4758;5768595
7568;z1_ID45;4657576
45676;zxxx_ID48;4657576

I need a Vb Script to remove extra characters of values of column B if its begin with z or Z. All values of column B need to start with the char "ID".
The len of values of column B can be different.

Here the final result of test2.csv
A1;B1;C1
1111;ID345;475758
4566;ID4758;5768595
7568;ID45;4657576
45676;ID48;4657576

Thank you in advance for your help.
LVL 1
###### Who is Participating?

Commented:
Oh, I see what you're saying.  Assuming that all of the keys have ID in them somewhere, this would strip anything before ID, and also add ID to those that do not have it:
strFile1 = "C:\Temp\Scripts\EE\File1_Revised.txt"
strOutput = "C:\Temp\Scripts\EE\File1_Revised_Filtered.txt"

' Create the output file
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)

' Read through the file to omit invalid rows
Set objFile1 = objFSO.OpenTextFile(strFile1, 1, False)
While Not objFile1.AtEndOfStream
arrValues = Split(strLine, ";")
If InStr(arrValues(1), "ID") > 0 Then
arrValues(1) = Mid(arrValues(1), InStr(arrValues(1), "ID"))
Else
arrValues(1) = "ID" & arrValues(1)
End If
objOutput.WriteLine Join(arrValues, ";")
Wend
objFile1.Close

' Close the output file
objOutput.Close

WScript.Echo "Done"


Rob.
0

Commented:
Hi there,

Give this VBS a try.

Regards,

Rob.

strFile1 = "C:\Temp\Scripts\EE\File1_Revised.txt"
strOutput = "C:\Temp\Scripts\EE\File1_Revised_Filtered.txt"

' Create the output file
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objOutput = objFSO.CreateTextFile(strOutput, True)

' Read through the file to omit invalid rows
Set objFile1 = objFSO.OpenTextFile(strFile1, 1, False)
While Not objFile1.AtEndOfStream
strKey = Split(strLine, ";")(1)
If Left(strKey, 2) = "ID" Then objOutput.WriteLine strLine
Wend
objFile1.Close

' Close the output file
objOutput.Close

WScript.Echo "Done"

0

Developer AnalystCommented:
here is a way using a RegEx.  this may enable you to change the rules slightly if needed.

Option Explicit

Dim objRE
Dim strInput
Dim colMatches
Dim objMatch
Dim arrInput
Dim strOutput
Dim i
Dim objFSO
Dim objTextFileOut
Dim objTextFileIn
Dim strNextLine

Set objRE = New RegExp
objRE.Global     = True
objRE.IgnoreCase = False
objRE.Pattern    = "(.*)(ID\d{1,})"

'A1;B1;C1
'1111;ID345;475758
'4566;Z_ID4758;5768595
'7568;z1_ID45;4657576
'45676;zxxx_ID48;4657576

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFileOut = objFSO.OpenTextFile ("c:\data\out.txt", ForWriting, True)

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFileIn = objFSO.OpenTextFile  ("c:\data\in.txt", ForReading)

Do Until objTextFileIn.AtEndOfStream

arrInput = Split(strNextLine,";")

Set colMatches = objRE.Execute(arrInput(1))

If colMatches.Count>0 then
For Each objMatch In colMatches
arrInput(1)= objmatch.Submatches(1)
Next
End If
strOutput = ""
For i = 0 To Ubound(arrInput)
strOutput = strOutput & arrInput(i) & ";"
Next

strOutput = Mid(strOutput,1,Len(strOutput)-1)

WScript.Echo strOutput
objTextFileOut.WriteLine(strOutput)

Loop

objTextFileIn.Close
objTextFileOut.Close

0

Author Commented:
Hello Rob,

I will test this tomorrow, however I see in your code

If Left(strKey, 2) = "ID" Then objOutput.WriteLine strLine

Are you totally dropping the lines which do not begin with ID?
This is ok, however I need to clean the values, which do not begin with ID and also WriteLine with the revised value which begin with ID.
0

Developer AnalystCommented:
Your specs state if column B starts with z or Z.

I'm guessing you have rows, that have an ID that do not start with Z?

example In

A1;B1;C1
1111;ID345;475758
4566;Z_ID4758;5768595
568;x1_ID45;46576
7568;z1_ID45;4657576
45676;zxxx_ID48;4657576


example out

A1;B1;C1
1111;ID345;475758
4566;ID4758;5768595
568;x1_ID45;46576
7568;ID45;4657576
45676;ID48;4657576


revised code to also follow the rule for z or Z

Option Explicit

Dim objRE
Dim strInput
Dim colMatches
Dim objMatch
Dim arrInput
Dim strOutput
Dim i
Dim objFSO
Dim objTextFileOut
Dim objTextFileIn
Dim strNextLine

Set objRE = New RegExp
objRE.Global     = True
objRE.IgnoreCase = False
objRE.Pattern    = "[z|Z](.*)(ID\d{1,})"

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFileOut = objFSO.OpenTextFile ("c:\data\out.txt", ForWriting, True)

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFileIn = objFSO.OpenTextFile  ("c:\data\in.txt", ForReading)

Do Until objTextFileIn.AtEndOfStream

arrInput = Split(strNextLine,";")

Set colMatches = objRE.Execute(arrInput(1))

If colMatches.Count>0 then
For Each objMatch In colMatches
arrInput(1)= objmatch.Submatches(1)
Next
End If
strOutput = ""
For i = 0 To Ubound(arrInput)
strOutput = strOutput & arrInput(i) & ";"
Next

strOutput = Mid(strOutput,1,Len(strOutput)-1)

WScript.Echo strOutput
objTextFileOut.WriteLine(strOutput)

Loop

objTextFileIn.Close
objTextFileOut.Close

0

Commented:
Here's a relatively strait forward REGEX replacement approach.

As always, run with one or two file parks:

cscript EE28618822.vbs in.csv out.csv

or

cscript EE28618822.vbs in.csv

the second overwrites the file with the changes.

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2

' Get input file name from command line parm, if 2 parms entered
' use second as new output file, else rewrite to input file
If (WScript.Arguments.Count > 0) Then
sInfile = WScript.Arguments(0)
Else
WScript.Echo "No filename specified."
WScript.Quit
End If
If (WScript.Arguments.Count > 1) Then
sOutfile = WScript.Arguments(1)
Else
sOutfile = sInfile
End If

' Create file system object
Set oFSO = CreateObject("Scripting.FileSystemObject")

' Read entire input file into a variable and close it
Set oInfile = oFSO.OpenTextFile(sInfile, ForReading, False, TriStateUseDefault)
oInfile.Close
Set oInfile = Nothing

' Replace desired strings
Set oRegExp = New RegExp
oRegExp.Global = True
oRegExp.IgnoreCase = False
oRegExp.Pattern = "(;)([zZ].*)(ID)"
sData = oRegExp.Replace(sData, "$1$3")

' Write file with any changes made
Set oOutfile = oFSO.OpenTextFile(sOutfile, ForWriting, True)
oOutfile.Write(sData)
oOutfile.Close
Set oOutfile = Nothing

' Cleanup and end
Set oFSO = Nothing
Wscript.Quit

~bp
0

Commented:
The simpler regex replacement pattern is this
;(.*?)(ID[^;]*);


Replace with
;\$2;

0

Commented:
@aikimark,

I always enjoy seeing alternate REGEX expressions, thanks for adding.

I thought the OP only wanted to edit that column if it started with z or Z though, wouldn't you need to add that?

~bp
0

Commented:
The subsequent comments led me to think that anything between the ; and the ID should be removed.  If I've misread/misinterpreted the requirement then ignore my comment.

Also, if you set the regexp object to ignore case, your patterns might be simpler.
0

Commented:
I guess another alternate could be this, although I lean toward soft coding the result string rather than recoding the search string surrounds, easier to maintain in the future, only one change.  A personal preference...

oRegExp.Pattern = ";([zZ].*)ID"
sData = oRegExp.Replace(sData, ";ID")

~bp
0

Commented:
@LD16,

Need some further clarification from you.  Do you only want to do this for the fields with a z or Z at the beginning, or any field that has extra characters before the "ID".

Also, if a field does not have either a z/Z or an ID at the beginning, are you saying those need an ID inserted?

~bp
0

Commented:
And this pattern would probably be more efficient, since it only replaces if there are one or more characters before the ID
;(.+?)(ID[^;]*);


Maybe I should test these two methods like I did in my most recent article.  However, performance probably isn't a big concern in this instance.
0

Commented:
The subsequent comments led me to think that anything between the ; and the ID should be removed.

That was the assumption I made as well, since the author stated that each field should start with ID.

Rob.
0

Author Commented:
@Bill,

I made a mistake in my spec all the values begins with z or Z so  "ID" insertion is not required.

Thank you
0

Commented:
Okay, great, let me know how you make out with my solution or others, hopefully we have some things that canget the job done.

~bp
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.