Link to home
Start Free TrialLog in
Avatar of elwayisgod
elwayisgodFlag for United States of America

asked on

VB Script - Find and Replace in pipe delimited file

Trying to get our source files in best possible shape so they load to database faster.  I have a text file that is pipe delimited.  It has 14 fields.  First 4 are metadata and last 10 are data. Here's what I need.

For each field:

Field 1 - Nothing - Stays as is
Field 2 - Change 'Fiscal Rcpt' to 'Fiscal' and 'Fiscal Sales' to 'Fiscal'.  So 'Fiscal Sales Aug' would  be 'Fiscal Aug' and so on.
Field 3 - Nothing - Stays as is
Field 4 - Change '2012' to '2012_New'
Field 5 to Field 14 - Change '0' to #Missing and/or '0.00' to #Missing.  These have to be exact matches only.  So 50 cant become 5#Missing.  Only '0' and '0.00' can be changed to #Missing.

This has to run as fast as possible as this is whole point as we are bumping up against SLA already.  I have a bunch more of these so some way where I can change a few things around for a different file would be helpful too.

So a way where I can identify the filename to apply this to and then the resulting file would just be prefixed with 'LD_'.   That would be great.

I appreciate all help you gurus have been providing.
Avatar of elwayisgod
elwayisgod
Flag of United States of America image

ASKER

Here's an example of before and after files.
Before.txt
After.txt
Avatar of Patrick Matthews
Try this.  It's VBScript.



Dim fso, tsIn, tsOut, arr, TheLine, Counter

Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("c:\oldfile.txt")
Set tsOut = fso.CreateTextFile("c:\newfile.txt", True)

tsOut.WriteLine tsIn.Readline

Do Until tsIn.AtEndOfStream
    TheLine = tsIn.ReadLine
    If TheLine <> "" Then
        arr = Split(TheLine, "|")
        arr(1) = Replace(Replace(arr(1), "Fiscal Rcpt", "Fiscal"), "Fiscal Sales", "Fiscal")
        If arr(3) = "2012" Then arr(3) = "2012_New"
        For Counter = 4 To 13
            If arr(Counter) = "0" Or arr(Counter) = "0.00" Then arr(Counter) = "#Missing"
        Next
        tsOut.WriteLine Join(arr, "|")
    Else
        tsOut.WriteLine
    End If
Loop

tsIn.Close
tsOut.Close
Set tsIn = Nothing
Set tsOut = Nothing
Set fso = Nothing

Open in new window

This looks really good.  You want to talk offline about side work.  Have a many (10) more.
took the long way around but added a dialog box and the ability to configure each field


Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set ObjCDG = CreateObject("UserAccounts.CommonDialog")

ObjCDG.Filter = "Text Documents|*.txt|All Files|*.*"
ObjCDG.FilterIndex = 3
ObjCDG.InitialDir = "C:\temp\Test\"
InitCDG = ObjCDG.ShowOpen

If InitCDG = False Then
    Wscript.Echo "Script Error: Please select a file!"
    Wscript.Quit
End If

Set objFileIN = objFSO.OpenTextFile(ObjCDG.FileName, ForReading)
Set objFileOUT = objFSO.OpenTextFile(ObjCDG.InitialDir & "LD_" & objFSO.GetFileName(ObjCDG.FileName), ForWriting, True)
	

Do Until objFileIN.AtEndOfStream
	strText = objFileIN.ReadLine
	aData = Split(strText,"|")

	' mdse_sty   Field 1 - Nothing - Stays as Is
	strNewText = aData(0) & "|"
	
	' fis_sls_mo  Field 2 - Change 'Fiscal Rcpt' to 'Fiscal' and 'Fiscal Sales' to 'Fiscal'.
	If instr(1, adata(1), "Fiscal Rcpt") > 0 Then
		strNewText = strNewText & Replace(adata(1), "Fiscal Rcpt", "Fiscal") & "|"
	ElseIf instr(1, adata(1), "Fiscal Sales") > 0 Then
		strNewText = strNewText & Replace(adata(1), "Fiscal Sales", "Fiscal") & "|"	
	End If	
		
	' mdse_typ   Field 3 - Nothing - Stays as Is
	strNewText = strNewText & aData(2) & "|"
	
	' fis_sls_yr   Field 4 - Change '2012' to '2012_New'
	If adata(3)= "2012" Then
		strNewText = strNewText & Replace(adata(3), "2012", "2012_New") & "|"
	Else	
		strNewText = strNewText & aData(3) & "|"
	End If
	
	'Field 5 to Field 14 - Change '0' to #Missing and/or '0.00' to #Missing. 
	' unit_sls_ttl
	If adata(4) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(4) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(4) & "|"
	End If	
	
	' rtl_sls_ttl_local
	If adata(5) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(5) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(5) & "|"
	End If	
		
	' unit_sls_regpromo
	If adata(6) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(6) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(6) & "|"
	End If	
		
	' rtl_sls_regpromo_local
	If adata(7) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(7) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(7) & "|"
	End If	
		
	' unit_sls_mkd
	If adata(8) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(8) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(8) & "|"
	End If	
		
	' rtl_sls_mkd_local
	If adata(9) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(9) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(9) & "|"
	End If	
		
	' cst_sls_ttl_local
	If adata(10) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(10) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(10) & "|"
	End If	
		
	' cst_sls_regpromo_local
	If adata(11) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(11) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(11) & "|"
	End If	
		
	' cst_sls_mkd_local
	If adata(12) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(12) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(12) & "|"
	End If	
		
	' rec_order
	If adata(13) = "0" Then
		strNewText = strNewText & "#Missing" & "|"
	ElseIf adata(13) = "0.00" Then
		strNewText = strNewText & "#Missing" & "|"
	Else
		strNewText = strNewText & adata(13) & "|"
	End If	
		
	
	objFileOUT.WriteLine strNewText

Loop

objFileIN.Close
objFileOUT.Close

Open in new window

In the same script is it possible to add this logic too, but the output file is a different name.  Field 2 and 4 slightly different.  I can make it a different question too.

Source File is same file:

Field 1 - Nothing - Stays as is
Field 2 - Change 'Fiscal Sales Dec' to 'LY Fiscal Dec' and 'Fiscal Sales Jan' to 'LY Fiscal Jan'.  
Field 3 - Nothing - Stays as is
Field 4 - Increase each value by 1 except 2012. Have to start at 2018 and work way back too, I think :)

2018 is 2019
2017 is 2018
2016 is 2017
2015 is 2016
2014 is 2015
2013 is 2014
2012 is 2013_New
2011 is 2012
2010 is 2011
2009 is 2010
2008 is 2009
2007 is 2008
2006 is 2007
2005 is 2006
2004 is 2005
2003 is 2004
 

Field 5 to Field 14 - Change '0' to #Missing and/or '0.00' to #Missing.  These have to be exact matches only.  So 50 cant become 5#Missing.  Only '0' and '0.00' can be changed to #Missing.
Shorvath.  This has to run without any input from a user.  It will be a scheduled task.
OK, so how will the original files be remove from the folder?  Sholud the script put the 'LD_' prefix in front of the new file and delete the old one?
ASKER CERTIFIED SOLUTION
Avatar of Patrick Matthews
Patrick Matthews
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
no.  Original files stay there.  Scripts just create two new files. in same directory.  Mathews script is very fast and I like it alot.  Just needs tweaked a tad
try this

Const ForReading = 1
Const ForWriting = 2

Set objFSO = CreateObject("Scripting.FileSystemObject")


sFolder = Wscript.Arguments.Item(0)
If sFolder = "" Then
    Wscript.Echo "No Folder parameter was passed"
    Wscript.Quit
End If      
Set folder = objFSO.GetFolder(sFolder)
Set files = folder.Files
 
 

 
For each folderIdx In files

 
      Set objFileIN = objFSO.OpenTextFile(sFolder & objFSO.GetFileName(folderIdx.Name), ForReading)
      Set objFileOUT = objFSO.OpenTextFile(sFolder & "LD_" & objFSO.GetFileName(folderIdx.Name), ForWriting, True)
            
      
      Do Until objFileIN.AtEndOfStream
            strText = objFileIN.ReadLine
            aData = Split(strText,"|")
      
            ' mdse_sty   Field 1 - Nothing - Stays as Is
            strNewText = aData(0) & "|"
            
            ' fis_sls_mo  Field 2 - Change 'Fiscal Rcpt' to 'Fiscal' and 'Fiscal Sales' to 'Fiscal'.
            If instr(1, adata(1), "Fiscal Rcpt") > 0 Then
                  strNewText = strNewText & Replace(adata(1), "Fiscal Rcpt", "Fiscal") & "|"
            ElseIf instr(1, adata(1), "Fiscal Sales") > 0 Then
                  strNewText = strNewText & Replace(adata(1), "Fiscal Sales", "Fiscal") & "|"      
            ElseIf instr(1, adata(1), "Fiscal Sales Dec") > 0 Then
                  strNewText = strNewText & Replace(adata(1), "LY Fiscal Dec", "Fiscal") & "|"      
            ElseIf instr(1, adata(1), "Fiscal Sales Jan") > 0 Then
                  strNewText = strNewText & Replace(adata(1), "LY Fiscal Jan", "Fiscal") & "|"      
            Else
                  strNewText = strNewText & adata(1) & "|"      
            End If      
                  
            ' mdse_typ   Field 3 - Nothing - Stays as Is
            strNewText = strNewText & aData(2) & "|"
            
            ' fis_sls_yr   Field 4 - Change '2012' to '2012_New'
            If IsNumeric(adata(3)) = True Then
                  If adata(3)= "2012" Then
                        strNewText = strNewText & Replace(adata(3), "2012", "2012_New") & "|"
                  Else      
                        strNewText = strNewText & cstr( cint(aData(3) + 1)) & "|"
                  End If
            Else
                  strNewText = strNewText & aData(3) & "|"
            End If
            
            'Field 5 to Field 14 - Change '0' to #Missing and/or '0.00' to #Missing.
            ' unit_sls_ttl
            If adata(4) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(4) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(4) & "|"
            End If      
            
            ' rtl_sls_ttl_local
            If adata(5) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(5) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(5) & "|"
            End If      
                  
            ' unit_sls_regpromo
            If adata(6) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(6) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(6) & "|"
            End If      
                  
            ' rtl_sls_regpromo_local
            If adata(7) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(7) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(7) & "|"
            End If      
                  
            ' unit_sls_mkd
            If adata(8) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(8) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(8) & "|"
            End If      
                  
            ' rtl_sls_mkd_local
            If adata(9) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(9) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(9) & "|"
            End If      
                  
            ' cst_sls_ttl_local
            If adata(10) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(10) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(10) & "|"
            End If      
                  
            ' cst_sls_regpromo_local
            If adata(11) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(11) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(11) & "|"
            End If      
                  
            ' cst_sls_mkd_local
            If adata(12) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(12) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(12) & "|"
            End If      
                  
            ' rec_order
            If adata(13) = "0" Then
                  strNewText = strNewText & "#Missing" & "|"
            ElseIf adata(13) = "0.00" Then
                  strNewText = strNewText & "#Missing" & "|"
            Else
                  strNewText = strNewText & adata(13) & "|"
            End If      
                  
            
            objFileOUT.WriteLine strNewText
      
      Loop
      
      objFileIN.Close
      objFileOUT.Close
      
      'objFSO.DeleteFile(folderIdx.Name, True
Next
MathewPatrick,

This is looking really sweet.  Only two things left.. Sorry, as I forgot about these.

1 - There is a possibility that the Year Field 4 is 'Other Yr'.  If that is found then it stays the same in both output files.
2 - In output file #2, I actually need each line that is '2013' to create two duplicate lines.  One with the Year increasing by 1 to '2014' and one duplicate with Year equal to '2013_New'.

Is that possible?
And on the when it's looking for a value to change can it not be case sensitive?  It didn't change 'Fiscal Rcpt Dec' when source file was 'Fiscal Rcpt DEC'.  Not sure how case will come in and whether it's consistent or not.  These are files with millions of rows so no way to check.