Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 176
  • Last Modified:

Remove lines with Vb script

Hello,

I have two csv files:
file1.csv with the following information:

1111;A8;5555
3455;B7;6778
5767;C3;6770
2222;A8;6666

file2.csv with the following information

C4;aaaa
A8;cccc
C3;dddd

I need a vbscript which reads both files and create a file1_revised.csv without lines which are not in column A of file2.

Final result of file1_revised.csv:
1111;A8;5555
5767;C3;6770
2222;A8;6666
0
LD16
Asked:
LD16
  • 11
  • 6
  • 5
  • +1
3 Solutions
 
aikimarkCommented:
Please test this:
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim oSM As Object
    Dim lngSM As Long

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
 
RobSampsonCommented:
Hi, here's a Powershell version, if you're interested:
$file1 = Import-CSV ".\test1.txt" -Header "Col1","Col2","Col3" -Delimiter ";"
$file2 = Import-CSV ".\test2.txt" -Header "Col2","Col3" -Delimiter ";"
$file1 | Where Col2 -In $file2.Col2 | ConvertTo-CSV -Delimiter ";" -NoTypeInformation | Select -Skip 1 | ForEach {$_ -replace '"', ''} | Out-File ".\test1_revised.txt"

Open in new window


Regards,

Rob.
0
 
Bill PrewCommented:
Got an idea for a slightly different approach, hope I can get it to work...

~bp
0
The Firewall Audit Checklist

Preparing for a firewall audit today is almost impossible.
AlgoSec, together with some of the largest global organizations and auditors, has created a checklist to follow when preparing for your firewall audit. Simplify risk mitigation while staying compliant all of the time!

 
Bill PrewCommented:
Okay, here's a different VBS approach.  I used ADO to allow treating the CSV files as "databases tables", and then can do a JOIN of the column you wanted to match on.  Seems to work when I ran it here.

Save as VBS in the folder with the files, adjust file names and path as needed. Also store the INI file below in the same folder and it must be named "schema.ini", it defines the format and columns in each CSV file.

Let me know what questions you have...

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
 
' Define files to work with
strFileIn1 = "file1.csv"
strFileIn2 = "file2.csv"
strFileOut = "out.csv"
strFolder="B:\ee\EE28618048"
 
' Define needed objects 
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

' Set up ADO to access the CSV files and join on reference column
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFolder & ";Extended Properties='text;HDR=NO;FMT=Delimited'"
objRecordset.Open "SELECT * FROM " & strFileIn1 & ", " & strFileIn2 & " WHERE C21=C12", objConnection

' Open output file
Set objFile = objFSO.OpenTextFile(strFileOut, ForWriting, True)

' Read all records and reformat before writing to output file
Do Until objRecordset.EOF
   objFile.WriteLine objRecordset.Fields("C11").Value & ";" & _
   objRecordset.Fields("C12").Value & ";" & _
   objRecordset.Fields("C13").Value
   objRecordset.MoveNext
Loop

' Wrapup
objFile.Close
objRecordset.Close
objConnection.Close

Open in new window


[file1.csv]
Format=Delimited(;)
ColNameHeader=False
Col1=C11 Text
Col2=C12 Text
Col3=C13 Text

[file2.csv]
Format=Delimited(;)
ColNameHeader=False
Col1=C21 Text
Col2=C22 Text

Open in new window

~bp
0
 
LD16Author Commented:
Hello Bill,

I tried, your script however I have duplicate values on  the out.csv.
Please find bellow the result of out.csv:
1111;A8;5555
1111;A8;5555
2222;A8;6666
2222;A8;6666
5767;C3;6770

And the result should be the following:
1111;A8;5555
2222;A8;6666
5767;C3;6770

Thank you again for your help.
0
 
LD16Author Commented:
Hello Aikimar

I have a  Compilation error: Expected end of statement on line 16 caract 13. Thank you again for your help.
0
 
RobSampsonCommented:
Hi, if the Powershell script isn't going to suit your needs, here is my version of a VBS approach.

Regards,

Rob.

strFile1 = "C:\Temp\Scripts\EE\File1.txt"
strFile2 = "C:\Temp\Scripts\EE\File2.txt"
strOutput = "C:\Temp\Scripts\EE\File1_Revised.txt"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile2Dct = CreateObject("Scripting.Dictionary")
objFile2Dct.CompareMode = vbTextCompare

' Create the output file
Set objOutput = objFSO.CreateTextFile(strOutput, True)

' First read file 2 to obtain the "key" values
Set objFile2 = objFSO.OpenTextFile(strFile2, 1, False)
While Not objFile2.AtEndOfStream
	strLine = objFile2.ReadLine
	strKey = Split(strLine, ";")(0)
	If objFile2Dct.Exists(strKey) = False Then objFile2Dct.Add strKey, 0
Wend
objFile2.Close

' Then read through the first file and see if the "key" matches those found from the second
Set objFile1 = objFSO.OpenTextFile(strFile1, 1, False)
While Not objFile1.AtEndOfStream
	strLine = objFile1.ReadLine
	strKey = Split(strLine, ";")(1)
	If objFile2Dct.Exists(strKey) = True Then objOutput.WriteLine strLine
Wend
objFile1.Close

' Close the output file
objOutput.Close

WScript.Echo "Done"

Open in new window

0
 
RobSampsonCommented:
For aikimark's to work, you need to remove the "As [Type]" parts of lines 17 through 21, and remove the first and last lines if you are using a pure text VBScript file.

Rob.
0
 
aikimarkCommented:
Oops.  I forgot to remove the data types in the definitions.  Please test this.
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE 
    Dim oMatches 
    Dim oM 

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
 
LD16Author Commented:
Hello,
Your vbs work prefectly. One question came across. How can include the headers of file 1
0
 
LD16Author Commented:
Hello Rob,
Your vbs work prefectly. One question came across. How can include the headers of file 1 in the File1_Revised?

Thank you again for your help.

Regards,
0
 
aikimarkCommented:
This seems to work with headers.  Since your posted data did not include headers, I had to create some to test this revised code.
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE
    Dim oMatches
    Dim oM

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    oTS.writeline vParsedData(0)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
 
LD16Author Commented:
Hello Aikimark,

I was not able to run your revised vbscript any error message however I putted a wcript.Echo and I was unable to display the echo.  I modified the path and create the csv files as mention in my last comment.
Additionnally I tried to launch the script without file1_revised.csv and with a blank version of file1_revised.csv.

Thank you again for your help.

Regards,
0
 
aikimarkCommented:
Since my script is a routine, did you invoke the routine or just try to run the VBS as submitted?

Did you change the path in the string literals?
0
 
LD16Author Commented:
Hello,

I've just tried, to run the vbs.

I've changed just the path of oTS variable.

Could you please let me know how to invoke the routine?

Regards,
0
 
LD16Author Commented:
Hello Rob,

I have included a objOutput.WriteLine for the headers of the revised file.

Thank you very much for your help.

Regards,
0
 
aikimarkCommented:
you just call the routine by name somewhere in your VBS script.

Optionally, you could remove the Sub and End Sub lines.
0
 
Bill PrewCommented:
Here's a small change to remove the duplicates in the approach I shared.

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
 
' Define files to work with
strFileIn1 = "file1.csv"
strFileIn2 = "file2.csv"
strFileOut = "out.csv"
strFolder="B:\ee\EE28618048"
 
' Define needed objects 
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

' Set up ADO to access the CSV files and join on reference column
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFolder & ";Extended Properties='text;HDR=NO;FMT=Delimited'"
objRecordset.Open "SELECT DISTINCT * FROM " & strFileIn1 & ", " & strFileIn2 & " WHERE C21=C12", objConnection

' Open output file
Set objFile = objFSO.OpenTextFile(strFileOut, ForWriting, True)

' Read all records and reformat before writing to output file
Do Until objRecordset.EOF
   objFile.WriteLine objRecordset.Fields("C11").Value & ";" & _
   objRecordset.Fields("C12").Value & ";" & _
   objRecordset.Fields("C13").Value
   objRecordset.MoveNext
Loop

' Wrapup
objFile.Close
objRecordset.Close
objConnection.Close

Open in new window

~bp
0
 
LD16Author Commented:
Hello,

Both scripts work properly. Thank you very much for help.
0
 
aikimarkCommented:
Both scripts work properly
If both scripts worked, why did you choose only one as the solution?
0
 
LD16Author Commented:
Good question, I tried also to choose your solution but I don't know if there is a way to select many accepted solutions, do you know?
0
 
RobSampsonCommented:
You can select multiple solutions.  I will re-open the question for you to re-close.

Rob.
0
 
LD16Author Commented:
Aikimar, if you are interested, I have opened another question:
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_28618822.html
0
 
RobSampsonCommented:
The selected answers don't necessarily reflect the correct scripts.  Would you like me to re-open it for you to double-check?
0
 
LD16Author Commented:
Yes, sorry for this. Thank you.
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 11
  • 6
  • 5
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now