Solved

Remove lines with Vb script

Posted on 2015-02-16
28
133 Views
Last Modified: 2015-02-17
Hello,

I have two csv files:
file1.csv with the following information:

1111;A8;5555
3455;B7;6778
5767;C3;6770
2222;A8;6666

file2.csv with the following information

C4;aaaa
A8;cccc
C3;dddd

I need a vbscript which reads both files and create a file1_revised.csv without lines which are not in column A of file2.

Final result of file1_revised.csv:
1111;A8;5555
5767;C3;6770
2222;A8;6666
0
Comment
Question by:LD16
  • 11
  • 6
  • 5
  • +1
28 Comments
 
LVL 45

Expert Comment

by:aikimark
ID: 40613086
Please test this:
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim oSM As Object
    Dim lngSM As Long

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
 
LVL 65

Expert Comment

by:RobSampson
ID: 40613219
Hi, here's a Powershell version, if you're interested:
$file1 = Import-CSV ".\test1.txt" -Header "Col1","Col2","Col3" -Delimiter ";"
$file2 = Import-CSV ".\test2.txt" -Header "Col2","Col3" -Delimiter ";"
$file1 | Where Col2 -In $file2.Col2 | ConvertTo-CSV -Delimiter ";" -NoTypeInformation | Select -Skip 1 | ForEach {$_ -replace '"', ''} | Out-File ".\test1_revised.txt"

Open in new window


Regards,

Rob.
0
 
LVL 51

Expert Comment

by:Bill Prew
ID: 40613392
Got an idea for a slightly different approach, hope I can get it to work...

~bp
0
 
LVL 51

Expert Comment

by:Bill Prew
ID: 40613444
Okay, here's a different VBS approach.  I used ADO to allow treating the CSV files as "databases tables", and then can do a JOIN of the column you wanted to match on.  Seems to work when I ran it here.

Save as VBS in the folder with the files, adjust file names and path as needed. Also store the INI file below in the same folder and it must be named "schema.ini", it defines the format and columns in each CSV file.

Let me know what questions you have...

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
 
' Define files to work with
strFileIn1 = "file1.csv"
strFileIn2 = "file2.csv"
strFileOut = "out.csv"
strFolder="B:\ee\EE28618048"
 
' Define needed objects 
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

' Set up ADO to access the CSV files and join on reference column
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFolder & ";Extended Properties='text;HDR=NO;FMT=Delimited'"
objRecordset.Open "SELECT * FROM " & strFileIn1 & ", " & strFileIn2 & " WHERE C21=C12", objConnection

' Open output file
Set objFile = objFSO.OpenTextFile(strFileOut, ForWriting, True)

' Read all records and reformat before writing to output file
Do Until objRecordset.EOF
   objFile.WriteLine objRecordset.Fields("C11").Value & ";" & _
   objRecordset.Fields("C12").Value & ";" & _
   objRecordset.Fields("C13").Value
   objRecordset.MoveNext
Loop

' Wrapup
objFile.Close
objRecordset.Close
objConnection.Close

Open in new window


[file1.csv]
Format=Delimited(;)
ColNameHeader=False
Col1=C11 Text
Col2=C12 Text
Col3=C13 Text

[file2.csv]
Format=Delimited(;)
ColNameHeader=False
Col1=C21 Text
Col2=C22 Text

Open in new window

~bp
0
 

Author Comment

by:LD16
ID: 40613769
Hello Bill,

I tried, your script however I have duplicate values on  the out.csv.
Please find bellow the result of out.csv:
1111;A8;5555
1111;A8;5555
2222;A8;6666
2222;A8;6666
5767;C3;6770

And the result should be the following:
1111;A8;5555
2222;A8;6666
5767;C3;6770

Thank you again for your help.
0
 

Author Comment

by:LD16
ID: 40613776
Hello Aikimar

I have a  Compilation error: Expected end of statement on line 16 caract 13. Thank you again for your help.
0
 
LVL 65

Assisted Solution

by:RobSampson
RobSampson earned 166 total points
ID: 40613935
Hi, if the Powershell script isn't going to suit your needs, here is my version of a VBS approach.

Regards,

Rob.

strFile1 = "C:\Temp\Scripts\EE\File1.txt"
strFile2 = "C:\Temp\Scripts\EE\File2.txt"
strOutput = "C:\Temp\Scripts\EE\File1_Revised.txt"

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile2Dct = CreateObject("Scripting.Dictionary")
objFile2Dct.CompareMode = vbTextCompare

' Create the output file
Set objOutput = objFSO.CreateTextFile(strOutput, True)

' First read file 2 to obtain the "key" values
Set objFile2 = objFSO.OpenTextFile(strFile2, 1, False)
While Not objFile2.AtEndOfStream
	strLine = objFile2.ReadLine
	strKey = Split(strLine, ";")(0)
	If objFile2Dct.Exists(strKey) = False Then objFile2Dct.Add strKey, 0
Wend
objFile2.Close

' Then read through the first file and see if the "key" matches those found from the second
Set objFile1 = objFSO.OpenTextFile(strFile1, 1, False)
While Not objFile1.AtEndOfStream
	strLine = objFile1.ReadLine
	strKey = Split(strLine, ";")(1)
	If objFile2Dct.Exists(strKey) = True Then objOutput.WriteLine strLine
Wend
objFile1.Close

' Close the output file
objOutput.Close

WScript.Echo "Done"

Open in new window

0
 
LVL 65

Expert Comment

by:RobSampson
ID: 40613937
For aikimark's to work, you need to remove the "As [Type]" parts of lines 17 through 21, and remove the first and last lines if you are using a pure text VBScript file.

Rob.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40613977
Oops.  I forgot to remove the data types in the definitions.  Please test this.
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE 
    Dim oMatches 
    Dim oM 

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
 

Author Comment

by:LD16
ID: 40614023
Hello,
Your vbs work prefectly. One question came across. How can include the headers of file 1
0
 

Author Comment

by:LD16
ID: 40614028
Hello Rob,
Your vbs work prefectly. One question came across. How can include the headers of file 1 in the File1_Revised?

Thank you again for your help.

Regards,
0
 
LVL 45

Accepted Solution

by:
aikimark earned 167 total points
ID: 40614046
This seems to work with headers.  Since your posted data did not include headers, I had to create some to test this revised code.
Sub Q_28618048()
    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Dim oFS, oTS
    Dim strFile1Data, strFile2Data
    Dim vParsedData, vParsedLine, vItem
    Dim dicUnique
    Set oFS = CreateObject("Scripting.FileSystemObject")
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1.csv", ForReading, False)
    strFile1Data = oTS.readall
    oTS.Close
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file2.csv", ForReading, False)
    strFile2Data = oTS.readall
    oTS.Close
    
    Set dicUnique = CreateObject("Scripting.dictionary")
    
    Dim oRE
    Dim oMatches
    Dim oM

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = "([^;]*);.*(?:\r\n|$)"

    If oRE.test(strFile2Data) Then
        Set oMatches = oRE.Execute(strFile2Data)
        For Each oM In oMatches
            With oM
                If dicUnique.exists(.submatches(0)) Then
                Else
                    dicUnique.Add .submatches(0), 1
                End If
            End With
        Next
    End If
    
    Set oTS = oFS.OpenTextFile("c:\users\aikimark\downloads\file1_revised.csv ", ForWriting, True)
    vParsedData = Split(strFile1Data, vbCrLf)
    oTS.writeline vParsedData(0)
    For Each vItem In vParsedData
        vParsedLine = Split(vItem, ";")
        If dicUnique.exists(vParsedLine(1)) Then
            oTS.writeline vItem
        End If
    Next
    oTS.Close
    
End Sub

Open in new window

0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 

Author Comment

by:LD16
ID: 40614080
Hello Aikimark,

I was not able to run your revised vbscript any error message however I putted a wcript.Echo and I was unable to display the echo.  I modified the path and create the csv files as mention in my last comment.
Additionnally I tried to launch the script without file1_revised.csv and with a blank version of file1_revised.csv.

Thank you again for your help.

Regards,
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40614238
Since my script is a routine, did you invoke the routine or just try to run the VBS as submitted?

Did you change the path in the string literals?
0
 

Author Comment

by:LD16
ID: 40614361
Hello,

I've just tried, to run the vbs.

I've changed just the path of oTS variable.

Could you please let me know how to invoke the routine?

Regards,
0
 

Author Comment

by:LD16
ID: 40614406
Hello Rob,

I have included a objOutput.WriteLine for the headers of the revised file.

Thank you very much for your help.

Regards,
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40614436
you just call the routine by name somewhere in your VBS script.

Optionally, you could remove the Sub and End Sub lines.
0
 
LVL 51

Assisted Solution

by:Bill Prew
Bill Prew earned 167 total points
ID: 40615227
Here's a small change to remove the duplicates in the approach I shared.

' Define needed constants
Const ForReading = 1
Const ForWriting = 2
Const TriStateUseDefault = -2
 
' Define files to work with
strFileIn1 = "file1.csv"
strFileIn2 = "file2.csv"
strFileOut = "out.csv"
strFolder="B:\ee\EE28618048"
 
' Define needed objects 
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")

' Set up ADO to access the CSV files and join on reference column
objConnection.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & strFolder & ";Extended Properties='text;HDR=NO;FMT=Delimited'"
objRecordset.Open "SELECT DISTINCT * FROM " & strFileIn1 & ", " & strFileIn2 & " WHERE C21=C12", objConnection

' Open output file
Set objFile = objFSO.OpenTextFile(strFileOut, ForWriting, True)

' Read all records and reformat before writing to output file
Do Until objRecordset.EOF
   objFile.WriteLine objRecordset.Fields("C11").Value & ";" & _
   objRecordset.Fields("C12").Value & ";" & _
   objRecordset.Fields("C13").Value
   objRecordset.MoveNext
Loop

' Wrapup
objFile.Close
objRecordset.Close
objConnection.Close

Open in new window

~bp
0
 

Author Comment

by:LD16
ID: 40615321
Hello,

Both scripts work properly. Thank you very much for help.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40615347
Both scripts work properly
If both scripts worked, why did you choose only one as the solution?
0
 

Author Comment

by:LD16
ID: 40615360
Good question, I tried also to choose your solution but I don't know if there is a way to select many accepted solutions, do you know?
0
 
LVL 65

Expert Comment

by:RobSampson
ID: 40615371
You can select multiple solutions.  I will re-open the question for you to re-close.

Rob.
0
 

Author Comment

by:LD16
ID: 40615376
Aikimar, if you are interested, I have opened another question:
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_28618822.html
0
 
LVL 65

Expert Comment

by:RobSampson
ID: 40615392
The selected answers don't necessarily reflect the correct scripts.  Would you like me to re-open it for you to double-check?
0
 

Author Comment

by:LD16
ID: 40615408
Yes, sorry for this. Thank you.
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Not long ago I saw a question in the VB Script forum that I thought would not take much time. You can read that question (Question ID  (http://www.experts-exchange.com/Programming/Languages/Visual_Basic/VB_Script/Q_28455246.html)28455246) Here (http…
I'd like to talk about something that is near and dear to my heart: build systems. Without them, building software is all about compiling locally, with software versions everywhere. It can be a mess. Today we are going to discuss building a small di…
This tutorial will introduce the viewer to VisualVM for the Java platform application. This video explains an example program and covers the Overview, Monitor, and Heap Dump tabs.
This tutorial explains how to use the VisualVM tool for the Java platform application. This video goes into detail on the Threads, Sampler, and Profiler tabs.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now