totalimpact

VBscript to compare and merge log files

I am looking for a VBScript file to compare some log files. The files will have the same name, but one will be a more recent copy and will have more data. I want to take the two files and merge them into a third, "master" file. Here's a visual:

Log1 <> Log1New ---
New log entries go to the master file, and a marker line is added to show where the last record was entered.

My logs are CSV files using ; as the delimiter, and there is also a header. A sample of the CSV contents is shown below. I am thinking of comparing the second timestamp and the data after it, treated together as a unique key, for instance from my example:

03/04/11 12:51 PM;88.57
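
A minimal sketch of that key idea, assuming the second timestamp is the third field (index 2) and the value after it is the fourth (index 3), as in the sample line from the question. `RecordKey` is a hypothetical helper name used only for illustration.

```vbscript
Option Explicit

' Hypothetical helper: build a comparison key from one log line.
' Assumes ";" as the delimiter, with the second timestamp at index 2
' and the value after it at index 3, as in the sample line.
Function RecordKey(line)
    Dim fields
    fields = Split(line, ";")
    If UBound(fields) >= 3 Then
        RecordKey = fields(2) & ";" & fields(3)
    Else
        RecordKey = ""   ' blank or malformed line: no usable key
    End If
End Function

Dim sample
sample = """Orig1"";03/04/11 12:50 PM;03/04/11 12:51 PM;88.57;90;6;6;0;0"
WScript.Echo RecordKey(sample)   ' prints 03/04/11 12:51 PM;88.57
```

Run it with `cscript` so the output goes to the console rather than a message box.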

Any ideas? Sample line from one of the logs:
"Orig1";03/04/11 12:50 PM;03/04/11 12:51 PM;88.57;90;6;6;0;0;"null";"184.191.122.250";"ergqwe3e";"OH";"4r2t3r333";;"CA";;;0


prashanthd

Can we not just check by last modified time of the log files?
Bill Prew

Will one file always be a subset of the other, or could there be unique records in both files that need to be merged into the combined result?

~bp
Are you looking for something like this? It compares the lines of the two files like a merge sort. You can change the messages to suit your situation: identical lines could produce no message, a new line in file 1 might never happen, and a new line in file 2 would be what you are looking for.

Option Explicit

Dim FSO, File1, File2, Mast
Set FSO = CreateObject("Scripting.FileSystemObject")
Set File1 = FSO.OpenTextFile("input1.csv", 1)
Set File2 = FSO.OpenTextFile("input2.csv", 1)
Set Mast = FSO.OpenTextFile("mast.txt", 8, True)

Mast.WriteLine "Start run - " & Now()
Dim s1,a1,k1,s2,a2,k2
Read1
Read2
While k1 <> "" Or k2 <> ""
	If k1<> "" and k1 = k2 Then ' same
		Mast.WriteLine "same: " & k1
		Read1
		Read2
	ElseIf k2 = "" Or (k1 <> "" And k1 < k2) Then
		Mast.WriteLine "new line in input1: " & s1
		Read1
	ElseIf k1 = "" Or (k2 <> "" And k2 < k1) Then
		Mast.WriteLine "new line in input2: " & s2
		Read2
	End If
Wend
Mast.WriteLine "End run - " & Now()
File1.Close
File2.Close
Mast.Close
Set File1 = Nothing
Set File2 = Nothing
Set Mast = Nothing
Set FSO = Nothing

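Sub Read1
	' Advance to the next usable line of File1; k1 = "" signals end of file.
	s1 = ""
	k1 = ""
	Do While k1 = "" And Not File1.AtEndOfStream
		s1 = File1.ReadLine
		a1 = Split(s1, ";")
		' Skip blank or short lines that lack the two key fields.
		If UBound(a1) >= 3 Then
			k1 = a1(2) & ";" & a1(3)
			Mast.WriteLine "* k1=" & k1
		End If
	Loop
End Sub

Sub Read2
	' Advance to the next usable line of File2; k2 = "" signals end of file.
	s2 = ""
	k2 = ""
	Do While k2 = "" And Not File2.AtEndOfStream
		s2 = File2.ReadLine
		a2 = Split(s2, ";")
		' Skip blank or short lines that lack the two key fields.
		If UBound(a2) >= 3 Then
			k2 = a2(2) & ";" & a2(3)
			Mast.WriteLine "* k2=" & k2
		End If
	Loop
End Sub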


totalimpact (ASKER)

I can't go by the modified date, because the files are being copied from another server (FTP), and the date changes on copy even though the contents may be 100% the same.

Since they are logs, the newer file (Log1NEW) will already have the data that is in the older file (Log1), so it's a kind of one-way synchronization that I need.

Often they will already be identical, so if I could just compare the last line in each file before parsing the entire file, that would make it faster.
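
A minimal sketch of that last-line check, assuming plain text files read with `Scripting.FileSystemObject`. `LastLine` is a hypothetical helper, and the two file names are placeholders.

```vbscript
Option Explicit

Const ForReading = 1

' Hypothetical helper: return the last non-empty line of a text file,
' or "" if the file has no such line.
Function LastLine(fso, path)
    Dim stream, line
    LastLine = ""
    Set stream = fso.OpenTextFile(path, ForReading)
    Do Until stream.AtEndOfStream
        line = stream.ReadLine
        If Len(line) > 0 Then LastLine = line
    Loop
    stream.Close
End Function

Dim fso, oldPath, newPath
Set fso = CreateObject("Scripting.FileSystemObject")
oldPath = "Log1.csv"      ' placeholder file names
newPath = "Log1New.csv"

If LastLine(fso, oldPath) = LastLine(fso, newPath) Then
    WScript.Echo "Last lines match; assuming no new records."
Else
    WScript.Echo "Last lines differ; a full comparison is needed."
End If
```

Finding the last line this way still reads through each file with `ReadLine`, so the saving is in skipping the line-by-line comparison, not the read itself.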

Robert - Your code is close, but I would need to loop through several files, because the filenames are not static. The files will be laid out in a group of subfolders named by date, like a folder named 20110309.

Some more details on folder structure:
Archivedir - has all the previously downloaded logs in separate folders named by date (YYYYMMDD).
Tempdir - has newly downloaded files in separate folders also named by date  (YYYYMMDD).

If none of the subdirs in the Tempdir exist in the Archivedir, they can just be copied over. If some of them exist, I want to compare to see whether I already have the data in Archivedir. If I do, the files can be deleted from the Tempdir. If not, any new data must be appended to mast.txt, and then the Tempdir files can be copied to the Archivedir, overwriting any existing data.
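
A rough sketch of the folder-level decision described above. It only reports what would happen to each dated subfolder and does not copy or delete anything. The two root paths are assumed example locations.

```vbscript
Option Explicit

Dim fso, tempRoot, archiveRoot, subfolder, target
Set fso = CreateObject("Scripting.FileSystemObject")
tempRoot = "c:\test\temp"     ' assumed location of Tempdir
archiveRoot = "c:\test\arc"   ' assumed location of Archivedir

If Not fso.FolderExists(tempRoot) Then
    WScript.Echo "Temp folder not found: " & tempRoot
    WScript.Quit 1
End If

For Each subfolder In fso.GetFolder(tempRoot).SubFolders
    ' The archive counterpart has the same YYYYMMDD folder name.
    target = fso.BuildPath(archiveRoot, subfolder.Name)
    If fso.FolderExists(target) Then
        WScript.Echo subfolder.Name & ": exists in archive, compare file by file"
    Else
        WScript.Echo subfolder.Name & ": not in archive, can be copied over"
    End If
Next
```

`BuildPath` joins the archive root and the subfolder name directly, so the original casing of the path is kept.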

Once all *new* data from the temp dirs has been appended to mast.txt, I want to add a line to mast.txt with the date and time and several - marks, like so:

"-";03/09/11 12:50 PM;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-
What about fc? Would this be simpler?
http://technet.microsoft.com/en-us/library/bb490904.aspx
Hi,

Please try the following code.

regards
Prashanth
On Error Resume Next

archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path

For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
    
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
            
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
            
            If StrComp(Lastline_temp,Lastline_ar)<>0 Then
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                	WScript.Echo "in copy file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                	objfso.CopyFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),true
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If    
        Next
    Else
        objfso.CopyFolder Subfolder.Path,subfolder_ar
        WScript.Echo "in copy folder 3"
    End If
Next


That is closer. Two issues:

1. Right now only files that exist in both the archive and temp folders are merged into master.txt.

This is good, but I also need any files that don't exist in archive to be merged into master.txt.
Any files that exist in both arc and temp and are 100% the same should be deleted from temp without being merged into master.txt.

2. Minor, but it doesn't delete any temp folders or files on the first run. If I run it again, it deletes the files from temp, but not the folders.
Once this runs and all needed data has been merged into master.txt, temp should be empty (right now it leaves subfolders in temp).
Attached is an example of what is left behind after running the script. Notice temp: it should be empty.
[User generated image]
Can you clarify "Any files that do exist in both arc and temp and are 100% the same should be deleted from temp without merging into master.txt"?

Made some changes...try the following

On Error Resume Next

archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path

Set objFSO = CreateObject("Scripting.FileSystemObject")

Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path

For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
   
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
       
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
           
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
           
            If (StrComp(Lastline_temp,Lastline_ar)<>0) Or (objfso.FileExists(objfilear_path)=False)  Then
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                      WScript.Echo "in copy file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                      objfso.MoveFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),true
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If    
        Next
    Else
        objfso.CopyFolder Subfolder.Path,subfolder_ar
        WScript.Echo "in copy folder 3"
    End If
Next

Set objFolder = objFSO.GetFolder(temp_path)

For Each Subfolder In objFolder.SubFolders
      If subfolder.Files.Count = 0 Then
            subfolder.Delete
      End If
Next
What I mean is that if both folders have the same files, and the file contents are the same, then I have previously downloaded them, and they can be deleted from temp without any further processing.

I won't have time to test this until later tonight. I will let you know what I find out. Thanks for all your help.
OK, so I tested that, but it's still having trouble on the delete.

The first time I run it, it merges into master.txt only the files that exist in both the archive and temp dirs (same name, but one has additional contents).
If there are other files in temp that don't exist in archive, it ignores them: it neither moves them to archive nor merges them into master.txt. That would mean data loss for me.

If I run it a second time, it deletes all files and dirs from temp without merging them into master.txt, except for the single file that existed in both temp and archive with slightly different contents. It appended that file's data to master.txt a second time, which would not have happened if the file had been copied over from temp.
---------------------------------

If it makes it simpler, I guess all files from tempdir can be moved to archivedir, overwriting any data in there, **as long as any new missing data gets merged into master.txt before being moved**.

Basically I need to perform a difference check and write the differences that are not in archive into master.txt, then archive the files, because there will be times when files have the same name in the two folders but one has slightly more data in it.
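
One way such a difference check could look, as a sketch only: load the archived file's lines into a `Scripting.Dictionary`, then append to the master file only those lines of the temp file that the dictionary has not seen. `AppendNewLines` is a hypothetical helper, and the file name `log1.csv` is a placeholder.

```vbscript
Option Explicit

Const ForReading = 1, ForAppending = 8

' Hypothetical helper: append to masterPath every line of newPath that
' does not already appear in oldPath. Returns the number of lines appended.
Function AppendNewLines(fso, oldPath, newPath, masterPath)
    Dim seen, stream, master, line, count
    Set seen = CreateObject("Scripting.Dictionary")
    count = 0

    ' Remember every line already present in the archived copy.
    If fso.FileExists(oldPath) Then
        Set stream = fso.OpenTextFile(oldPath, ForReading)
        Do Until stream.AtEndOfStream
            seen(stream.ReadLine) = True
        Loop
        stream.Close
    End If

    ' Copy across only the lines the archive has not seen.
    Set master = fso.OpenTextFile(masterPath, ForAppending, True)
    Set stream = fso.OpenTextFile(newPath, ForReading)
    Do Until stream.AtEndOfStream
        line = stream.ReadLine
        If Len(line) > 0 And Not seen.Exists(line) Then
            master.WriteLine line
            seen(line) = True
            count = count + 1
        End If
    Loop
    stream.Close
    master.Close
    AppendNewLines = count
End Function

Dim fso, added
Set fso = CreateObject("Scripting.FileSystemObject")
added = AppendNewLines(fso, "c:\test\arc\20110309\log1.csv", _
                       "c:\test\temp\20110309\log1.csv", _
                       "c:\test\masterfile.txt")
WScript.Echo added & " new line(s) appended"
```

Two caveats with this sketch: when the archived file does not exist, the header line of the temp file is treated as new data, and repeated identical lines within one file are written only once. A return value of 0 means the temp file held nothing new, so it can be deleted or moved without touching the master file.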
I uploaded a full copy of my folder structure as a zip file here:
https://docs.google.com/leaf?id=0B0GcWwG3zysmZDk5ODJjMDItZWE2YS00Y2Q3LWIzNGUtZTljZWYzNmY4OGEw&hl=en

I cannot have this sit indefinitely on EE, so I had to use Google Docs.
Try the following...

regards
Prashanth
On Error Resume Next

archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path

Set objFSO = CreateObject("Scripting.FileSystemObject")

If objfso.FileExists(Master_file_path) = False Then
    objfso.CreateTextFile(Master_file_path)
End if

Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path

For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
    
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
            
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
            WScript.Echo objfso.FileExists(objfilear_path)
            If (StrComp(Lastline_temp,Lastline_ar)<>0) Or (objfso.FileExists(objfilear_path)=False)  Then
                WScript.Echo objfile.Path & " - in if strcomp"
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                    WScript.Echo "in move file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                    Err.Clear
                    objfso.copyFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),True
                    objfso.DeleteFile objfile.Path
                    If Err.Number<>0 Then
                        WScript.Echo Err.Number & Err.Description
                    End If
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If    
        Next
    Else
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path & " - in move folder copy file content"
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            file_ar_readall=objfiletemp.ReadAll
            objFiletemp.Close
            Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
            objMaster.Write file_ar_readall
            objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
            objMaster.Close
        Next    
            objfso.MoveFolder Subfolder.Path,subfolder_ar
            WScript.Echo "in move folder 3"
    End If
Next

Set objFolder = objFSO.GetFolder(temp_path)

For Each Subfolder In objFolder.SubFolders
    If subfolder.Files.Count = 0 Then
        WScript.Echo subfolder.Path & " - Delete folder"
        subfolder.Delete
    End If
Next


Great! 99% there, just one little thing if you could:

Can it add the date line only once, after processing all files (instead of once for each file)?
"-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
ASKER CERTIFIED SOLUTION
prashanthd
Absolutely perfect
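
A minimal sketch of the final request above (one marker line per run rather than one per file). This is not the certified solution itself. `WriteRunMarker` and `newDataWritten` are hypothetical names, and skipping the marker when nothing new was appended is an assumption.

```vbscript
Option Explicit

Const ForAppending = 8

' Hypothetical helper: append one marker line in the requested format.
Sub WriteRunMarker(fso, masterPath)
    Dim master
    Set master = fso.OpenTextFile(masterPath, ForAppending, True)
    master.WriteLine "-;" & Now & ";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
    master.Close
End Sub

Dim fso, newDataWritten
Set fso = CreateObject("Scripting.FileSystemObject")
newDataWritten = False

' ... the folder and file loops go here; wherever data is appended to the
' master file, also set: newDataWritten = True

' After every file has been processed, write the marker at most once.
If newDataWritten Then
    WriteRunMarker fso, "c:\test\masterfile.txt"
End If
```

In the earlier scripts this would mean removing the `objMaster.WriteLine "-;" & Now ...` call from inside the per-file loops and calling the helper once after the outer `Next`.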