totalimpact
asked on
VBscript to compare and merge log files
I am looking for a VBScript (.vbs) file to compare some log files. The two files will have the same name, but one will be a more recent copy and will contain more data. I want to take the two files and merge them into a third, "master" file. Here's a visual:
Log1 <> Log1New ---
New log entries go to the master file, and a line is added to mark where the last record was entered.
My logs are CSV files using ; as the delimiter, with a header row; a sample of the CSV contents is shown below. I am thinking of comparing the second timestamp and the value after it as a unique key, for instance from my example:
03/04/11 12:51 PM;88.57
Any ideas?
"Orig1";03/04/11 12:50 PM;03/04/11 12:51 PM;88.57;90;6;6;0;0;"null";"184.191.122.250";"ergqwe3e";"OH";"4r2t3r333";;"CA";;;0
Can we not just check by last modified time of the log files?
Will one file always be a subset of the other, or could there be unique records in both files that need to be merged into the combined result?
~bp
Are you looking for something like this? It compares the lines in the two files in step, like a merge sort. You can change the messages to whatever you like, depending on the situation: identical lines (no message needed), a new line in file 1 (which might never happen), or a new line in file 2, which would be what you are looking for.
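Option Explicit
Dim FSO, File1, File2, Mast
Set FSO = CreateObject("Scripting.FileSystemObject")
Set File1 = FSO.OpenTextFile("input1.csv", 1) ' 1 = ForReading
Set File2 = FSO.OpenTextFile("input2.csv", 1)
Set Mast = FSO.OpenTextFile("mast.txt", 8, True) ' 8 = ForAppending, create if missing
Mast.WriteLine "Start run - " & Now()
Dim s1,a1,k1,s2,a2,k2 ' s = raw line, a = split fields, k = compare key
Read1
Read2
' Walk both files in step, like the merge phase of a merge sort.
' Keys are compared as plain strings (MM/DD/YY text does not sort chronologically),
' so this suits append-only logs where one file is a prefix of the other.
While k1 <> "" Or k2 <> ""
    If k1 <> "" And k1 = k2 Then ' same
        Mast.WriteLine "same: " & k1
        Read1
        Read2
    ElseIf k2 = "" Or (k1 <> "" And k1 < k2) Then
        Mast.WriteLine "new line in input1: " & s1
        Read1
    ElseIf k1 = "" Or (k2 <> "" And k2 < k1) Then
        Mast.WriteLine "new line in input2: " & s2
        Read2
    End If
Wend
Mast.WriteLine "End run - " & Now()
File1.Close
File2.Close
Mast.Close
Set File1 = Nothing
Set File2 = Nothing
Set Mast = Nothing
Set FSO = Nothing

' Read the next usable line of input1 and build its key from the
' second timestamp and the value after it (fields 3 and 4).
Sub Read1
    s1 = ""
    k1 = ""
    ' Skip blank or short lines so a1(2) and a1(3) always exist.
    Do While k1 = "" And Not File1.AtEndOfStream
        s1 = File1.ReadLine
        a1 = Split(s1, ";")
        If UBound(a1) >= 3 Then k1 = a1(2) & ";" & a1(3)
    Loop
    If k1 = "" Then
        s1 = "" ' end of file reached
    Else
        Mast.WriteLine "* k1=" & k1 ' debug trace
    End If
End Sub

' Same as Read1, for input2.
Sub Read2
    s2 = ""
    k2 = ""
    Do While k2 = "" And Not File2.AtEndOfStream
        s2 = File2.ReadLine
        a2 = Split(s2, ";")
        If UBound(a2) >= 3 Then k2 = a2(2) & ";" & a2(3)
    Loop
    If k2 = "" Then
        s2 = "" ' end of file reached
    Else
        Mast.WriteLine "* k2=" & k2 ' debug trace
    End If
End Sub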
ASKER
I can't go by the modified date, because the files are copied from another server (FTP) and the date changes on copy, even though the contents may be 100% the same.
Since they are logs, the newer file (Log1NEW) will already contain the data that is in the older file (Log1), so what I need is a kind of one-way synchronization.
Often the files will already be identical, so if I could compare just the last line of each file before parsing the entire file, this would make it faster.
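A minimal sketch of that last-line shortcut, assuming the logs are append-only so that matching last lines imply matching content. `LastLine`, `Log1.csv`, and both example paths are hypothetical names, not taken from any script posted in this thread.

```vbscript
Option Explicit

Const ForReading = 1

' Hypothetical helper: returns the last non-blank line of a text file,
' or "" when the file is missing or contains no such line.
Function LastLine(fs, path)
    Dim ts, currentLine
    LastLine = ""
    If Not fs.FileExists(path) Then Exit Function
    Set ts = fs.OpenTextFile(path, ForReading)
    Do Until ts.AtEndOfStream
        currentLine = ts.ReadLine
        If Len(Trim(currentLine)) > 0 Then LastLine = currentLine
    Loop
    ts.Close
End Function

Dim fso, oldPath, newPath
Set fso = CreateObject("Scripting.FileSystemObject")
oldPath = "c:\test\arc\20110309\Log1.csv"   ' assumed example path
newPath = "c:\test\temp\20110309\Log1.csv"  ' assumed example path

If fso.FileExists(oldPath) And LastLine(fso, oldPath) = LastLine(fso, newPath) Then
    WScript.Echo "No new entries; the full comparison can be skipped."
Else
    WScript.Echo "Files differ or the archive copy is missing; compare them fully."
End If
```

This still reads each file once to reach its last line, but it avoids splitting and comparing every record when nothing has changed.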
Robert - your code is close, but I would need to loop through several files, because the filenames are not static. The files will be laid out in a group of subfolders named by date, like a folder named 20110309.
Some more details on the folder structure:
Archivedir - has all the previously downloaded logs in separate folders named by date (YYYYMMDD).
Tempdir - has newly downloaded files in separate folders, also named by date (YYYYMMDD).
If none of the subdirs in Tempdir exist in Archivedir, they can just be copied over. If some of them exist, I want to compare the files to see whether I already have the data in Archivedir. If I do, the files can be deleted from Tempdir; if not, any new data must be appended to mast.txt, and then the Tempdir files can be copied to Archivedir, overwriting any existing data.
Once all *new* data from the temp dirs has been appended to mast.txt, I want to add a line to mast.txt with the date and time and several - marks, like so:
"-";03/09/11 12:50 PM;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-
ASKER
What about fc - would this be simpler?
http://technet.microsoft.com/en-us/library/bb490904.aspx
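For reference, a rough sketch of how fc could be called from a script to answer only the yes/no question of whether two files are identical. It assumes fc's usual exit codes (0 for identical, 1 for different, 2 when a file cannot be opened); `FilesIdentical` and the example paths are made-up names. fc reports differences in its own format, so it would probably only help with the "are these the same?" check, not with appending new rows to the master file.

```vbscript
Option Explicit

' Hypothetical helper: True when fc reports the two files as identical.
' Assumed fc exit codes: 0 = identical, 1 = different, 2 = file not found.
Function FilesIdentical(pathA, pathB)
    Dim shell, cmd, exitCode
    Set shell = CreateObject("WScript.Shell")
    ' /b = binary comparison; output is discarded, only the exit code is used.
    cmd = "cmd /c fc /b """ & pathA & """ """ & pathB & """ > nul"
    exitCode = shell.Run(cmd, 0, True)   ' 0 = hidden window, True = wait for exit
    FilesIdentical = (exitCode = 0)
End Function

' Assumed example paths.
If FilesIdentical("c:\test\arc\20110309\Log1.csv", "c:\test\temp\20110309\Log1.csv") Then
    WScript.Echo "Identical"
Else
    WScript.Echo "Different, or one of the files is missing"
End If
```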
Hi,
Please try the following code.
regards
Prashanth
On Error Resume Next
archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path
For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
            If StrComp(Lastline_temp,Lastline_ar)<>0 Then
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                    WScript.Echo "in copy file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                    objfso.CopyFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),true
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If
        Next
    Else
        objfso.CopyFolder Subfolder.Path,subfolder_ar
        WScript.Echo "in copy folder 3"
    End If
Next
ASKER
That is closer. Two issues:
1. Right now only files that exist in both the archive and temp folders are merged into master.txt.
This is good, but I also need any files that don't exist in archive to be merged into master.txt.
Any files that exist in both arc and temp and are 100% the same should be deleted from temp without being merged into master.txt.
2. Minor, but it doesn't delete any temp folders or files on the first run. If I run it again, it deletes the files from temp, but not the folders.
Once this runs and all needed data has been merged into master.txt, temp should be empty (right now it leaves subfolders in temp).
ASKER
Can you clarify "Any files that do exist in both arc and temp and are 100% the same should be deleted from temp without merging into master.txt"?
Made some changes... try the following:
On Error Resume Next
archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path
For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
            If (StrComp(Lastline_temp,Lastline_ar)<>0) Or (objfso.FileExists(objfilear_path)=False) Then
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                    WScript.Echo "in copy file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                    objfso.MoveFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),true
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If
        Next
    Else
        objfso.CopyFolder Subfolder.Path,subfolder_ar
        WScript.Echo "in copy folder 3"
    End If
Next
Set objFolder = objFSO.GetFolder(temp_path)
For Each Subfolder In objFolder.SubFolders
    If subfolder.Files.Count = 0 Then
        subfolder.Delete
    End If
Next
ASKER
What I mean is: if both folders have the same files and the file contents are the same, then I have previously downloaded them, and they can be deleted from temp without any further processing.
I won't have time to test this until later tonight. I will let you know what I find out. Thanks for all your help.
ASKER
OK, so I tested that, but it's still having trouble with the delete.
The first time I run it, it merges into master.txt only the files that exist in both the archive and temp dirs (same name, but one has additional contents).
If there are other files in temp that don't exist in archive, it ignores them: it neither moves them to archive nor merges them into master.txt. That would mean data loss for me.
If I run it a second time, it deletes all files and dirs from temp without merging them into master.txt, except for the single file that existed in both temp and archive with slightly different contents. It appended that file's data to master.txt a second time, which would not have happened if the file had been copied over from temp.
---------------------------------
If it makes it simpler, I guess all files from tempdir can be moved to archivedir, overwriting any data there, **as long as any new, missing data gets merged into master.txt before being moved**.
Basically I need to perform a difference check, write the differences that are not in archive into master.txt, and then archive the files, because there will be times when files in the two folders have the same name but one has slightly more data in it.
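A minimal sketch of such a difference check, assuming whole-line comparison is enough to decide whether a record is already archived. `AppendNewLines` and its parameter names are hypothetical. It remembers every line of the archive copy in a `Scripting.Dictionary` and appends to the master file only those lines of the temp copy that were not seen.

```vbscript
Option Explicit

Const ForReading = 1, ForAppending = 8

' Hypothetical helper: appends to masterPath every non-empty line of newPath
' that does not occur in oldPath, and returns the number of lines appended.
' If oldPath does not exist, every line of newPath counts as new.
Function AppendNewLines(fs, oldPath, newPath, masterPath)
    Dim seen, ts, master, currentLine, added
    Set seen = CreateObject("Scripting.Dictionary")
    added = 0

    ' Remember every line already present in the archive copy.
    If fs.FileExists(oldPath) Then
        Set ts = fs.OpenTextFile(oldPath, ForReading)
        Do Until ts.AtEndOfStream
            currentLine = ts.ReadLine
            If Not seen.Exists(currentLine) Then seen.Add currentLine, True
        Loop
        ts.Close
    End If

    ' Append only the lines of the temp copy that the archive lacks.
    Set ts = fs.OpenTextFile(newPath, ForReading)
    Set master = fs.OpenTextFile(masterPath, ForAppending, True)
    Do Until ts.AtEndOfStream
        currentLine = ts.ReadLine
        If Len(currentLine) > 0 And Not seen.Exists(currentLine) Then
            master.WriteLine currentLine
            added = added + 1
        End If
    Loop
    master.Close
    ts.Close

    AppendNewLines = added
End Function
```

After a call such as `AppendNewLines(objFSO, archiveFile, tempFile, masterFile)`, the temp file could be copied over the archive copy and then deleted. One limitation of this sketch: for a file with no archive copy, the header row is treated as new and is written to the master file as well.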
ASKER
I uploaded a full copy of my folder structure as a zip file here:
https://docs.google.com/leaf?id=0B0GcWwG3zysmZDk5ODJjMDItZWE2YS00Y2Q3LWIzNGUtZTljZWYzNmY4OGEw&hl=en
I cannot leave this on EE indefinitely, so I had to use Google Docs.
Try the following...
regards
Prashanth
On Error Resume Next
archived_Path="c:\test\arc" 'archived folder path
temp_path="c:\test\temp" 'tempfolderpath
Master_file_path="c:\test\masterfile.txt" 'master file path
Set objFSO = CreateObject("Scripting.FileSystemObject")
If objfso.FileExists(Master_file_path) = False Then
    objfso.CreateTextFile(Master_file_path)
End if
Set objFolder = objFSO.GetFolder(temp_path)
'WScript.Echo objFolder.Path
For Each Subfolder In objFolder.SubFolders
    WScript.Echo Subfolder.Path & " - " & temp_path & " - " & archived_path
    subfolder_ar=Replace(LCase(Subfolder.Path),LCase(temp_path),LCase(archived_Path))
    WScript.Echo subfolder_ar
    If objfso.FolderExists(subfolder_ar) Then
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            Do Until objFiletemp.AtEndOfStream
                strlinetemp=objFiletemp.ReadLine
                Lastline_temp = strlinetemp
            Loop
            objFiletemp.Close
            objfilear_path=Replace(LCase(objfile.Path),LCase(temp_path),LCase(archived_Path))
            WScript.Echo objfilear_path
            If objfso.FileExists(objfilear_path) Then
                Set objFilear = objFSO.OpenTextFile(objfilear_path, 1)
                Do Until objFilear.AtEndOfStream
                    strlinear=objFilear.ReadLine
                    'WScript.Echo strlinear
                    Lastline_ar = strlinear
                Loop
                objFilear.Close
            End If
            WScript.Echo objfso.FileExists(objfilear_path)
            If (StrComp(Lastline_temp,Lastline_ar)<>0) Or (objfso.FileExists(objfilear_path)=False) Then
                WScript.Echo objfile.Path & " - in if strcomp"
                Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
                file_ar_readall=objfiletemp.ReadAll
                objFiletemp.Close
                Err.Clear
                Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
                objMaster.Write file_ar_readall
                objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
                objMaster.Close
                If Err.Number=0 Then
                    WScript.Echo "in move file 1 - " & Replace(LCase(objfilear_path),LCase(objfile.Name),"")
                    Err.Clear
                    objfso.copyFile objfile.Path,Replace(LCase(objfilear_path),LCase(objfile.Name),""),True
                    objfso.DeleteFile objfile.Path
                    If Err.Number<>0 Then
                        WScript.Echo Err.Number & Err.Description
                    End If
                End If
            Else
                objfso.DeleteFile(objfile.Path)
                WScript.Echo "in delete file 2"
            End If
        Next
    Else
        Set objsubFolder = objFSO.GetFolder(Subfolder.Path)
        Set colFiles = objsubFolder.Files
        For Each objFile In colFiles
            WScript.Echo objfile.Path & " - in move folder copy file content"
            Set objFiletemp = objFSO.OpenTextFile(objfile.Path, 1)
            file_ar_readall=objfiletemp.ReadAll
            objFiletemp.Close
            Set objMaster=objFSO.OpenTextFile(Master_file_path, 8)
            objMaster.Write file_ar_readall
            objMaster.WriteLine "-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
            objMaster.Close
        Next
        objfso.MoveFolder Subfolder.Path,subfolder_ar
        WScript.Echo "in move folder 3"
    End If
Next
Set objFolder = objFSO.GetFolder(temp_path)
For Each Subfolder In objFolder.SubFolders
    If subfolder.Files.Count = 0 Then
        WScript.Echo subfolder.Path & " - Delete folder"
        subfolder.Delete
    End If
Next
ASKER
Great!!! 99% there, just one little thing if you could:
Can it add the date-and-dashes line only once, after processing all files (instead of once per file)?
"-;"& Now &";-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-"
ASKER CERTIFIED SOLUTION
ASKER
Absolutely perfect
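The accepted solution is not reproduced in this thread. As a rough sketch only, here is one way the last request (a single marker line per run rather than one per file) might be met. It assumes a hypothetical flag, `newDataWritten`, that the per-file loop sets whenever it appends data to the master file, and a made-up helper, `MarkerLine`, that builds the row in the format used by the scripts above.

```vbscript
Option Explicit

Const ForAppending = 8

' Hypothetical helper: builds the marker row "-;<date time>" followed by
' 17 "-" fields, matching the 19-column layout of the log rows.
Function MarkerLine()
    Dim i, s
    s = "-;" & Now
    For i = 1 To 17
        s = s & ";-"
    Next
    MarkerLine = s
End Function

Dim fso, master, newDataWritten
Set fso = CreateObject("Scripting.FileSystemObject")
newDataWritten = False

' ... the folder and file loops go here. Wherever a file's data is appended
' to the master file, set the flag instead of writing the marker row:
'         newDataWritten = True

' After all files have been processed, write the marker row once.
If newDataWritten Then
    Set master = fso.OpenTextFile("c:\test\masterfile.txt", ForAppending, True)
    master.WriteLine MarkerLine()
    master.Close
End If
```

The text produced by `Now` follows the machine's regional settings, so the date may not match the MM/DD/YY format in the logs without extra formatting.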