Solved

MS word bat script to determine if a file has track changes on and if there are changes

Posted on 2014-07-31
4
202 Views
Last Modified: 2016-02-10
Hi Experts, I have about 50,000 word documents sitting on a file server as part of an application that I support.  There was an issue where some of the files - no telling how many - we're saved with track changes on and unaccecpted changes to the document (I.e., the red text).  I need to write a program that will examine each file and determine if A) track changes is enabled, B) if there are any unaccecpted changes, C) when the file was last modified, D) who modified it.

I'm thinking that this information is stored in the metadata of the file somewhere, or at least in the file properties.  I believe a batch program would be the easiest way to do this with the output going to a .txt or .csv file.  I would actually be more comfortable if there were a VBA program that I could run out of Excel that could do the analysis ( I'm pretty good with VBA)

Any suggestions would be much appreciated - I'm in kind of a crunch here.

- Jamey
0
Comment
Question by:JamesCbury
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 

Author Comment

by:JamesCbury
ID: 40234463
for anyone tracking this thread I think I might have figured something out... please comment if there are issues here.   to get to the track changes data I had to open the file, pull the attributes, close the file.  It does what I want it to, but is a bit clunky...
Sub Check_track_changes()

    'Clear the existing data
    Range([C7], Cells([C7].End(xlDown).Row, 11)).ClearContents
    
    'set the directiory folder
    sFolder = ActiveSheet.Range("FD_Path")
    'set the doc types in scope
    sExt = Array(".doc", ".docx", ".docm")
    
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Set oFolder = oFSO.GetFolder(sFolder)
    
    Set oShell = CreateObject("Shell.Application")
    Set oNamespace = oShell.Namespace(sFolder)
    
    Set wApp = CreateObject("Word.Application")
    
    iRow = 7
    For Each oFile In oFolder.Files
        For E = 0 To UBound(sExt)
            If LCase(Right(oFile.Name, Len(sExt(E)))) = sExt(E) Then
                'log the open time *Debugging purposes*
                Cells(iRow, 9) = Now()
                'Update the progress
                [D3] = "Now Processing File " & iRow - 6 & " of " & oFolder.Files.Count
                
                'write the file name(from metadata)
                Cells(iRow, 3) = oFile.Name
                'Open the file to get attributes
                'wApp.Visible = True
                On Error GoTo Bad_File_Err
                Set myDoc = wApp.Documents.Open(Filename:=oFile.Path, ReadOnly:=True)
                On Error GoTo 0
                
                'Check if track changes is on
                If myDoc.TrackRevisions = True Then
                    Cells(iRow, 4) = "On"
                ElseIf myDoc.TrackRevisions = False Then
                    Cells(iRow, 4) = "Off"
                End If
                'Get the count of changes
                Cells(iRow, 5) = myDoc.Revisions.Count
                'Get the count of comments
                Cells(iRow, 6) = myDoc.Comments.Count
                'Close the file **DO NOT SAVE CHANGES**
                myDoc.Close SaveChanges = False
                'Application.ActiveDocument.Close SaveChanges:=wdDoNotSaveChanges
                'Get the date modified (from metadata)
                Cells(iRow, 7) = FormatDateTime(oFile.DateLastModified, 2)
                
                'log the close time *Debugging purposes*
                Cells(iRow, 10) = Now()
                Cells(iRow, 11) = Cells(iRow, 10) - Cells(iRow, 9) 'calculate the cycle time
                iRow = iRow + 1
Resume_Files:
            End If
        Next E
    Next oFile
    [D3] = ""
Exit Sub

Bad_File_Err:
    Cells(iRow, 4) = "Cannot open this file - check for corruption"
    Cells(iRow, 9) = "Failed"
    GoTo Resume_Files
    
End Sub

Open in new window

0
 
LVL 24

Accepted Solution

by:
Phillip Burton earned 500 total points
ID: 40499920
Track Changes is not part of the metadata - see http://www.filecats.co.uk/metadata-extended-document-properties/microsoft-office/ for details of Word's metadata. You can certainly get Author and last saved as part of the metadata, and if you are in a time crunch, Filecats Professional might be a good idea.

The only way to ascertain if Track Changes is on is to open each of the files like James is doing. Do bear in mind that, if you do that, it MAY change some of the time metadata such as Last Accessed (emphasis on MAY), even if not saved. So I would suggest cataloguing in Filecats Professional first (which doesn't affect such data), and then doing James' solution, and combining the two.
0
 

Author Closing Comment

by:JamesCbury
ID: 40622086
Thanks Phillip - that's good advice.  I will make that update to my program for future use.
0
 
LVL 35

Expert Comment

by:gr8gonzo
ID: 41459298
It may not be part of the official metadata, but Track Changes IS a saved flag. I'm not certain how this would be read in an older .doc Word file, but it's pretty easy in .docx. All of the newer Office formats that end in "x" are actually ZIP files that contain a variety of resources, so it's very easy to discover in the XML.

Basically, you'd write a script that unzipped the file (some 3rd party libraries might let you just decompress a single file into memory, which would be even faster). In the resulting "word" subfolder, you'll have 2 files that have the info you want - document.xml and settings.xml.

If Track Changes is enabled, then the settings.xml file will contain this text:
<w:trackRevisions/>

If there are revisions, then the document.xml file will contain a tag that starts with <w: and indicates the operation, like "<w:del", and it will have a "w:date" attribute, and a "w:author" attribute, like this:

<w:del w:id="1" w:author="Fooey Barress" w:date="2016-02-10T20:42:00Z">

You can use Xpath syntax to search for such tags using most major XML libraries.

The older pre-2010 formats use a proprietary binary format that would require a separate library to read, assuming the Office SDK doesn't give you the necessary info.

In any event, reading the data like this should give you a good idea without risking any modifications to the file AND it has the added bonus of not requiring an instance of Word running in memory to open/close all that stuff.
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In Part II of this series, I will discuss how to identify all open instances of Excel and enumerate the workbooks, spreadsheets, and named ranges within each of those instances.
Cancel future meetings from user mailboxes in Office 365 using Remove-CalendarEvents
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…
In a recent question (https://www.experts-exchange.com/questions/29004105/Run-AutoHotkey-script-directly-from-Notepad.html) here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

626 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question