Avatar of eklin
eklin
Flag for United States of America asked on

Identifying Excel Files that reference other files

Hi -

I'm trying to identify all Excel files within a folder and its sub-folders that have formulas / references to other files.  This is needed in order to identify the impact of renaming the folder structures and moving the files.  Anyone has any suggestions on how to create an automated report that lists all files that would be impacted? i.e references might break if we moved all files within that folder around?  Thanks in advance!
Microsoft Excel

Avatar of undefined
Last Comment
eklin

8/22/2022 - Mon
dlmille

Excel files that have references to other files also have LINKS to those other files.  This solution that is in progress also does what you're looking for.  Please take a look and see if the solution suits your needs, or we can customize your solution starting from this, in this thread:
https://www.experts-exchange.com/Microsoft/Applications/Q_27721498.html

My solution is the last post in that thread.  Download the file in a trusted location (where Excel can run macros from) and give it a look-see.  You'll be prompted for a folder, then it will identify links (references to other files) for every Excel file found in that folder.

Cheers,

Dave
eklin

ASKER
Dave -

Thanks for posting and apologies for the delay in responding.  The file works great but 2 things are missing:

1. it doesn't recurse to sub-folders
2. it doesn't identify outside references that are present in modules/vba code within the file.

How difficult would it be to edit it to accomodate this requirement?
dlmille

>>1. it doesn't recurse to sub-folders

Do-able...

>>2. it doesn't identify outside references that are present in modules/vba code within the file.

Not sure what you mean by this.  Please clarify
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
eklin

ASKER
for #2, I mean there is vba code as part of a button in many files that is used to open files in different locations that are hard-coded in the code.  For instance, in your control tab, you have a button that prompts user for the location of the file.  In many of the files I have, the code doesn't prompt for location but instead has the c:\tmp\etc... hard coded.  I would like to be able to identify those instances so I can go back and fix the locations once I change the directory tree.
dlmille

That second part is akin to guessing at where those fields may be.  The VBA code would have to scan every cell that has data to determine if that was a file location or not.  I'm not sure that's a doable.  Do you really want to search each file, every tab, every cell that has data?  This could take some time to process.

Here's the solution with recursive logic to subfolders:
Option Explicit

Sub scanIdentifyLinks()
Dim wkb As Workbook
Dim wks As Worksheet
Dim wkbIn As Workbook
Dim aLink As Variant
Dim strPath As String
Dim fName As Variant
Dim dialogFile As FileDialog
Dim iFile As Long
Dim aLinks As Variant
Dim iLink As Long
Dim fCol As New Collection

    Application.ScreenUpdating = False
    Set wkb = ThisWorkbook
    Set wks = wkb.Worksheets("Output")

    wks.Cells.Clear
    wks.Range("A1:B1").Value = Split("Path\Filename,Links that exist ->", ",")

    'prompt for path to explore
    strPath = ThisWorkbook.Path & "\"
    Set dialogFile = Application.FileDialog(msoFileDialogFolderPicker)
    With dialogFile
        .AllowMultiSelect = False
        .InitialView = msoFileDialogViewDetails
        .InitialFileName = strPath
        .Title = "Select Folder for Link Analysis"
        .Show
    End With
    If dialogFile.SelectedItems.Count > 0 Then
        strPath = dialogFile.SelectedItems(1)

        Application.EnableEvents = False
        Call getFiles(strPath, "*.XLS*", fCol, True)
        
        'fName = Dir(strPath & "\*.xls*")    'check all files in folder
        On Error Resume Next
        For Each fName In fCol 'Do While fName <> vbNullString
            wks.Range("A2").Offset(iFile, 0).Value = fName
            Application.StatusBar = "Processing file: " & fName
            
            'open each file
            Set wkbIn = Workbooks.Open(Filename:=fName, UpdateLinks:=False, ReadOnly:=True)
            If Err.Number <> 0 Then
                wks.Range("A2").Offset(iFile, 1).Value = "Unable to open file, moving on..."
                wks.Range("A2").Offset(iFile, 2).Value = "Err: " & Err.Number & "-> " & Err.Description
            Else
                'get links
                aLinks = wkbIn.LinkSources(xlExcelLinks)
                'output links
                If IsEmpty(aLinks) Then
                    wks.Range("A2").Offset(iFile, 1).Value = "No Links Found"
                Else
                    For iLink = LBound(aLinks) To UBound(aLinks)
                        wks.Range("A2").Offset(iFile, iLink).Value = aLinks(iLink)
                    Next iLink
                End If
            End If
            wkbIn.Close savechanges:=False
            Err.Clear
            iFile = iFile + 1
            'fName = Dir()
        Next fName

    End If

    MsgBox "Process Complete!"
    
gracefulExit:
    
    Set dialogFile = Nothing
    Set wkbIn = Nothing
    
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    Application.StatusBar = False
    
End Sub
'below is code for changing links from old path to new path, and old workbook name to new workbook name
Function changeLink(myWkb As Workbook, oldPath As String, newPath As String, oldName As String, newName As String) As Boolean
Dim aLinks As Variant
Dim oldLink As String
Dim newLink As String
Dim i As Long
    
    aLinks = myWkb.LinkSources(xlExcelLinks)
    If Not IsEmpty(aLinks) Then
        For i = 1 To UBound(aLinks)
            oldLink = aLinks(i)
            newLink = Replace(oldLink, oldPath & "\", newPath & "\")
            newLink = Replace(newLink, oldName, newName)
            If oldLink <> newLink Then
                myWkb.changeLink Name:=oldLink, newName:=newLink
            End If
        Next i
    End If
    changeLink = True
End Function
Sub getFiles(strPath, strFilter As String, fCol As Collection, bSubFolders)
Dim FSO As Object
Dim fldr As Object
Dim subFldr As Object
Dim fName As Object

    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set fldr = FSO.GetFolder(strPath)
    
    For Each fName In fldr.Files
        If UCase(getFileExt(fName)) Like strFilter Then
            fCol.Add fName
        End If
    Next fName
    If bSubFolders Then
        For Each subFldr In fldr.subfolders
            Call getFiles(subFldr, strFilter, fCol, True)
        Next subFldr
    End If
    
End Sub
Public Function getFileExt(fName As Variant) As String
Dim i As Integer
    i = InStr(StrReverse(fName), ".")
    getFileExt = StrReverse(Left(StrReverse(fName), i))
End Function

Open in new window


See attached.

Dave
scanFolderIdentifyLinks-r2.xls
eklin

ASKER
Thanks Dave - you have a good point regarding #2.  Some files might have password protected that I don't know the password.  Is it easy to edit the code to skip password protected files (and log their path/name) if you don't get the password right in the frst 2 attempts?
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
ASKER CERTIFIED SOLUTION
dlmille

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
eklin

ASKER
Wow! This is great Dave - you really helped me out here.  Job well done.