Link to home
Start Free TrialLog in
Avatar of Member_2_79173
Member_2_79173

asked on

I need VBA code to extract hyperlinks and associated text from a Word document to a .csv file

One of our users is reporting that hyperlinks in a Word document they use keep changing. The hyperlinks are paths to folders on our file storage system. I wanted to extract all the hyperlinked paths so I can study this behavior. I found this page

https://www.extendoffice.com/documents/word/1411-word-select-copy-all-hyperlinks.html#a2

and used the VBA code to extract the underlying paths to a new document, but I need more than what this provides.

1. I would like VBA code to export the underlying paths to a .csv file in column A and I want the text which is associated with the respective hyperlink in column B.
2. The author of the code says it does not extract the hyperlinks in the same order as they occurred in the original document but I do want them in the original order.

Thanks,
Don
ASKER CERTIFIED SOLUTION
Avatar of Neil Fleming
Neil Fleming
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Also, if it's helpful, you could also output the position of the hyperlink in the document to Column C using:
Write #iFile, hThis.Address, hThis.Range.Text, hThis.Range.Start

Open in new window

Avatar of Norie
Norie

Don

How have the hyperlinks in the Word document(s) been created?
Avatar of Member_2_79173

ASKER

The user selects the text they want to hyperlink then right-clicks and chooses "Hyperlink" from the context menu then navigates in the wizard to the path where the document, usually another Word document or an Excel file lives. For example they end up with a UNC path something like this:

\\contoso.com\engineering\templates\how to make a widget.docx

Don
Is this what you need?
It shall be executed in the MS Word document that has the links you want to export.
Then open the CSV file in MS Excel, and hyperlink will be in column A, and the assocciated text in column B.
You will find the CSV file in the same folder as the MS Word document.
Need more advice? Post a comment about it.
Just a small modification of some code found in you link.

Sub HyperlinksExtract()
    Dim oLink As Hyperlink
    Dim docCurrent As Document 'current document
    Dim docNew As Document 'new document
    Dim rngStory As StoryRanges
    Dim strTargetCSV As String
    Set docCurrent = ActiveDocument
    Set docNew = Documents.Add
    strTargetCSV = docCurrent.Path & "\link.CSV"
   
    Open strTargetCSV For Output As #1
   
    For Each oLink In docCurrent.Hyperlinks
        oLink.Range.Copy
        docNew.Activate
        Selection.Paste
        'Debug.Print oLink.Address & "  " & oLink.TextToDisplay
        Print #1, oLink.Address & ";" & oLink.TextToDisplay             ' Change order here to suit your needs.
        Selection.TypeParagraph
    Next
   
    Close #1
    Set docNew = Nothing
    Set docCurrent = Nothing
End Sub
Zeth,

I think there is a small bug. It works as expected but the path and the text are both in column A with a semicolon ( ; ) between the path and the text from the Word document. I could probably parse this in Excel but can you see how to fix this in the code?

Thanks
Don
Don

I think Zeth's code only needs some minor adjustment, try this.
Sub HyperlinksExtract()
Dim docCurrent As Document    'current document
Dim oLink As Hyperlink
Dim strTargetCSV As String

    Set docCurrent = ActiveDocument
    strTargetCSV = docCurrent.Path & "\link.CSV"

    Open strTargetCSV For Output As #1

    For Each oLink In docCurrent.Hyperlinks
        Print #1, oLink.Address & "," & oLink.TextToDisplay
    Next

    Close #1
    
    Set docCurrent = Nothing
    
End Sub

Open in new window

Hi Norie,
 I have left work for the day but I will try this tomorrow and let you know.
Thanks,
Don
hmm.. how is this any different from the solution I posted before any of this...?

The only difference I can see is that my code guarantees export of the links in the correct order, whereas later efforts do not.
Comma or semicolon separation, I belive is a matter of country version of MS Excel. "," did not work for me. It become one single string of both data.
I'l remeber to point on that next time.
Neil, if you code work as donander want to, I'v not tryed it, then the solution is yours, of course.
I forgot to post the results of Neil's code. When I run it I get the error

"Can't delete old file. It may be open"

Neil, can you see what might be wrong?

Thanks,
Don
Did you change the file path and name to a path that exists on your machine?

I wrote that error message for any error that might occur, somewhat casually, since the most likely error is that you create the csv file, open it, and then try to run the code again.

If you comment out the "onError" line, it should tell you what the actual error is.
Yes, I changed the path on the line below as shown. The Word document is in this same folder.

sPath = "d:\snoopdon\test.csv"

When I remark the "OnError" I get:

Run-time error '53':
File not found

If I click Debug it highlights "Kill sPath" in the line below in yellow.

If sPath <> "" Then Kill sPath

Don
ah, sorry. Stupid mistake in my code. It was trying to delete sPath instead of sFile. Try this instead:
Sub getHyperlinks()
    Dim hLink As Hyperlink, hThis As Hyperlink
    Dim docCurrent As Document 'current document
    Dim sPath As String, sFile As String
    Dim iFile As Long, iMin As Long, iMax As Long
    
    On Error GoTo errortrap
    'change this to your desired filename
    sPath = "c:\neil work\test9.csv"
    iFile = FreeFile
    sFile = Dir(sPath)
    'delete old file
    If sFile <> "" Then Kill sFile
    Open sFile For Output As #iFile
    
    Set docCurrent = ActiveDocument
    iMin = 0
    Do
    Set hThis = Nothing
    iMax = docCurrent.Range.End
    'find hyperlinks in order
    For Each hLink In docCurrent.Hyperlinks
    If (hLink.Range.Start < iMax) And (hLink.Range.Start > iMin) Then
    Set hThis = hLink
    iMax = hThis.Range.Start
    End If
    Next
    
    If Not (hThis Is Nothing) Then
    Write #iFile, hThis.Address, hThis.Range.Text
    'redefine imin
    iMin = iMax
    End If
    
    Loop Until hThis Is Nothing
    Close #iFile
    Exit Sub
errortrap:
MsgBox ("Can't delete old file. It may be open")
    
    
    
End Sub

Open in new window

Ok, I am still getting "Can't delete old file. It may be open." when I run the macro.

If I remark "OnError..." I see this:

Run time error '75':
Path/File access error

If I click the Debug button it highlights the line below in yellow. At least it is a different line :-)

Open sFile For Output As #iFile

I did change

sPath = "c:\neil work\test9.csv"

to

sPath = "d:\snoopdon\test9.csv"

The document is in d:\snoopdon.

Thanks,
Don
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thank you, Zeth.. and my apologies for writing the code in such a hurry not to have checked the file opening stuff properly.

In my haste, I changed the first version too quickly and broke it again. This will also work fine:
'change this to your desired filename
    sPath = "c:\neil work\test9.csv"
    iFile = FreeFile
    sFile = Dir(sPath)
    'delete old file if it exists
    If sFile <> "" Then Kill sFile
    Open sPath For Output As #iFile

Open in new window

It's 1-1 between us! Norie helped me :-)
If it works now, it schould yet be your nice code that get the solution!
lol.. too embarrassed to accept them :)
After making the edit in the code that Zeth suggested, the code now works exactly as I had requested.

Thanks much to Zeth and Neil for providing a great solution and for collaborating together to achieve it.

This is what sets Experts Exchange apart from all the forums on the web where there is virtually no collaboration.

Don
Pleasure. Apologies again for the stoopid mistakes..
Stupid mistake!? Yet we are making all human mistakes - thank goodness:-)