We help IT Professionals succeed at work.

Word macro changing style format?

I am using some Microsoft VB code for a macro to convert text links in a Merge-created 200-page document to clickable hyperlinks. The problem I have is that when the macro runs, it also changes some other styles. Is there an easy way to modify the Range to include only hyperlinks?

Sub ConvertURLTextsToHyperlinksInDoc()
  Dim objDoc As Document
 
  Set objDoc = ActiveDocument
 
  Word.Options.AutoFormatReplaceHyperlinks = True
  objDoc.Range.AutoFormat
End Sub

Open in new window

Comment
Watch Question

John KorchokProduction Manager

Commented:
I imagine you're using the suggestion from this page: 5 Ways to Convert URL Texts to Hyperlinks in Your Word Document.

As you've discovered, AutoFormat doesn't run one option at a time. Instead it takes all your current AutoFormat options and throws them at the target document. You can still use AutoFormat, if you take the time to set all the options in File>Options>Proofing>AutoCorrect Options>AutoFormat so that they have the effect you prefer before running the macro.

But you'll get a more focused effect if you run a macro that just does this job:
Sub Text2HLink()
    Dim oRng As Range
    Set oRng = ActiveDocument.Range
    With oRng.Find
        Do While .Execute(FindText:="http:", Forward:=False) = True
            oRng.MoveEndUntil (" ")
            ActiveDocument.Hyperlinks.Add Anchor:=oRng, Address:=oRng.Text, TextToDisplay:=oRng.Text
            oRng.Collapse wdCollapseEnd
        Loop
    End With
End Sub

Open in new window

Most Valuable Expert 2013

Author

Commented:
Hmmm, this gets me close, but I'm running into a few different issues:
1) I have both http: and https: links in the document, so does it make sense to just FindText:="http"?
2) When I run the macro, most links are fine, but some change formatting to other font/size
3) In one location the macro generated an entire page of repeated link text without actually hyperlinking
4) Program crashes with Run-time error '4198" and program failed.

p.s. yes, you are correct on the source ;)
John KorchokProduction Manager

Commented:
1. Change the Find Text string from "http:" to "http" (removing the colon). Then it will find both.
2. In many Word documents, most users start with an underlying style, then use local formatting to change its appearance. But when you run an automatic process on it, the underlying style is used to format the text and the appearance changes. In the context of your issue, this isn't easily solved. Any macro-driven update will cause a similar phenomenon.
3. That's interesting. I would like to see what conditions cause the repetition, if you can share a document that doesn't reveal any proprietary information.
4. 4198 is associated with links, but is not specific. Does it occur after the entire page of repeated link text in #3?
Most Valuable Expert 2013

Author

Commented:
I don't think this approach is going to work. There's several other issues that also occur:
5) In some cases, only the "http" letters in the link become hyperlinked (hyperlinked to http, which clearly doesn't go anywhere).
6) Any link with spaces instead of %20 in a document name gets hyperlinked to a link that only goes to the first space, so the link fails.
7) In some instances, the first word of the next page is also hyperlinked (using the prior link), despite separation with a paragraph break AND a section break.

Regarding your comments above:
2. True. However, these pages are run from a merge (from an Excel spreadsheet), so within each page, the underlying style should be exactly the same for the section that contains link text. I have done my best to clean the contents of the spreadsheet of hidden or formatting characters by converting the original spreadsheet to a .CSV file, closing it, then re-opening the CSV and saving it back as a .XLSX file. This resolved other merge issues I was having.
3) Actually, I found that there's several locations where the repeating link text occurs. Unfortunately, the document is proprietary. I could create an Excel spreadsheet with non-proprietary information, but since I have no clue what triggers those specific links or locations to generate the repeats, I don't know if I could make it happen again. Certainly nothing about the specific links where it happens stands out - no special characters in the link itself.
4) I am not sure where it occurs. As I mentioned, it seems to generate this pattern in more than one location, and the amount of repeats does not seem to be consistent across locations (but is within location, if I run the merge again, and run the macro on the new merge file, I get the same results on the same pages of the new document).
John KorchokProduction Manager

Commented:
Without seeing the source material you're working, I cant' suggest alternate code for the macro. A more complex macro would be able to find a set of endings for your URLs like /, .htm, .html,  .asp, .com, etc. Then it would replace all spaces with %20 entities.

The Excel formatting has no influence on the mailmerge appearance, it's just a data source. Any formatting issues will arise from the Word style settings in the host document. If you click on a section of text and Normal style is highlighted in the Style Gallery, you should be able to highlight that text and apply  Normal and not see any changes. If the text changes when you do this, local formatting has been applied that will mess up your AutoFormat technique.

Often, when VBA crashes, you'll see a dialog with a Debug button on it. Click that debug button and it will open the VBE with the last executed statement highlighted. That statement may be the cause of the problem, or the problem may have started earlier and the highlighted line shows only where Word hit a brick wall. 4198 is often associated with hyperlink issues, but it's not very specific.
Most Valuable Expert 2013

Author

Commented:
The line that appears when the debug button is clicked is:
ActiveDocument.Hyperlinks.Add Anchor:=oRng, Address:=oRng.Text, TextToDisplay:=oRng.Text

Open in new window


Not sure that's very diagnostic, since it's in a repeating loop which seems to work most of the time.

>> If the text changes when you do this, local formatting has been applied that will mess up your AutoFormat technique.
Correct. And prior to running the macro, all link text is formatted as Normal. I'm unable to find an initial formatting difference between the links that retain the base font and size and the ones that don't when the macro is run. This is different behavior than that of the original macro as posted in my Q text above, where all of the links were consistent in formatting, but other formatting changed (mostly by changing what was being applied by Style, although not always).
Do you need to use VBA for this? With Word’s Find All, you can select all hyperlinks, then perform an action on all of them at once (either as a Replace, or drop out to apply formatting or copy them to paste them elsewhere).

1. The key is to make the hyperlinks visible first: press Alt-F9 to toggle visibility of field codes.

2. Bring up the Find dialog (the now-default Ctrl-F FInd in the navigation bar won't do it any more; select "Advanced Find" from there or use F5 and click the Find tab).

3. In the Find what box, type (or copy) ^d HYPERLINK (you can get the ^d by selecting "Field" from the "Special" button's pop-up list, but you'd need to click "More >>" to show other search options to see it).

4. Click "Find In" and choose "Main Document". All instances of hyperlink field codes will be selected.

If you drop out of (or close) the Find dialog, you can then apply any formatting you want to the selection. You can also use Ctrl-C to copy all instances so you can paste them elsewhere to get a full list of all hyperlinks for use in an external URL validation tool if you wish — or simply to make it easier to review manually.

Remember to use Alt-F9 again to toggle the view back to field code results.

What is not well-explained in most documentation about Word’s Find feature is that step 3 above will select the entire HYPERLINK field code along with all of its switches, and not just the typed part you might expect for "normal" find phrases. You have encountered this in the code from John: the FindText:="http:" part limits the phrase to just that string of characters, so it would find instances outside of a hyperlink field code as well.

What is even more useful is that you can further refine what types of hyperlinks to find by extending the search phrase. For example, ^d HYPERLINK "mailto in the Find what box will find only email address type hyperlinks; ^d HYPERLINK "http: will find just the http type URLs (i.e. ones that don't yet conform to the https protocol); and ^d HYPERLINK  \l will find hyperlinks to internal bookmarks in the same document.

I use the above technique to then use "Replace with" to apply different custom character styles to the different types of hyperlinks when I am formatting research papers for editing. That way, a colour (or font) can more easily differentiate between the various types of hyperlink. Any hyperlink that still has the default "blue underline" Hyperlink style can then be checked individually to see why it wasn't tagged with the characteristic identifier from the above procedure. (You can easily set up a macro to do this; VBA if you prefer & need more options, but even a recorded one works fine.)

Tip: Toggle the non-printing characters (¶ or Ctrl-*) so you can see where spaces will be needed. For some reason, the internal bookmarks switch needs 2 spaces before the backslash. Some field codes entered by 3rd party tools or manually may not include the space after the ^d. I see this for XE field code for some mark index entries that have been edited, but also some HYPERLINK field codes that come in from content copied from non-Word sources.
Most Valuable Expert 2013

Author

Commented:
Eric, thanks for the thought. When I try your technique, it doesn't find any hyperlinks in my merged documents. I'm not merging the hyperlinks as links, because they are embedded in other textual information, so I don't use HYPERLINK in the merge process (via something like { HYPERLINK “https://importantlinkhere.com” } or similar). Instead, I am including { MERGEFIELD "myfieldname" } where myfieldname is a block of text from the Excel source which may or may not include a URL. My goal is, for each merged page that includes text where a URL is part of the text, to make those URLs into clickable links.

As stated, the original code works well for this, except for the formatting changes that also take place. My hope was (based on this documentation: https://docs.microsoft.com/en-us/office/vba/api/word.range.hyperlinks) that there was a way to limit Range to the URLs.
John KorchokProduction Manager

Commented:
Using word.range.hyperlinks would be a good idea, but they aren't hyperlinks yet, and so not members of that collection.
Got it; I overlooked that part of your question! So you need to apply a style (or otherwise format) any valid URL found within the resulting Word document after the mail merge?

Further to what I included above, it is possible to use the more limited “wildcard” capabilities of Word’s Find and Replace feature to Find and apply formatting to straightforward URLs — but punctuation within content can often cause mistakes unless you are careful to consider them and include them in a procedure.

Since I needed to do this regularly as part of contracts where I was formatting research papers with Word 2010, I prepared the following multi-step procedure. 4-step Find and Replace procedure to set URLs in red font
The above procedure is from Word documentation notes I prepared >10 years ago. However, the screen shots below confirm that it still works. Here's sample text before I performed the above procedure today:Sample content before procedure
And here it is after performing the 4-step procedure using the current version of Word 365:Sample content after procedure
The procedure can easily be recorded as a basic macro (and then modified for better efficiency and/or to handle more complex parsing with VBA).

As you probably know, parsing for any valid URL is complex due to the expanding Uniform Resource Identifier standards to accommodate Unicode for accented letters and other language script, but also for content after ? or other escapes. This Stack Overflow thread started in 2009 discusses the topic, and includes a number of RegEx expression tips to handle things. If you are familiar with RegEx, an editor that uses it may give you more options than what I've described above for a Word solution.
Most Valuable Expert 2013
Commented:
It looks like the answer is that what I had hoped to do was not possible. Thanks for the alternative approaches.