How do I format simple HTML tags inside a Word document?


I have a Word Document with a Macro which queries a SQL Server database, returning text fields with HTML tags. I would like for the HTML formatting to appear in the Word Document.  It is simple tags, <b> and <i> and &nbsp; and &reg; and that's about it.

How do I get <b>asdf</b> to show up as a boldfaced "asdf" in the Word document?  So far, Word is displaying the HTML tag text, "<b>asdf</b>", with no formatting applied.

Thanks for any ideas.
Who is Participating?
Jason, you're losing your formatting when you put it into the string, instead try

then selection.paste after selecting where you what it (the cell or what ever)
Eric FletcherCommented:
Jason: Turn on the checkbox for Use wildcards in Find and Replace. In the Find what box, type "(\<b\>)(*)(\</b\>)" and "\2" in the Replace with (don't include the quotes). Then press Ctrl-b to set the format in the replace what to bold. When you click Replace All, it will find any string starting with <b> and ending with <\b> and replace it with the same string minus the html tags but in bold. The parentheses set groups; the \n in replace with refer to the group by its sequence in the find what.

If you have other html tags, create a macro to do it all at once. Be sure to turn off the wildcard option at the end so you don't inadvertently leave it set and mess up subsequent finds.
i don't think there's any easy way to do what you want. Could you save the text fields as html files? then you could open them in word and it'd do the conversions. Otherwise, some sort of parsing macro like this:

Sub q()
Dim frmName As String
frmName = "<B>" 'the bold character
Selection.HomeKey unit:=wdStory, Extend:=wdMove
Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
Do Until Selection.Find.Found = False   'find all the bolds
    Selection.Delete                    'replace format mark with bookmark
    Selection.Bookmarks.Add Name:="boldStart"
    frmName = "</" & Right(frmName, 2)  'find the end of formatting
    Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
    Selection.Delete                    'replace end format mark with bookmark
    Selection.Bookmarks.Add Name:="boldEnd"
    Selection.GoTo What:=wdGoToBookmark, Name:="boldStart"
    With Selection          'select the text between the bookmarks
        .Collapse Direction:=wdCollapseStart
        .ExtendMode = True
        Selection.GoTo What:=wdGoToBookmark, Name:="boldEnd"
        .ExtendMode = False
    End With
    Selection.Font.Bold = True          'bold it
    Selection.HomeKey unit:=wdStory
    frmName = "<B>"                     'find if there is another bold
    Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
End Sub

would work for simple ones like <b> and <i>  (i don't remember what &req &nbsp do anymore) .  

Cloud Class® Course: Python 3 Fundamentals

This course will teach participants about installing and configuring Python, syntax, importing, statements, types, strings, booleans, files, lists, tuples, comprehensions, functions, and classes.

Eric FletcherCommented:
A bit of clarification to my earlier comment...

The "\" character is necessary before the "<" and ">" because both these characters have special meanings in a wildcard search (beginning and end of words). See Word's help for a more complete rundown on how to use wildcards. /Eric
jasonwisdomAuthor Commented:
Eric -

What does the "\2" mean?  When I tried it on <i>..</i>, it just removed the <i> and </i> tags.  It did not make the selection italics or boldface.

I am about to try the save as HTML file and then reload into Word.  I am thinking of something like this:

FSO - save file as textfile with extension .htm
Create a new Word document in my VBA
Open the .htm file into the new Word Document
Make the Selection All
Copy and Paste into my original Word Document

Something like that?

Thank you both for your help.
jasonwisdomAuthor Commented:
Gilbar -

I am looking at saving the file as an .htm file then opening it in Word and having Word do the conversion.  Here is what I came up with so far...

                Documents.Open ("WordTemp.htm")
                i = 1
                While i <= Documents.Count
                    If Len(Documents.Item(i).Content.Text) < 100 Then
                        strProduct = Documents.Item(i).Content.Text
                    End If
                    i = i + 1
                rowNew.Range.Cells.Item(2).Range.Text = strProduct

And the HTML tags are stripped out.  However, the formatting is lost as my <i>..</i> text is not italicized.

Any ideas?

jasonwisdomAuthor Commented:
Try this code block instead:

                Documents.Open ("WordTemp.htm")
                strProduct = Documents("WordTemp.htm").Content
                Documents("WordTemp.htm").Close SaveChanges:=wdDoNotSaveChanges
                rowNew.Range.Cells.Item(2).Range.Text = strProduct

The code is simpler, but the result is the same:  no italicized text, although the HTML tags have been removed.
Eric FletcherCommented:
Jason: The "\2" in the Replace with represents the 2nd group from the Find what part of the F&R dialog -- in this case, the "(*)" part which is whatever is between the html tags.

If you lost the html tags but didn't see any bold, you probably didn't have the format set to bold in the Replace with part of the dialog (be sure "Font: Bold" appears). If you do it manually per my instructions, it will definitely work.

However, if you recorded a macro to do it, you'll need to add some extra lines to manage the formatting part of the replace. For some reason, Word doesn't seem to record that part of the dialog! Here is what gets recorded with my added lines as indicated:

Sub Macro2()
    With Selection.Find
        .Text = "(\<b\>)(*)(\</b\>)"
'-- set bold for the find part to false (not essential but good practice)
        .Font.Bold = False
        .Replacement.Text = "\2"
'-- set the format for the replacement text to bold
        .Replacement.Font.Bold = True
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Note that this leaves the F&R box 'loaded' with the options youv'e set up: if you are doing a macro, it would be a good idea to end it by resetting the F&R options (clear formatting, turn off wildcards...)

Some time ago, I set up a somewhat similar macro to clean up HTML text for a very specific project. It used the approach above to convert internal formatting html tags (bold, italic, strong) as well as change paragraphs with bullets and heading level styles into Word format. I could probably dredge it up from some backup but it would have to wait until after tax time (Apr 30 in Canada!). This should give you a good start on getting the same thing.
jasonwisdomAuthor Commented:
Thank you, both Eric and Gilbar.

Creating a new document and mixing the Selection.WholeStory, .Cut and .Paste worked.  It converted <b>, <i>, &reg; and <img src="http://"> into display in my original Word Document.  And if any new tags (<u> for example) appear later, I won't have to recode in order to get it to work.

The other 2 ideas were very, very helpful, and through them I feel confident I could have "hacked" everything except for the <img src> tags through that.  But this works much better.

I GREATLY appreciate your time!!!

you're welcome jason!
now humor a person who is obviously going senile and remind me what &req &nbsp do.  I used to know, back in the twentyth century :)
jasonwisdomAuthor Commented:
me too, I haven't done HTML since 1999!

&nbsp; is non-breaking space.  it's a " ".
&reg; is a Registered symbol.  The ® symbol.
There's one for TM as well...
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.