

How do I format simple HTML tags inside a Word document?

Posted on 2004-04-23
Medium Priority
Last Modified: 2012-06-27

I have a Word Document with a Macro which queries a SQL Server database, returning text fields with HTML tags. I would like for the HTML formatting to appear in the Word Document.  It is simple tags, <b> and <i> and &nbsp; and &reg; and that's about it.

How do I get <b>asdf</b> to show up as a boldfaced "asdf" in the Word document?  So far, Word is displaying the HTML tag text, "<b>asdf</b>", with no formatting applied.

Thanks for any ideas.
Question by:jasonwisdom

Expert Comment

by:Eric Fletcher
ID: 10918661
Jason: Turn on the checkbox for Use wildcards in Find and Replace. In the Find what box, type "(\<b\>)(*)(\</b\>)" and in the Replace with box type "\2" (don't include the quotes). Then, with the cursor still in the Replace with box, press Ctrl+B to set the replacement format to bold. When you click Replace All, it will find any string starting with <b> and ending with </b> and replace it with the same string minus the html tags but in bold. The parentheses set groups; "\n" in Replace with refers to the nth group in Find what, counted by its sequence.

If you have other html tags, create a macro to do it all at once. Be sure to turn off the wildcard option at the end so you don't inadvertently leave it set and mess up subsequent finds.

Expert Comment

ID: 10918705
i don't think there's any easy way to do what you want. Could you save the text fields as html files? then you could open them in word and it'd do the conversions. Otherwise, some sort of parsing macro like this:

Sub q()
Dim frmName As String
frmName = "<B>" 'the bold character
Selection.HomeKey unit:=wdStory, Extend:=wdMove
Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
Do Until Selection.Find.Found = False   'find all the bolds
    Selection.Delete                    'replace format mark with bookmark
    Selection.Bookmarks.Add Name:="boldStart"
    frmName = "</" & Right(frmName, 2)  'find the end of formatting
    Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
    Selection.Delete                    'replace end format mark with bookmark
    Selection.Bookmarks.Add Name:="boldEnd"
    Selection.GoTo What:=wdGoToBookmark, Name:="boldStart"
    With Selection          'select the text between the bookmarks
        .Collapse Direction:=wdCollapseStart
        .ExtendMode = True
        Selection.GoTo What:=wdGoToBookmark, Name:="boldEnd"
        .ExtendMode = False
    End With
    Selection.Font.Bold = True          'bold it
    Selection.HomeKey unit:=wdStory
    frmName = "<B>"                     'find if there is another bold
    Selection.Find.Execute findtext:=frmName, Forward:=True, MatchWholeWord:=False, Wrap:=wdFindStop
Loop                                    'close the Do Until loop
End Sub

would work for simple ones like <b> and <i>  (i don't remember what &reg; and &nbsp; do anymore).


Expert Comment

by:Eric Fletcher
ID: 10918711
A bit of clarification to my earlier comment...

The "\" character is necessary before the "<" and ">" because both these characters have special meanings in a wildcard search (beginning and end of words). See Word's help for a more complete rundown on how to use wildcards. /Eric

Author Comment

ID: 10920204
Eric -

What does the "\2" mean?  When I tried it on <i>..</i>, it just removed the <i> and </i> tags.  It did not make the selection italics or boldface.

I am about to try the save as HTML file and then reload into Word.  I am thinking of something like this:

FSO - save file as textfile with extension .htm
Create a new Word document in my VBA
Open the .htm file into the new Word Document
Make the Selection All
Copy and Paste into my original Word Document

Something like that?

Thank you both for your help.

Author Comment

ID: 10922961
Gilbar -

I am looking at saving the file as an .htm file then opening it in Word and having Word do the conversion.  Here is what I came up with so far...

                Documents.Open ("WordTemp.htm")
                i = 1
                While i <= Documents.Count
                    If Len(Documents.Item(i).Content.Text) < 100 Then
                        strProduct = Documents.Item(i).Content.Text
                    End If
                    i = i + 1
                Wend
                rowNew.Range.Cells.Item(2).Range.Text = strProduct

And the HTML tags are stripped out.  However, the formatting is lost as well...so my <i>..</i> text is not italicized.

Any ideas?


Author Comment

ID: 10923224
Try this code block instead:

                Documents.Open ("WordTemp.htm")
                strProduct = Documents("WordTemp.htm").Content
                Documents("WordTemp.htm").Close SaveChanges:=wdDoNotSaveChanges
                rowNew.Range.Cells.Item(2).Range.Text = strProduct

The code is simpler, but the result is the same:  no italicized text, although the HTML tags have been removed.

Assisted Solution

by:Eric Fletcher
Eric Fletcher earned 200 total points
ID: 10927146
Jason: The "\2" in the Replace with represents the 2nd group from the Find what part of the F&R dialog -- in this case, the "(*)" part which is whatever is between the html tags.

If you lost the html tags but didn't see any bold, you probably didn't have the format set to bold in the Replace with part of the dialog (be sure "Font: Bold" appears). If you do it manually per my instructions, it will definitely work.

However, if you recorded a macro to do it, you'll need to add some extra lines to manage the formatting part of the replace. For some reason, Word doesn't seem to record that part of the dialog! Here is what gets recorded with my added lines as indicated:

Sub Macro2()
    With Selection.Find
        .Text = "(\<b\>)(*)(\</b\>)"
'-- set bold for the find part to false (not essential but good practice)
        .Font.Bold = False
        .Replacement.Text = "\2"
'-- set the format for the replacement text to bold
        .Replacement.Font.Bold = True
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
End Sub

Note that this leaves the F&R box 'loaded' with the options you've set up: if you are doing a macro, it would be a good idea to end it by resetting the F&R options (clear formatting, turn off wildcards...)
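A minimal sketch of how the macro above might be extended to handle both <b> and <i> and then reset the F&R options afterwards. The names ConvertSimpleTags and ConvertTag are hypothetical, not part of Word. As far as I know, wildcard searches are case-sensitive, so the pattern lists both the lower-case and upper-case tag letter.

```vba
' Sketch only: ConvertSimpleTags and ConvertTag are hypothetical helper names.
Sub ConvertSimpleTags()
    ConvertTag "b", True, False     ' <b>...</b> becomes bold
    ConvertTag "i", False, True     ' <i>...</i> becomes italic

    ' Reset Find so later searches are not left with wildcards or formatting.
    With Selection.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = ""
        .Replacement.Text = ""
        .Format = False
        .MatchWildcards = False
    End With
End Sub

Private Sub ConvertTag(ByVal strLetter As String, _
                       ByVal blnBold As Boolean, ByVal blnItalic As Boolean)
    Dim strClass As String
    ' Character class matching either case of the tag letter, e.g. [bB]
    strClass = "[" & LCase$(strLetter) & UCase$(strLetter) & "]"

    With Selection.Find
        ' Clear formatting left over from a previous call
        .ClearFormatting
        .Replacement.ClearFormatting
        ' Only the text between the tags is grouped, so it is group 1
        .Text = "\<" & strClass & "\>(*)\</" & strClass & "\>"
        .Replacement.Text = "\1"
        If blnBold Then .Replacement.Font.Bold = True
        If blnItalic Then .Replacement.Font.Italic = True
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Because only one group is defined here, the replacement is "\1" rather than the "\2" used in the three-group pattern above. Nested tags such as <b><i>text</i></b> are not handled specially and would need checking against real data.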

Some time ago, I set up a somewhat similar macro to clean up HTML text for a very specific project. It used the approach above to convert internal formatting html tags (bold, italic, strong) as well as change paragraphs with bullets and heading level styles into Word format. I could probably dredge it up from some backup but it would have to wait until after tax time (Apr 30 in Canada!). This should give you a good start on getting the same thing.

Accepted Solution

gilbar earned 400 total points
ID: 10928057
Jason, you're losing your formatting when you put it into the string, instead try

then Selection.Paste after selecting where you want it (the cell or whatever)
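A minimal sketch of the copy-and-paste approach, reconstructed from the description in this thread rather than taken from any poster's actual code. The procedure name PasteHtmlAsFormatted, the parameter names, and the temp file location are all assumptions. The idea is to let Word convert the HTML when it opens the .htm file, then copy the formatted content instead of assigning it to a string.

```vba
' Sketch only: PasteHtmlAsFormatted, strHtml and rngTarget are hypothetical names.
Sub PasteHtmlAsFormatted(ByVal strHtml As String, ByVal rngTarget As Range)
    Dim strPath As String
    Dim intFile As Integer
    Dim docTemp As Document

    ' Write the HTML fragment to a temporary .htm file (path is an assumption).
    strPath = Environ("TEMP") & "\WordTemp.htm"
    intFile = FreeFile
    Open strPath For Output As #intFile
    Print #intFile, "<html><body>" & strHtml & "</body></html>"
    Close #intFile

    ' Word converts the HTML to formatted text when it opens the file.
    Set docTemp = Documents.Open(FileName:=strPath, ConfirmConversions:=False, _
                                 ReadOnly:=True, AddToRecentFiles:=False)

    ' Copy the formatted content (not a String) and paste it at the target.
    docTemp.Content.Copy
    rngTarget.Paste

    docTemp.Close SaveChanges:=wdDoNotSaveChanges
    Kill strPath    ' remove the temporary file
End Sub
```

It might be called as PasteHtmlAsFormatted strProduct, rowNew.Range.Cells.Item(2).Range, using the names from the earlier code. The copied content includes the temp document's final paragraph mark, so the pasted text may end with an extra paragraph that needs trimming. Assigning rngTarget.FormattedText = docTemp.Content.FormattedText is an alternative that avoids the clipboard, though I have not compared the two on table cells.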

Author Comment

ID: 10931454
Thank you, both Eric and Gilbar.

Creating a new document and mixing the Selection.WholeStory, .Cut and .Paste worked.  It converted <b>, <i>, &reg; and <img src="http://"> into display in my original Word Document.  And if any new tags (<u> for example) appear later, I won't have to recode in order to get it to work.

The other 2 ideas were very, very helpful, and through them I feel confident I could have "hacked" everything except for the <img src> tags through that.  But this works much better.

I GREATLY appreciate your time!!!


Expert Comment

ID: 10931544
you're welcome jason!
now humor a person who is obviously going senile and remind me what &reg; and &nbsp; do.  I used to know, back in the twentieth century :)

Author Comment

ID: 10932083
me too, I haven't done HTML since 1999!

&nbsp; is non-breaking space.  it's a " ".
&reg; is a Registered symbol.  The ® symbol.
There's one for TM as well...
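For anyone taking the Find and Replace route instead of opening the text as HTML, here is a minimal sketch of swapping these entities for the real characters. The procedure name ReplaceSimpleEntities is hypothetical; "^s" is Word's replace code for a non-breaking space, and &trade; is the entity for the TM symbol.

```vba
' Sketch only: ReplaceSimpleEntities is a hypothetical name.
Sub ReplaceSimpleEntities()
    Dim varPairs As Variant
    Dim i As Long

    ' Entity, replacement, entity, replacement, ...
    varPairs = Array("&nbsp;", "^s", _
                     "&reg;", ChrW(174), _
                     "&trade;", ChrW(8482))

    For i = LBound(varPairs) To UBound(varPairs) Step 2
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = varPairs(i)
            .Replacement.Text = varPairs(i + 1)
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchCase = False
            .MatchWildcards = False     ' plain text search, no wildcards
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

This only covers the three entities mentioned here; any others would need to be added to the list by hand.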
