Stripping HTML from an Excel column and replacing it with plain text

Hi,

I have an Excel file that has a field called "Description".  This field has got HTML in it, and it prints 2-3 HTML formatted lines.  The format includes bulleted lists and bolded/italicized text.

Does someone have a way of converting this ENTIRE column into plain text, i.e. stripping of the HTML completely?
jmohsinAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

wasiftoorCommented:
You can use the following function

Function HtmlToText(sHTML) As String
  Dim oDoc As HTMLDocument
  Set oDoc = New HTMLDocument
  oDoc.body.innerHTML = sHTML
  HtmlToText = oDoc.body.innerText
End Function

Open in new window

jmohsinAuthor Commented:
Can you please provide some simple steps on how to use this function in Excel 2010? I have an entire column that has the HTML; I want to replace the HTML in every single cell in that column with plain text.
BTW, with this function, what happens to the bulleted lists?
wasiftoorCommented:
Understood. I will provide instructions and provide a sample file as well that you can use:

Step-1: Press Alt+F11 to open up MS Visual Basic to enter your VBA code
Step-2: Add Reference to Microsoft HTML Object. Go To Tools > References . Select Microsoft HTML Object Library.
Step-3: Right click on the the "ThisWorkbook" node under VBA Project. Select Insert Module. This will create Module 1
Step-4: Paste the Above code in the Code Window on the Right. Hit Save
Step-5: If column A is the column with HTML, than you need to use this formula in the adjacent column. If A1 has HTML you can now use the following formula in B1.

=HtmlToText(A1)

Open in new window


Once the text is extracted you can drag the same formula to all cells in the column and this will extract the text. I am including a file that will show exactly what i have described. I am put HTML code with unordered list (bullets) in the column to demonstrate the exact behavior. Click on cells B1 and B2 to see how the formula is applied.

Hopefully this will help. Good Luck!
HTMLTextExtract.xlsm

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
jmohsinAuthor Commented:
Thanks for the excellent response!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Excel

From novice to tech pro — start learning today.