crowegreg
parsing HTML tags within text field in an Access table
In my .mdb file, the description field of the products table contains HTML tags mixed in with the data. I need to remove the HTML tags but keep all the data, and I can't figure out an easy way to do it.
cheers Robert, I'd forgotten about the & codes!
s46.
ASKER
This table gets recreated on a daily basis, so I need to write a procedure to handle this. I'll start working from the code above.
1. select all the data from the table
2. paste it into the code pane of a WYSIWYG HTML editor
3. switch to the browser (display) pane
4. copy the resulting output and paste it into Notepad to strip the formatting
5. copy and paste the unformatted text back into the table (or import it).
The chances of this working are slim, especially if you have line breaks (<br />), paragraphs (<p />), lists (<ul /> or <ol />) or similar.
The only other option I can think of is to write a function to:
1. loop through the table, returning one description field at a time.
2. for each field, store the contents in a string variable
3. go to the first character.
4. look for the next < character.
5. from this < character, go on to the first > character.
6. look at the data between the < and >. If it looks like an HTML tag, delete the string, including the < and >; if not, leave it in place and go on to step 7.
7. repeat from step 4 until no more < characters are found.
8. repeat from step 2 until you reach the end of the table.
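The steps above could be sketched roughly as follows. This is only an illustration of the loop, written in Python rather than Access VBA, so it would need porting before it could run inside the mdb. The names strip_tags, clean_descriptions and TAG_BODY are made up for this example, the table is stood in for by a plain list of rows, and the rule for "looks like an HTML tag" (an optional /, a name starting with a letter, then optional attributes) is an assumption, not a full HTML parser.

```python
import re

# Assumption: something "looks like an HTML tag" if the text between < and >
# is an optional "/", a name starting with a letter, optional attributes
# after whitespace, and an optional trailing "/".
TAG_BODY = re.compile(r"/?[A-Za-z][A-Za-z0-9]*(\s[^<>]*)?/?")

def strip_tags(text):
    """Remove anything that looks like an HTML tag; keep all other text."""
    out = []
    pos = 0                                  # step 3: start at the first character
    while True:
        lt = text.find("<", pos)             # step 4: next < character
        if lt == -1:
            out.append(text[pos:])           # no more < characters: keep the rest
            break
        gt = text.find(">", lt + 1)          # step 5: first > after that <
        if gt == -1:
            out.append(text[pos:])           # unclosed <: keep the rest as-is
            break
        out.append(text[pos:lt])             # keep the text before the <
        if TAG_BODY.fullmatch(text[lt + 1:gt]):
            pos = gt + 1                     # step 6: a tag, drop it with its < and >
        else:
            out.append("<")                  # not a tag: keep the < and move on
            pos = lt + 1
    return "".join(out)

def clean_descriptions(rows):
    """Steps 1, 2 and 8: visit each row and clean its description field."""
    for row in rows:
        if row["description"] is not None:   # an empty (Null) field is skipped
            row["description"] = strip_tags(row["description"])
    return rows

# Hypothetical sample data standing in for the products table.
products = [{"description": "<p>Red <b>widget</b><br />3 < 5</p>"}]
print(clean_descriptions(products)[0]["description"])   # prints: Red widget3 < 5
```

Note that the words on either side of a removed <br /> or <p> end up run together ("widget3" above), so you may prefer to replace those tags with a space or a line break instead of deleting them. The "looks like a tag" test is also only a heuristic: text such as x<y and z>w would be treated as a tag. The & codes (for example &amp;) are left untouched here; in Python, html.unescape from the standard library could decode them afterwards.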
Hope this helps,
s46.