I've got a CSV file exported from MySQL. Originally it served as the database for a website, so the text inside the table contains tons of HTML/XML tags that control the position and styling of the text on a page.
That's all well and good, but now that I need just the text out of it, it's quite a nightmare. There are hundreds of different formatting tags in the document, so there is no way I could remove them all by hand.
Can you think of a way to strip all tags from the document? I found some solutions using PHP, but I don't know PHP, so I can't adapt them.
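To make the goal concrete, here is a minimal sketch of the kind of thing I'm after, assuming Python is an acceptable alternative to PHP. It uses only the standard library (`csv` and `html.parser`). The names `TagStripper`, `strip_tags`, `strip_csv`, and the file names in the comment at the end are hypothetical, and I'm assuming the file is UTF-8 and parses as ordinary comma-separated CSV.

```python
import csv
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects only the text found between tags (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for every run of text that is not part of a tag.
        self.parts.append(data)


def strip_tags(text):
    """Return `text` with all HTML/XML tags removed."""
    stripper = TagStripper()
    stripper.feed(text)
    stripper.close()  # flush any text still buffered by the parser
    return "".join(stripper.parts)


def strip_csv(src, dst):
    """Copy CSV rows from file object `src` to `dst`, stripping tags from every cell."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        writer.writerow([strip_tags(cell) for cell in row])


# Example usage (file names are assumptions):
# with open("dump.csv", newline="", encoding="utf-8") as src, \
#      open("clean.csv", "w", newline="", encoding="utf-8") as dst:
#     strip_csv(src, dst)
```

Because the parser treats anything between `<` and `>` as a tag, it should not matter how many tag variations there are. One caveat: the contents of `<script>` and `<style>` elements are passed through as text, so those would need separate handling if they appear in the data.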