I am making a little freeware app that needs to work with the plain text of an HTML page. I've found some code that does simple tag stripping (the same approach I had in mind: just remove everything between '<' and '>'), but I fear this may not be very reliable.
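To illustrate the concern, here is a minimal sketch of that naive approach. It is written in Python purely as an assumption, since the app's actual language isn't stated, and `strip_tags_naive` and the sample markup are hypothetical names made up for this example.

```python
import re


def strip_tags_naive(html: str) -> str:
    """Remove everything between a '<' and the next '>' (hypothetical helper)."""
    return re.sub(r"<[^>]*>", "", html)


# Made-up sample: one paragraph with an entity, followed by an inline script.
sample = '<p>Hello &amp; welcome</p><script>if (a > b) { alert("hi"); }</script>'

print(strip_tags_naive(sample))
# prints: Hello &amp; welcomeif (a > b) { alert("hi"); }
```

The tags are gone, but the script body is left in the output, the `&amp;` entity is not decoded, and nothing separates the paragraph text from what follows it.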
So my question is:
Should I just take out everything between tag openers and closers, or is there a more intelligent solution for stripping all code from an HTML page and leaving only the readable text?
Any ideas are welcome. Thank you.