I want to strip the markup out of an HTML document and present what is left as simple text. So say I have this line of HTML text, pulled from an HTML doc:
"<LI><A HREF="http://www.mywebsite.com">My web site link!</A><BR>"
How could I parse this down to just "http://www.mywebsite.com" and "My web site link!"?
I tried putting a single line, read from the HTML, into a scalar variable, but the s/// substitution didn't work. If $myVar is set to the above line of HTML, then shouldn't:
$_ = $myVar;
s/(\< | \<\/)\..\>//g; #or something close to that
$myVar = $_;
at least strip out the <LI>, the </A>, and the <BR>? (Or, for that matter, any tag in the <xx> or </xx> form?) I get nothing back, no change to $myVar at all, even when I tried s/href/dogs/g; What am I doing wrong? I'd also appreciate it if anyone can clean up the code in general. I think the problem is the quotes that come in with the HTML line. Do I have to break it all down, escape the quotes, put it all back together, and then parse?