I am writing a PHP script that needs to convert web pages to plain text. Currently, I am using something like this:
$text = `lynx -cfg lcfg.cfg -dump http://www.google.com/`;
which simply assigns the text dumped by lynx to the variable $text. That is, it is equivalent to running this from the shell:
lynx -cfg lcfg.cfg -dump http://www.google.com/
This produces the following output:
Web Images Groups News Froogle more »
Google Search I'm Feeling Lucky Advanced Search
Advertising Programs - Business Solutions - About Google
©2004 Google - Searching 4,285,199,774 web pages
This is great, except I would like the URLs to be included in the text itself. For example, instead of
I would like something like
Does anyone know a command-line flag or configuration that would let me do something like this with lynx? I am new to lynx, so I may need a little help getting it to work.
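For context, here is a minimal sketch of how the lynx call could be wrapped in PHP. It assumes lynx is on the PATH and that the config file is named lcfg.cfg as above; the helper names build_lynx_command and dump_page are made up for illustration and are not part of my actual script. PHP's backtick operator behaves like shell_exec, so the sketch uses shell_exec and escapes both arguments in case the URL comes from a variable.

```php
<?php
// Hypothetical helper (illustrative name): builds the lynx command line
// with both arguments shell-escaped, so a URL held in a variable is safe to pass.
function build_lynx_command(string $url, string $cfg = 'lcfg.cfg'): string
{
    return 'lynx -cfg ' . escapeshellarg($cfg) . ' -dump ' . escapeshellarg($url);
}

// Hypothetical helper (illustrative name): runs lynx and returns the dumped
// text, or null if the command failed or produced no output.
function dump_page(string $url, string $cfg = 'lcfg.cfg'): ?string
{
    $output = shell_exec(build_lynx_command($url, $cfg));
    return is_string($output) ? $output : null;
}
```

With these helpers, the original line would become $text = dump_page('http://www.google.com/'); and any flag that turns out to solve the URL problem could be added in one place, inside build_lynx_command.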
Thanks in advance for your help.