displaying URLs with lynx -dump


I am programming a PHP script in which I need to convert web pages to plain text. Currently, I am using something like this:

$text =`lynx -cfg lcfg.cfg -dump http://www.google.com/`;

which simply assigns the text dumped by lynx to the variable $text. That is, it is equivalent to running this from the shell:
lynx -cfg lcfg.cfg -dump http://www.google.com/

This produces the following output:


    Web    [1]Images    [2]Groups    [3]News    [4]Froogle    [5]more »

   Google Search I'm Feeling Lucky   [6]Advanced Search
     [8]Language Tools

    [9]Advertising Programs - [10]Business Solutions - [11]About Google

              ©2004 Google - Searching 4,285,199,774 web pages


   1. http://www.google.com/imghp?hl=en&tab=wi&ie=UTF-8
   2. http://www.google.com/grphp?hl=en&tab=wg&ie=UTF-8
   3. http://www.google.com/nwshp?hl=en&tab=wn&ie=UTF-8
   4. http://www.google.com/froogle?hl=en&tab=wf&ie=UTF-8
   5. http://www.google.com/options/index.html
   6. http://www.google.com/advanced_search?hl=en
   7. http://www.google.com/preferences?hl=en
   8. http://www.google.com/language_tools?hl=en
   9. http://www.google.com/ads/
  10. http://www.google.com/services/
  11. http://www.google.com/about.html

This is great, except I would like the URLs to be included in the text itself. For example, instead of  
I would like something like

Does anyone know of a command-line flag or configuration setting that would let me do something like this with lynx? I am new to lynx, so I may need a little help getting it to work.

Thanks in advance for your help.

Karl Heinz Kremer commented:
I don't think this is possible. I'm not aware of any configuration setting or command-line option that would do this. w3m (which offers similar functionality to lynx) does not offer this feature either.

nicholassolutions (Author) commented:
Thanks, that is kind of what I thought...
I think the same. However, it should be possible to use or modify some existing scripts to get the kind of functionality you are looking for. I shall try to post a script later in the day/week.

I think it should be easier to do it using Perl. What do you say, khkremer?
nicholassolutions (Author) commented:
Yes, I'd thought of that too. Since each referenced link appears as e.g. [1]link1, [2]link2, etc., it is not too hard to tack on the links given the references. I was just looking for the "easy way out". Actually, I was concerned about pages containing bracketed numbers confusing my parser, or at least that is my story ;)

Thanks to both of you for your help. I'll assign points shortly.
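
For anyone landing here later, the post-processing idea discussed above can be sketched in PHP roughly as follows. This is only a sketch under some assumptions: the function name inlineLynxLinks is made up for illustration, the output format ([url] in place of [n]) is just one possible choice, and it assumes the numbered URL list is the last block of the dump (optionally preceded by a "References" heading, which some lynx versions print). To reduce the risk of bracketed numbers in the page text confusing the parser, it only rewrites [n] markers whose number actually appears in that trailing list.

```php
<?php
// Hypothetical helper (not part of lynx or PHP): moves the URLs from the
// numbered list at the end of `lynx -dump` output into the body text.
function inlineLynxLinks(string $dump): string
{
    $lines = preg_split('/\R/', rtrim($dump));

    // Walk backwards from the end, collecting lines of the form "  n. url".
    $urls = [];
    $i = count($lines) - 1;
    while ($i >= 0 && preg_match('/^\s*(\d+)\.\s+(\S+)\s*$/', $lines[$i], $m)) {
        $urls[(int) $m[1]] = $m[2];
        $i--;
    }

    // No trailing link list found: return the dump untouched.
    if (!$urls) {
        return $dump;
    }

    // Drop the blank lines (and the optional "References" heading) that
    // separate the body from the link list.
    while ($i >= 0 && (trim($lines[$i]) === '' || trim($lines[$i]) === 'References')) {
        $i--;
    }
    $body = implode("\n", array_slice($lines, 0, $i + 1));

    // Replace [n] with [url], but only for numbers present in the list;
    // any other bracketed number in the page text is left as it is.
    return preg_replace_callback(
        '/\[(\d+)\]/',
        function (array $m) use ($urls): string {
            $n = (int) $m[1];
            return isset($urls[$n]) ? '[' . $urls[$n] . ']' : $m[0];
        },
        $body
    );
}

// Possible usage with the command from the question (assumes lynx is installed):
// $text = inlineLynxLinks((string) `lynx -cfg lcfg.cfg -dump http://www.google.com/`);
```

With this, a body fragment such as "[1]Images" would come out as "[http://www.google.com/imghp?hl=en&tab=wi&ie=UTF-8]Images", and the trailing list is removed. A page whose own text ends in lines that look like "n. something" could still fool the backwards scan, so treat it as a starting point rather than a robust parser.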

Question has a verified solution.