nicholassolutions asked:

displaying URLs with lynx -dump

Hello,

I am programming a PHP script in which I need to convert web pages to plain text. Currently, I am using something like this:

$text = `lynx -cfg lcfg.cfg -dump http://www.google.com/`;

which simply assigns the text dumped by lynx to the variable $text. That is, it is equivalent to running this from the shell:
lynx -cfg lcfg.cfg -dump http://www.google.com/

This produces the following output:


                                   Google

    Web    [1]Images    [2]Groups    [3]News    [4]Froogle    [5]more »

     _______________________________________________________
   Google Search I'm Feeling Lucky   [6]Advanced Search
     [7]Preferences
     [8]Language Tools

    [9]Advertising Programs - [10]Business Solutions - [11]About Google

              ©2004 Google - Searching 4,285,199,774 web pages

References

   1. http://www.google.com/imghp?hl=en&tab=wi&ie=UTF-8
   2. http://www.google.com/grphp?hl=en&tab=wg&ie=UTF-8
   3. http://www.google.com/nwshp?hl=en&tab=wn&ie=UTF-8
   4. http://www.google.com/froogle?hl=en&tab=wf&ie=UTF-8
   5. http://www.google.com/options/index.html
   6. http://www.google.com/advanced_search?hl=en
   7. http://www.google.com/preferences?hl=en
   8. http://www.google.com/language_tools?hl=en
   9. http://www.google.com/ads/
  10. http://www.google.com/services/
  11. http://www.google.com/about.html

This is great, except I would like the URLs to be included in the text itself. For example, instead of  
[1]Images
I would like something like
Images[http://www.google.com/imghp?hl=en&tab=wi&ie=UTF-8]

Does anyone know of a command-line flag or configuration option that would let me do something like this with lynx? I am new to lynx, so I may need a little help getting it to work.

Thanks in advance for your help.

Cheers,
Matt
ASKER CERTIFIED SOLUTION
Karl Heinz Kremer


ASKER

Thanks, that is kind of what I thought...
SOLUTION
sunnycoder

ASKER

Yes, I'd thought of that too. Since each referenced link appears in the text as e.g. [1]link1, [2]link2, etc., it is not too hard to tack on the URLs given the References list. I was just looking for the "easy way out". Actually, I was concerned about pages that contain bracketed numbers of their own confusing my parser, or at least that is my story ;)

Thanks to both of you for your help -- I'll assign pts shortly.
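
For anyone landing here later, the post-processing approach discussed above could be sketched in PHP roughly as follows. This is only an illustration under assumptions, not code from either answer: the function name inline_lynx_links() is made up, and it assumes the dump ends with a "References" section whose entries look like "   1. http://...", as in the output quoted in the question. Because the dump does not mark where a link's text ends, the sketch puts the URL where the [n] marker was (so [1]Images becomes [http://...]Images) rather than after the link text. Bracketed numbers with no matching reference entry are left alone, though a literal "[2]" in the page text would still be mistaken for link 2.

```php
<?php
// Sketch only: inline the URLs from the trailing "References" list of a
// `lynx -dump` result. inline_lynx_links() is a hypothetical helper name.
function inline_lynx_links(string $dump): string
{
    // Find the LAST line that consists only of "References"; earlier
    // occurrences could be ordinary page text.
    if (!preg_match_all('/^References[ \t]*$/m', $dump, $m, PREG_OFFSET_CAPTURE)) {
        return $dump; // no reference list, nothing to do
    }
    $last = end($m[0]);          // [matched text, byte offset]
    $body = substr($dump, 0, $last[1]);
    $refs = substr($dump, $last[1]);

    // Collect entries of the form "  12. http://example.com/".
    $urls = [];
    if (preg_match_all('/^[ \t]*(\d+)\.[ \t]+(\S+)[ \t]*$/m', $refs, $entries, PREG_SET_ORDER)) {
        foreach ($entries as $entry) {
            $urls[(int) $entry[1]] = $entry[2];
        }
    }
    if (!$urls) {
        return $dump; // heading found but no parsable entries
    }

    // Swap each [n] marker for [URL], but only when n is a known reference;
    // other bracketed numbers are passed through unchanged.
    $body = preg_replace_callback('/\[(\d+)\]/', function ($hit) use ($urls) {
        $n = (int) $hit[1];
        return isset($urls[$n]) ? '[' . $urls[$n] . ']' : $hit[0];
    }, $body);

    // Drop the reference list and normalise the trailing whitespace.
    return rtrim($body) . "\n";
}

// Possible usage with the command from the question (not run here):
// $text = inline_lynx_links((string) `lynx -cfg lcfg.cfg -dump http://www.google.com/`);
```

Moving the URL after the link text, as asked for in the question, would need a guess about where each label ends (for example at the next marker or at a run of spaces), which is why the sketch leaves it out.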