nighi asked:

Scraping using PHP and xpath

Hi,

I am looking for some help scraping the links from the location below using PHP (cURL + XPath). I tried the XPath approach in PHP but was unsuccessful. If XPath cannot be used, any other PHP mechanism is also acceptable.

Input Link: http://network.nighi.com/video?page=2

Output: 20 links (e.g. http://network.nighi.com/video/video/show?id=549954%3AVideo%3A140678)
Graceful_Penguin

The problem with this page is that it is not well-formed XML. As I see it, you have two options: use XML/XPath, or use a regular expression.
I would suggest the XML route because the data makes more sense when treated as XML. To run XPath or XSLT against this page, you first need to convert it into proper XHTML or XML. Try loading the page into a DOMDocument with the loadHTML() function. If that works, you should be able to use XPath or XSLT to get the links. Just ask if you have any issues.
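
The loadHTML() suggestion above could look roughly like the sketch below. It is only an illustration, not code from this thread: the function names fetch_html() and extract_video_links(), the $origin parameter, and the XPath expression are all assumptions, and the expression simply guesses that the wanted links contain "/video/video/show?id=" as in the example output.

```php
<?php
// Sketch only: function names, the $origin parameter, and the XPath
// expression are illustrative assumptions, not code from this thread.

/**
 * Download a page with cURL and return its body, or null on failure.
 */
function fetch_html(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT        => 30,
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

/**
 * Parse (possibly malformed) HTML and return the unique video links.
 * Hrefs starting with "/" are prefixed with $origin; protocol-relative
 * ("//host/...") hrefs are not handled specially in this sketch.
 */
function extract_video_links(string $html, string $origin): array
{
    if (trim($html) === '') {
        return [];  // loadHTML() rejects empty input
    }

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//a[contains(@href, "/video/video/show?id=")]') as $a) {
        $href = $a->getAttribute('href');
        if (strpos($href, '/') === 0) {
            $href = rtrim($origin, '/') . $href;
        }
        $links[$href] = true;  // array keys de-duplicate repeated links
    }
    return array_keys($links);
}
```

To try it against the page in the question, something like `$html = fetch_html('http://network.nighi.com/video?page=2');` followed by a loop over `extract_video_links($html, 'http://network.nighi.com')` should print the links, assuming the page markup matches the guess above. If the links are marked up differently, only the XPath expression needs to change.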
ASKER CERTIFIED SOLUTION
Ray Paseur


ASKER

Thanks for your help, guys. I used Ray's code (slightly modified to output only links).