Solved

How to extract all the hyperlinks on this webpage

Posted on 2012-12-29
Last Modified: 2012-12-29
On this web page, http://www.scie-socialcareonline.org.uk/topics.asp?guid=64f07a36-85f2-4aac-a862-61b9116190ad, clicking "expand all" in the list of browse topics reveals the full topic tree. How can we extract all the hyperlinks with titles like "adoption", "access to birth records", etc.?
Question by:mmalik15
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 500 total points
ID: 38729997
The subsections are displayed by simply changing the display style from none to block, and the links exist in the source HTML (i.e. they are not pulled via AJAX). For this reason you should be able to just select all the links within that section.

If you're still using Html Agility Pack, then you could do:

var links = doc.DocumentNode.SelectNodes("//span[@class='branch']//a[not(starts-with(@href, 'javascript:'))]");
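
A fuller sketch of how that selection might be used, assuming the Html Agility Pack NuGet package is referenced. The inline sample markup, the `BranchLinks` class, and the `Extract` helper are invented for illustration and only mimic the structure described above (links inside `span` elements with class `branch`, some of them hidden with `display:none`). Against the real page you would load the document with `HtmlWeb.Load` instead of `LoadHtml`.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class BranchLinks
{
    // Hypothetical helper: returns (title, href) pairs for every real link
    // found under a span with class "branch".
    public static List<KeyValuePair<string, string>> Extract(HtmlDocument doc)
    {
        var result = new List<KeyValuePair<string, string>>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(
            "//span[@class='branch']//a[not(starts-with(@href, 'javascript:'))]");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        if (nodes == null) return result;

        foreach (HtmlNode a in nodes)
        {
            string title = HtmlEntity.DeEntitize(a.InnerText).Trim();
            string href = a.GetAttributeValue("href", string.Empty);
            result.Add(new KeyValuePair<string, string>(title, href));
        }
        return result;
    }

    static void Main()
    {
        // Made-up sample markup, not copied from the real page.
        const string html = @"
<span class='branch'>
  <a href='javascript:toggle(1)'>+</a>
  <a href='topics.asp?id=1'>adoption</a>
</span>
<span class='branch' style='display:none'>
  <a href='topics.asp?id=2'>access to birth records</a>
</span>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (var link in Extract(doc))
            Console.WriteLine("{0} -> {1}", link.Key, link.Value);
    }
}
```

The null check matters in practice: iterating the result of `SelectNodes` directly fails with a `NullReferenceException` whenever the XPath matches nothing.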

 

Author Comment

by:mmalik15
ID: 38730014
Many thanks again, kaufmed.

How can I exclude the RSS link in the XPath? Apart from that, it's working fine.

Also, could you kindly point me to an XPath tool for extracting information from the HTML DOM, or tell me the best approach to writing XPath for the HTML DOM?
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38730026
Oh, sorry. I meant to exclude that as well:

var links = doc.DocumentNode.SelectNodes("//span[@class='branch']//a[not(starts-with(@href, 'javascript:')) and not(starts-with(@href, 'rss/'))]");


 

Author Comment

by:mmalik15
ID: 38730035
Brilliant, kaufmed. It's working perfectly.

I use Altova to test XPath against XML documents, but I wonder if there is a similar tool for testing against the HTML DOM.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38730041
I don't know of any. HTML is becoming more in line with XML with the new standards that are released, and most of the frameworks people use today to build HTML emit well-formed markup (similar to XML). As such, you should be able to use Altova on any well-formed HTML. Strictly speaking, HTML is not a subset of XML (both derive from SGML, and XHTML is the XML-conformant variant), so this only works when the page happens to parse as XML. Unless you are dealing with someone who hand-codes their web page, you should be OK using Altova.
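
If no dedicated tool is at hand, one low-tech way to try an XPath expression against well-formed markup is a few lines with .NET's built-in `System.Xml` classes. This is only a sketch: the sample markup and the `XPathScratchpad` class are invented, and it works only when the input parses as XML (no unclosed tags, no undeclared entities such as `&nbsp;`, and no default XHTML namespace, which would need an `XmlNamespaceManager`).

```csharp
using System;
using System.Xml;

static class XPathScratchpad
{
    static void Main()
    {
        // Made-up, well-formed sample markup.
        const string markup = @"<div>
  <span class='branch'>
    <a href='javascript:toggle(1)'>+</a>
    <a href='rss/adoption.xml'>RSS</a>
    <a href='topics.asp?id=1'>adoption</a>
  </span>
</div>";

        var xml = new XmlDocument();
        xml.LoadXml(markup); // throws XmlException if the markup is not well-formed

        // The expression under test: skip javascript: and rss/ links.
        const string xpath =
            "//span[@class='branch']//a[not(starts-with(@href, 'javascript:'))" +
            " and not(starts-with(@href, 'rss/'))]";

        XmlNodeList hits = xml.SelectNodes(xpath);
        foreach (XmlNode node in hits)
            Console.WriteLine("{0} -> {1}", node.InnerText, node.Attributes["href"].Value);
    }
}
```

Because the same XPath 1.0 expression is accepted by both `XmlDocument` and Html Agility Pack, an expression worked out this way can usually be pasted into the HAP call unchanged.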
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 38730044
P.S.

One of the reasons HTML Agility Pack is so popular is that the team sought to make a library that could handle (as best as one can) malformed HTML. HAP takes some liberties in making the source HTML well-formed so that you can use XPath against the loaded document.
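
A small sketch of that tolerance, again assuming the Html Agility Pack package is referenced. The broken sample markup and the `MalformedDemo` class are invented: the `li` and `a` elements are never closed and one attribute value is unquoted, so a strict XML parser would reject the input, yet the document can still be loaded and queried. How HAP repairs the tree is up to the library, so the sketch simply prints whatever it finds.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class MalformedDemo
{
    static void Main()
    {
        // Made-up markup with unclosed <li>/<a> tags and an unquoted attribute.
        const string html =
            "<ul><li><a href=topics.asp?id=1>adoption" +
            "<li><a href='topics.asp?id=2'>access to birth records</ul>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html); // loads the document even though the markup is malformed

        // ParseErrors lists the problems HAP noticed while loading.
        Console.WriteLine("Parse errors reported: {0}", doc.ParseErrors.Count());

        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode a in links)
                Console.WriteLine(a.GetAttributeValue("href", string.Empty));
        }
    }
}
```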
 

Author Closing Comment

by:mmalik15
ID: 38730054
Thanks, kaufmed. It's worth having an EE membership because of the presence of people like you!