mmalik15

asked on

how to write a crawler to extract all links from a website

I want to write an application that accepts a start URL of a website and returns all the links present on that website, writing them to a text file.
Any ideas?
thanks
Om Prakash

Please check the following link:
http://www.dotnetperls.com/scraping-html
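The article above demonstrates HTML scraping in C#. As a language-agnostic illustration of the same idea, here is a minimal sketch in Python using only the standard library; the class and function names are my own, not from the article:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    """Return all href targets found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

# Writing the results to a text file, as the question asks:
# with open("links.txt", "w") as f:
#     f.write("\n".join(extract_links(page_html)))
```

A proper HTML parser is generally more robust than a regular expression here, since href attributes can use single quotes, double quotes, or extra whitespace.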
SOLUTION
kaufmed

This solution is only available to members of Experts Exchange.
ASKER CERTIFIED SOLUTION
This solution is only available to members of Experts Exchange.
mmalik15

ASKER

Many thanks guys.
NP. Glad to help  = )

One thing to note is that my code is not a full solution. For example, you may end up crawling the whole web, because my code just searches for all links on a page; it doesn't confirm whether the links point to the current site. You should be able to account for this easily; just know that that "flaw" is there.
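One simple way to address that flaw is to resolve each discovered link against the start URL and compare hosts before following it. A minimal sketch in Python (the helper name is mine, not from any posted solution):

```python
from urllib.parse import urljoin, urlparse

def same_site(start_url, link):
    """Return True if `link`, resolved against `start_url`, stays on
    the same host.

    Relative links (e.g. "/about") resolve to the start host, so they
    pass; absolute links to other hosts are rejected.
    """
    resolved = urljoin(start_url, link)
    return urlparse(resolved).netloc == urlparse(start_url).netloc
```

A crawler would call this on every extracted link and only enqueue the ones that pass, alongside a visited-URL set to avoid fetching the same page twice.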