Solved

Download web pages quickly

Posted on 2011-03-18
Last Modified: 2012-05-11
Here is a web page with many links. I don't want to click them one by one.
Is there a way to extract the links and download them quickly?

http://msdn.microsoft.com/en-us/library/ff846392.aspx

Thanks
Question by:zhshqzyc
6 Comments
 
LVL 18

Expert Comment

by:Dennis Aries
ID: 35166242
You can download the page and parse it as an XML-file.
After that, you can loop through the a-tags and download the given reference.

Developer.Com has a nice article on parsing HTML to an XML-document that might be of some use to you.
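As a rough illustration of the download, parse, and loop approach described above, here is a minimal sketch in Python. It uses the standard library's `html.parser` instead of an XML parser, because real-world HTML is often not well-formed XML. The names `LinkCollector`, `extract_links`, and `download_all` are made up for this example, and the relative link in the sample is hypothetical, included only to show how relative references can be resolved.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin leaves absolute URLs alone and resolves relative ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def download_all(links):
    """Fetch each link in turn; returns a dict of URL -> raw bytes."""
    pages = {}
    for url in links:
        with urlopen(url) as response:
            pages[url] = response.read()
    return pages


# Sample markup; the second (relative) link is a hypothetical example.
sample = (
    '<div><a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" '
    'title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>'
    '<div><a href="ff846437.aspx">AllowEditRange Object</a></div>'
)
links = extract_links(sample, "http://msdn.microsoft.com/en-us/library/ff846392.aspx")
print(links)
```

Calling `download_all(links)` would then fetch each page over the network; it is left uncalled here so the sketch runs offline.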
 
LVL 6

Expert Comment

by:akajohn
ID: 35166249
If you want to mirror a web site (offline), try Teleport Pro.

http://www.tenmax.com/teleport/pro/home.htm

Otherwise, if I have a lot of links to download, I normally add all of them to a text file and then ask wget (www.gnu.org/software/wget/) to download them for me.
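For anyone who wants to script the text-file-plus-wget workflow, a small Python sketch might look like the following. The helper names `write_url_list` and `wget_command`, the file name `links.txt`, and the output directory `msdn_pages` are all assumptions for illustration. The only wget options used are `-i` (read URLs from a file) and `-P` (directory to save into).

```python
import os
import tempfile


def write_url_list(urls, path):
    """Write one URL per line, the format wget reads with -i."""
    with open(path, "w", encoding="utf-8") as f:
        for url in urls:
            f.write(url + "\n")


def wget_command(list_path, dest_dir):
    # -i reads URLs from a file; -P sets the directory downloads are saved to.
    return ["wget", "-i", list_path, "-P", dest_dir]


urls = [
    "http://msdn.microsoft.com/en-us/library/ff846370.aspx",
    "http://msdn.microsoft.com/en-us/library/ff846437.aspx",
]
list_path = os.path.join(tempfile.mkdtemp(), "links.txt")
write_url_list(urls, list_path)
command = wget_command(list_path, "msdn_pages")
print(" ".join(command))
# To actually run it (requires wget on PATH):
#   import subprocess; subprocess.run(command, check=True)
```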
 

Author Comment

by:zhshqzyc
ID: 35166410
Your methods are good but not specific to these web pages. I want to write code to download them.
Please note that the links have a format like
<a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>
<a href="http://msdn.microsoft.com/en-us/library/ff846437.aspx" title="AllowEditRange Object">AllowEditRange Object</a></div>

Is there a regular expression to collect the links?
 
LVL 6

Expert Comment

by:akajohn
ID: 35166453
So if I understood you correctly, you want to write code in C# or VB to extract links from a web page and then download them individually?

Thanks for clarifying.

A>
 

Accepted Solution

by:
zhshqzyc earned 0 total points
ID: 35166482
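Not sure whether this is right or not:
// Needs System.Text.RegularExpressions. Group 1 captures the URL; no ^ anchor, so links anywhere in the page match.
Regex reg = new Regex(@"<a\s+href=""(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)""");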
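A sketch of the same idea in Python rather than C#, run against the two sample links quoted earlier in the thread. The third, unrelated link and the name `LINK_RE` are assumptions added to show that only matching MSDN library URLs are collected. A regular expression like this is only reasonable because the markup is known to be this uniform; for arbitrary HTML a real parser is usually safer.

```python
import re

# Capture group 1 is the URL; dots are escaped so they match literally.
LINK_RE = re.compile(
    r'<a\s+href="(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)"'
)

html = '''<a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>
<a href="http://msdn.microsoft.com/en-us/library/ff846437.aspx" title="AllowEditRange Object">AllowEditRange Object</a></div>
<a href="http://example.com/other.aspx">Unrelated link</a>'''

# With a single capture group, findall returns the captured URLs as strings.
urls = LINK_RE.findall(html)
for url in urls:
    print(url)
```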

 

Author Closing Comment

by:zhshqzyc
ID: 35225389
Figured it out by myself.
Question has a verified solution.