Solved

Download web pages quickly

Posted on 2011-03-18
Last Modified: 2012-05-11
Here is a web page with many links. I don't want to click them one by one.
Is there a way to extract the links and download them quickly?

http://msdn.microsoft.com/en-us/library/ff846392.aspx

Thanks
Question by:zhshqzyc
6 Comments
 
LVL 18

Expert Comment

by:Dennis Aries
ID: 35166242
You can download the page and parse it as an XML-file.
After that, you can loop through the a-tags and download the given reference.

Developer.Com has a nice article on parsing HTML to an XML-document that might be of some use to you.
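The approach above (download the page, parse it, loop over the a-tags) could be sketched roughly as follows. This is only an illustration in Python, and it swaps the XML parser for the standard library's lenient HTML parser, because real-world HTML is often not well-formed XML. The names LinkCollector, extract_links and download_all are made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Relative links are made absolute so they can be fetched later.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in the given HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def download_all(page_url):
    """Fetch page_url, then fetch every page it links to (needs network access)."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    pages = {}
    for link in extract_links(html, page_url):
        with urlopen(link) as response:
            pages[link] = response.read()
    return pages
```

Calling download_all("http://msdn.microsoft.com/en-us/library/ff846392.aspx") would return a dictionary mapping each linked URL to its raw bytes. In practice you would probably want to filter the links first, since a page usually links to far more than the content you are after.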
 
LVL 6

Expert Comment

by:akajohn
ID: 35166249
If you want to mirror a web site (offline), try Teleport Pro.

http://www.tenmax.com/teleport/pro/home.htm

Otherwise, if I have a lot of links to download, I normally add all of them to a text file and then ask wget (www.gnu.org/software/wget/) to download them for me.
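As a rough sketch of the text-file-plus-wget workflow, the Python below writes one URL per line and then hands the file to wget. It assumes wget is installed and on the PATH. The function names and the default file and directory names (links.txt, downloads) are placeholders invented for this example.

```python
import subprocess
from pathlib import Path


def write_link_list(links, list_path):
    """Write one URL per line, the format wget expects for an input file."""
    Path(list_path).write_text("\n".join(links) + "\n", encoding="utf-8")


def wget_command(list_path, dest_dir):
    # -i reads the URLs from a file; -P sets the directory the downloads go to.
    return ["wget", "-i", str(list_path), "-P", str(dest_dir)]


def download_with_wget(links, list_path="links.txt", dest_dir="downloads"):
    """Write the list, then run wget on it (needs wget and network access)."""
    write_link_list(links, list_path)
    subprocess.run(wget_command(list_path, dest_dir), check=True)
```

Keeping the command construction separate from the call makes it easy to print the command and run it by hand first.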
 

Author Comment

by:zhshqzyc
ID: 35166410
Your methods are good, but they are not specific to these web pages. I want to write code to download them.
Please note that the links have the following format:
<a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>
<a href="http://msdn.microsoft.com/en-us/library/ff846437.aspx" title="AllowEditRange Object">AllowEditRange Object</a></div>


Is there a regular expression that collects the links?

 
LVL 6

Expert Comment

by:akajohn
ID: 35166453
So, if I understood you correctly, you want to write code in C# or VB to extract the links from a web page and then download them individually?

Thanks for clarifying.

 

Accepted Solution

by:
zhshqzyc earned 0 total points
ID: 35166482
Not sure whether this is right:
Regex reg = new Regex(@"<a\s+href=""(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)""");
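For anyone who wants to try the idea out, here is a minimal Python sketch of a pattern for the link format quoted earlier in the thread. It is an illustration, not the asker's final code. Compared with the attempt above, it drops the leading ^ (which would only match at the very start of the input), escapes the literal dots, and puts the capture group around the URL itself. The name find_library_links is made up for this example.

```python
import re

# Captures the URL of each <a href="..."> that points at an ffNNNNNN.aspx library page.
LINK_PATTERN = re.compile(
    r'<a\s+href="(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)"'
)


def find_library_links(html):
    """Return every matching library URL, in document order."""
    return LINK_PATTERN.findall(html)


sample = (
    '<a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" '
    'title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>\n'
    '<a href="http://msdn.microsoft.com/en-us/library/ff846437.aspx" '
    'title="AllowEditRange Object">AllowEditRange Object</a></div>'
)

for url in find_library_links(sample):
    print(url)
```

The pattern relies on href being the first attribute and on double quotes, as in the sample lines. Markup that differs from that would need a looser pattern or a real HTML parser.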

 

Author Closing Comment

by:zhshqzyc
ID: 35225389
Figured it out by myself.