Solved

Download web pages quickly

Posted on 2011-03-18
Last Modified: 2012-05-11
Here is a web page with many links. I don't want to click them one by one.
Is there any way to extract the links and download them quickly?

http://msdn.microsoft.com/en-us/library/ff846392.aspx

Thanks
Question by:zhshqzyc
LVL 18

Expert Comment

by:Dennis Aries
ID: 35166242
You can download the page and parse it as an XML file.
After that, you can loop through the a tags and download each referenced URL.

Developer.com has a nice article on parsing HTML into an XML document that might be of some use to you.
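
As a rough illustration of the parse-and-loop idea, here is a minimal C# sketch. It assumes the page has already been downloaded and is well-formed XML (real-world HTML often is not, which is why a conversion step like the one in the article may be needed). The names LinkParser and ExtractHrefs are hypothetical, chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LinkParser
{
    // Returns the href value of every <a> element in a well-formed XML/XHTML string.
    // XDocument.Parse throws an XmlException if the markup is not well-formed.
    public static List<string> ExtractHrefs(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);
        return doc.Descendants()
                  // Compare the local name so an XHTML namespace does not hide the tags.
                  .Where(e => e.Name.LocalName == "a")
                  .Select(e => (string)e.Attribute("href"))
                  // Skip anchors that have no href attribute.
                  .Where(href => !string.IsNullOrEmpty(href))
                  .ToList();
    }
}
```

Each returned URL could then be passed to whatever download routine you prefer.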
LVL 6

Expert Comment

by:akajohn
ID: 35166249
If you want to mirror a website for offline use, try Teleport Pro.

http://www.tenmax.com/teleport/pro/home.htm

Otherwise, if I have a lot of links to download, I normally add them all to a text file and then ask wget (www.gnu.org/software/wget/) to download them for me.
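
For anyone who would rather do the same thing in code, a C# analogue of the text-file workflow might look like the sketch below. This is not how wget itself works, just an assumed equivalent: it reads one URL per line and saves each response to disk. BatchDownloader, FileNameFor and DownloadAllAsync are hypothetical names, and there is no retry or error handling.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class BatchDownloader
{
    // Derives a local file name from the last segment of the URL path.
    public static string FileNameFor(string url)
    {
        string name = Path.GetFileName(new Uri(url).AbsolutePath);
        // Fall back to a default name when the path ends in a slash.
        return string.IsNullOrEmpty(name) ? "index.html" : name;
    }

    // Downloads every non-empty line of listPath (one URL per line) into outputDir.
    public static async Task DownloadAllAsync(string listPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        using (var client = new HttpClient())
        {
            foreach (string line in File.ReadLines(listPath))
            {
                string url = line.Trim();
                if (url.Length == 0) continue;
                byte[] data = await client.GetByteArrayAsync(url);
                File.WriteAllBytes(Path.Combine(outputDir, FileNameFor(url)), data);
            }
        }
    }
}
```

Two URLs that end in the same file name would overwrite each other here, so a real tool would need a better naming scheme.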

Author Comment

by:zhshqzyc
ID: 35166410
Your methods are good, but they are not specific to these web pages. I want to write code to download them.
Please note that the links have this format:
<a href="http://msdn.microsoft.com/en-us/library/ff846370.aspx" title="Excel 2010 Developer Reference">Excel 2010 Developer Reference</a></div>
<a href="http://msdn.microsoft.com/en-us/library/ff846437.aspx" title="AllowEditRange Object">AllowEditRange Object</a></div>


Is there a regular expression that would collect the links?

 
LVL 6

Expert Comment

by:akajohn
ID: 35166453
So, if I understood you correctly, you want to write code in C# or VB to extract the links from a web page and then download them individually?

Thanks for clarifying.

 

Accepted Solution

by:
zhshqzyc earned 0 total points
ID: 35166482
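Not sure whether this is right:
// Verbatim string: embedded quotes are doubled and literal dots are escaped.
Regex reg = new Regex(@"<a\s+href=""(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)""");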
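
A regex along these lines could be applied to the downloaded page as in the following minimal sketch. It assumes every link of interest has href as its first attribute, in double quotes, as in the sample lines posted earlier; an HTML parser would be more robust for anything less regular. MsdnLinks and Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MsdnLinks
{
    // Captures the URL of each link of the form .../en-us/library/ff<digits>.aspx.
    // No ^ anchor is used, so links anywhere in the page text are matched.
    static readonly Regex LinkPattern = new Regex(
        @"<a\s+href=""(http://msdn\.microsoft\.com/en-us/library/ff\d+\.aspx)""",
        RegexOptions.IgnoreCase);

    // Returns the captured URLs in the order they appear in the HTML.
    public static List<string> Extract(string html)
    {
        var urls = new List<string>();
        foreach (Match m in LinkPattern.Matches(html))
        {
            urls.Add(m.Groups[1].Value);
        }
        return urls;
    }
}
```

Run against the two sample lines from the question, Extract returns the ff846370.aspx and ff846437.aspx URLs.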

 

Author Closing Comment

by:zhshqzyc
ID: 35225389
Figured it out by myself.