Solved

How to write a crawler to download files?

Posted on 2011-09-22
Last Modified: 2012-05-12
Hi,

I will be given the URL of a directory listing on a web server.
Given that URL, I want to download everything in that folder and its subfolders.
I need to write this from scratch.
I was looking at writing a web crawler that parses the page at the given URL and extracts its links.
I am told I need to parse the page at that URL, since the directory listing will always be available.
So how do I use the directory listing to download the files?
Thanks.
Question by:dkim18
LVL 86

Accepted Solution

by:
CEHJ earned 2000 total points
ID: 36580824
>>I need to writ this from the scratch.

Why is that? There are web crawlers already written.

>>I am told I need to parse the URL since there is always the directory listing available.

Then I assume you're crawling a specific site, since otherwise you can't rely on directory listings being available?
 

Author Comment

by:dkim18
ID: 36580991
Sorry, my grammar wasn't good.

What I was trying to say is that the client doesn't want us to use third-party tools.
So I was going to look at some open source code and adapt it.

What I meant above is that I will be given a URL like http://mysite/website101/
website101 has a directory listing.
The directory listing will always be available.
In website101 there is a folder A and files f1, f2, etc.
Folder A has subfolders b, c, d and files f3, f4.
Folder b has some subfolders and files.

I am new to this kind of thing.
Do I still have to parse the directory listing?
Does the directory listing show those subfolders and files in an HTML page (as hyperlinks)?
So I still need to parse that directory listing, don't I?
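
For what it's worth, a typical auto-generated listing is an HTML page whose entries are ordinary `<a href>` links, so parsing it is mostly a matter of collecting those links. Below is a minimal sketch using only the Python standard library (no third-party tools). It assumes that subfolder links end with `/`, as Apache-style listings usually do; the names `ListingParser` and `list_entries` are made up for illustration, and the real server's listing format may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListingParser(HTMLParser):
    """Collect the href of every <a> tag in a directory-listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_entries(base_url, html):
    """Return (subfolder_urls, file_urls) found directly under base_url.

    Assumption: base_url ends with "/" and subfolder links end with "/".
    """
    parser = ListingParser()
    parser.feed(html)
    folders, files = [], []
    for href in parser.hrefs:
        # Resolve relative links and drop any fragment or query string.
        url = urljoin(base_url, href).split("#")[0].split("?")[0]
        # Skip the parent-directory link, column-sort links, and other sites.
        if not url.startswith(base_url) or url == base_url:
            continue
        (folders if url.endswith("/") else files).append(url)
    return folders, files
```

Calling `list_entries("http://mysite/website101/", page_html)` on the listing's HTML would give one list of subfolder URLs to visit next and one list of file URLs to download.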
 

Author Comment

by:dkim18
ID: 36581003
So I want to keep the folder structure when I download the files:
website101
website101/a
website101/f1
website101/f2
website101/a/f3
website101/a/b
...etc.
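
One way to keep that structure is to walk the listing recursively and turn each URL's path, relative to the starting URL, into a local path. The sketch below uses only the Python standard library and rests on several assumptions: every folder URL ends with `/`, the listing is plain HTML with `<a href>` links, and the names `mirror`, `local_path` and `http_get` are hypothetical. It has no retries or error handling.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, unquote
from urllib.request import urlopen


class _Links(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def http_get(url):
    """Default fetcher: return the response body as bytes."""
    with urlopen(url) as resp:
        return resp.read()


def local_path(root_url, url, dest_dir):
    """Map a URL under root_url to a path under dest_dir, keeping subfolders."""
    rel = unquote(urlparse(url).path[len(urlparse(root_url).path):])
    parts = rel.split("/")
    if ".." in parts:
        raise ValueError("refusing path outside destination: " + url)
    return os.path.join(dest_dir, *parts)


def mirror(root_url, dest_dir, fetch=http_get, url=None, seen=None):
    """Download every file under root_url into dest_dir, recursively."""
    url = url or root_url
    seen = set() if seen is None else seen
    if url in seen:
        return
    seen.add(url)
    os.makedirs(local_path(root_url, url, dest_dir), exist_ok=True)
    parser = _Links()
    parser.feed(fetch(url).decode("utf-8", errors="replace"))
    for href in parser.hrefs:
        child = urljoin(url, href).split("#")[0].split("?")[0]
        if not child.startswith(url) or child == url:
            continue  # parent directory, sort links, other sites
        if child.endswith("/"):
            mirror(root_url, dest_dir, fetch, child, seen)  # subfolder
        else:
            target = local_path(root_url, child, dest_dir)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(fetch(child))
```

With these assumptions, `mirror("http://mysite/website101/", "website101")` would recreate the subfolders under a local `website101` directory. The `fetch` parameter is there only so the network call can be swapped out, for example when testing against canned pages.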
 
LVL 86

Expert Comment

by:CEHJ
ID: 36592331
:)

Question has a verified solution.
