Solved

Download All Files From a URL

Posted on 2003-10-21
Last Modified: 2010-05-18
Hi again, Experts!
I'm looking at a web site that has a lot of subtitles (zip format), but I don't have time to download all the subtitles by clicking a PHP link for each one...
So I want to download all the *.zip files from this URL.
Something like:
http://www.somesite/subtitles/*.zip

The problem is that this URL has a blank index.htm, so I can't see the zip files, and I don't have any idea
what the file names are...
They may be really long names, such as 123456_789_something_833Sl62Bb.zip

So I need code that searches this web directory for the files and downloads them all to my local hard disk.

Any ideas, guys?
Question by:Spetson

Expert Comment

by:OpenSourceDeveloper
Well, if you only want a program that can do this, then I suggest "wget":

http://www.interlog.com/~tcharron/wgetwin.html

Author Comment

by:Spetson
Nope...
I have tried wget, but
I just receive an index.html file :(

Expert Comment

by:OpenSourceDeveloper
wget -r -l1 --no-parent -A.gif http://host/dir/

It is a bit of a kludge, but it works perfectly. `-r -l1' means to retrieve recursively, with a maximum depth of 1. `--no-parent' means that references to the parent directory are ignored, and `-A.gif' means to download only the GIF files. `-A "*.gif"' would have worked too.
If Wget was interrupted in the middle of downloading and you do not want to clobber the files already present, add `-nc' to the command.

So, say you want to download all the zip files that are linked from www.test.com. You would run:

wget -r -l1 --no-parent -A.zip http://www.test.com/

Expert Comment

by:MannSoft
You need some sort of file list, whether it be an index.html linking to the files or a directory listing. If you don't know the names of the files, how can you download them? You can't, unless of course you write a brute-force program that tries every possible filename combination up to X characters. But the admin will probably want to kick your ass if you do that :)
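
To put a rough number on the brute-force idea, here is a small sketch (in Python rather than Delphi, purely for illustration). It assumes file names are built only from lowercase letters and digits, which is an assumption and not something known about the site; the helper names are hypothetical.

```python
import string

# Assumption: names use only a-z and 0-9 (36 characters). Real names may
# also use uppercase letters, underscores and so on, which makes it worse.
ALPHABET = string.ascii_lowercase + string.digits


def count_names(alphabet_size, max_length):
    """Count every possible name of length 1..max_length."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def years_to_try(total_names, requests_per_second):
    """Rough time needed to request every name once, in years."""
    seconds = total_names / requests_per_second
    return seconds / (60 * 60 * 24 * 365)
```

With these assumptions, count_names(36, 8) is 2,901,713,047,668 names, and years_to_try(count_names(36, 8), 10) comes out at over 9,000 years at 10 requests per second. The example name in the question is far longer than 8 characters, so guessing blindly is not practical.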

Expert Comment

by:delphized
If it's only for one site, why don't you write to them and ask for a CD?

chuss

Author Comment

by:Spetson
The screen shows:
==================================
wget -r -l1 --no-parent -A.zip http://www.website/_subtitles_/
--07:18:17--  http://www.website:80/_subtitles_/
           => `www.website/_subtitles_/index.html'
Connecting to www.website:80... connected!
HTTP request sent, awaiting response... 200 OK
Length: 266 [text/html]

    0K ->                                                        [100%]

07:18:18 (259.77 KB/s) - `www.website/_subtitles_/index.html' saved [266/266]


FINISHED --07:18:18--
Downloaded: 266 bytes in 1 files
===============================================
I believe that MannSoft is right!
And you know... we are not talking about Delphi here anymore... hehe
So if somebody knows some Delphi code to make this work, please help!!!
Otherwise... I will request a CD, like delphized said!!
Thanks, guys, and sorry for my English!

Expert Comment

by:MannSoft
I just re-read your first message...and you say:

"I'm looking at a web site that has a lot of subtitles (zip format), but I don't have time to download all the subtitles by clicking a PHP link for each one..."

So it sounds like there IS a page that has a list of all the files you can download.  Point wget at that page and it should download all the files it links to.

Expert Comment

by:Wim ten Brink
MannSoft is right... The only way to get files over HTTP is to know their names. Quite a few applications use a simple trick to find those names: search a web page for <A href="Whatever"> tags, and whatever the href points to is just another URL you can get more data from. Look for these references to build a list of pages that you want to download.
This technique is called web crawling, btw. Be careful, though: a crawler can end up in an endless loop if it keeps following links back to pages it has already visited.
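
As a rough illustration of the crawling idea and of avoiding the endless loop, here is a minimal sketch in Python (not Delphi). The names LinkParser, extract_links and crawl are hypothetical, and fetch is an assumed caller-supplied function that returns a page's HTML, or None if the URL cannot be read.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> works.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, base_url):
    """Return the absolute URL of every link found in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urljoin(base_url, link) for link in parser.links]


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl that never visits the same URL twice."""
    visited = set()
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue  # already seen: this is what prevents the endless loop
        visited.add(url)
        html = fetch(url)
        if html is None:
            continue  # not a readable page (for example, a zip file)
        for link in extract_links(html, url):
            if link not in visited:
                queue.append(link)
    return visited
```

The visited set and the max_pages cap are the two safeguards. A real crawler would also want to stay on one host and pause between requests; neither is shown here.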

Author Comment

by:Spetson
OK... now I understand.
I believe I cannot make this crazy idea work, because the server admin
has put a "blank index.html" in the subtitles folder, precisely to protect the directory.
And maybe it is insane to try to "guess" all the possible archive names in this directory
using any type of loop or Delphi code...

But I will keep this topic open, because I have already seen incredible things made by Delphi programmers :)

So I will be listening for any suggestions or comments here.

Expert Comment

by:MannSoft
Could you clarify what you mean by this:

"I'm looking at a web site that has a lot of subtitles (zip format), but I don't have time to download all the subtitles by clicking a PHP link for each one..."

What PHP link do you have to click? Depending on the format of the page, you should be able to build a file list from it that you can then use to get the files with wget.
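
One possible way to build such a file list, sketched in Python rather than Delphi. The function names, the urls.txt file name and the default pattern (links ending in .zip) are all assumptions for illustration; the pattern would need adapting to whatever the real page links to. The regular expression is a quick shortcut and will miss unquoted href attributes.

```python
import re
from urllib.parse import urljoin


def build_file_list(page_html, page_url, pattern=r"\.zip$"):
    """Return absolute URLs for every quoted href that matches pattern."""
    hrefs = re.findall(r'href=["\']([^"\']+)["\']', page_html, flags=re.IGNORECASE)
    seen = set()
    urls = []
    for href in hrefs:
        if re.search(pattern, href) and href not in seen:
            seen.add(href)  # skip links that appear more than once
            urls.append(urljoin(page_url, href))
    return urls


def save_file_list(urls, path="urls.txt"):
    """Write one URL per line, the format wget reads with -i."""
    with open(path, "w") as handle:
        handle.write("\n".join(urls) + "\n")
```

After saving the list, running wget -i urls.txt fetches every URL in the file.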

Author Comment

by:Spetson
OK!
Sorry for my poor explanations; my English is not very strong...
The server uses a MySQL database, so when you click a subtitle link, the link sends you to another page (a CGI),
something like this:
download.php?file=289

So the subtitles page has about 2000 links (all pointing to the same download.php), with only the file id changing.
The problem is:
Before the download.php page redirects you to the file, you MUST WAIT in a queue for about 1 minute, for every file!
I just want to skip this queue and get the file.
I know the directory where all the files are, but I don't know their names (filenames).
:)

Accepted Solution

by:
MannSoft earned 300 total points
Okay, I see what you mean now.  And yeah, I think there's not much you can do unless by chance the files were named with some sort of a pattern.  For example, if there were file1.zip, file2.zip, file3.zip then there's a good chance that there are also file4.zip, file5.zip, file6.zip.  But if the zip name describes the contents, like winamp.zip, norton-antivirus.zip, smartftp.zip, then of course that makes it near impossible to guess what else is there.
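
If the names did turn out to follow a numbered pattern like file1.zip, file2.zip, the guessing could be sketched as below (Python, for illustration only). The prefix, the number range and both function names are hypothetical; nothing in this thread says the real files are named this way.

```python
import urllib.error
import urllib.request


def candidate_urls(base_url, prefix, start, stop, suffix=".zip"):
    """Yield base_url/prefix<N>suffix for N from start to stop inclusive."""
    base = base_url.rstrip("/")
    for n in range(start, stop + 1):
        yield f"{base}/{prefix}{n}{suffix}"


def exists(url, timeout=10):
    """Return True if a HEAD request for url answers 200 OK."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # 404, refused connection, timeout and so on
```

For example, candidate_urls("http://www.somesite/subtitles/", "file", 4, 6) yields the URLs for file4.zip, file5.zip and file6.zip, and exists() can then check each one. Keep the range small and pause between requests, since probing a server like this is exactly what an admin will object to.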

Expert Comment

by:EddieShipman
I did something like this not too long ago. Let me see if I can find it.

Author Comment

by:Spetson
MannSoft said:
"it near impossible to guess what else is there."

Yes... I think you're right, man!
I worked hard, with no success, to find a solution to my question... hehe!

But since you supported me on this question, I have decided to give you these 300 points, OK?
I really hate to "Just Close" a topic, so...
Thanks a lot for your comments...

See ya!