Avatar of dryzone
dryzone

asked on

FTP question

I am trying to download catalogues from VIZIER.
I want to automate it on a monthly basis with cron.
That part is trivial.

However, how do I do recursive FTP downloads to get the catalogue and its subdirectories?
(I did try wget, but it did not work.)

My current script follows.
Please correct it to do recursive downloads of all subdirectories of a directory.
The problem is at mget: it does not recurse into subdirectories.
What must I replace mget with?


####################### GET the gsc catalogues ##########################
# Enter the name of the subdirectory of $installdir to save the catalogue files under
cataloguename=gsc
installdir=/usr/local/share/catalogs
# and set the ftp local directory to this name
localdirectory=$installdir/$cataloguename
# and create the directories (including the parent) if they do not exist
mkdir -p "$localdirectory"


# Define the sitename you want to get the catalogue from
sitename=cdsarc.u-strasbg.fr
# and the contents of the directory the catalogue resides in
directory=/pub/cats/I/255/GSC_ACT

#Now download the entire directory
 ftp -inv <<!
 open  $sitename
 user  anonymous lrven@attglobal.net
 bin
 hash
 cd $directory
 #ls -l
 lcd $localdirectory
#Replace with "get" if only one file
 mget *
 quit
!
Avatar of j79
j79


Why don't you use a Perl script for that?

You can find an example for recursive directory listing/download at:
http://www.the-labs.com/Perl/ThePerlJournal/Issue_03_FTP/
Avatar of dryzone

ASKER

Perl scripts work on one distribution and then don't work on another.
I have learnt to avoid them.
There is a program called wget; it's probably already on your box. It can use recursive mode or mirror mode on remote sites.

http://wget.sunsite.dk/
Avatar of dryzone

ASKER

I tried wget (see my problem description), but it just hangs.
It is seemingly due to the FTP password prompt not being serviced properly.
The wget manual only mentions wget and HTTP passwords.
There are no options for entering FTP passwords.
Just like with any FTP site, use the form ftp://username:password@ftpsite.tld
Avatar of dryzone

ASKER

I get the same problem: it logs in but just hangs at LIST.
What I find by manually ftp'ing to this site is that ls -l needs a password.
Therefore I think that wget uses ls -l and gets an unexpected password prompt. No way around that one, I think.


]$ wget -r ftp://anonymous@cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT
--07:00:46--  ftp://anonymous@cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/.listing'
Resolving cdsarc.u-strasbg.fr... done.
Connecting to cdsarc.u-strasbg.fr[130.79.128.5]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/cats/I/255 ... done.
==> PORT ... done.    ==> LIST ...
Try using passive FTP by adding --passive-ftp
Avatar of dryzone

ASKER

I did that:
wget -r --passive-ftp  ftp://anonymous@cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT
but *it refuses to recurse*!

The output is the same as above, so it is only pasted from LIST onwards:
==> PASV ... done.    ==> LIST ... done.

    [ <=>                                 ] 287            2.30K/s            

07:10:57 (2.30 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/.listing' saved [287]
Removed `cdsarc.u-strasbg.fr/pub/cats/I/255/.listing'.
Already have correct symlink cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT -> ../../../catx/bincats/GSC_ACT
FINISHED --07:10:57--
Downloaded: 0 bytes in 0 files

So one problem is solved, but it does not recurse into subdirectories.
Try using mirror mode (-m) instead of -r.
Avatar of dryzone

ASKER

Still seems to halt at ".listing"

wget  --passive-ftp -m  ftp://anonymous@cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT
==> PASV ... done.    ==> LIST ... done.
    [ <=>                                 ] 287            2.30K/s            
07:23:07 (2.30 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/.listing' saved [287]
Already have correct symlink cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT -> ../../../catx/bincats/GSC_ACT
FINISHED --07:23:07--
Downloaded: 287 bytes in 1 files
You need a / on the end of the URL:

wget -m --passive-ftp --retr-symlinks ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/
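
For the cron script in the question, the command above could be wrapped roughly as follows. This is only a sketch: the function names catalogue_url and mirror_catalogue and the DRY_RUN switch are invented for illustration, and -nH and -P are additions (they are not part of the command above) so that files land under the local catalogue directory instead of under cdsarc.u-strasbg.fr/ in the current directory.

```shell
#!/bin/sh
# Sketch only: function names and the DRY_RUN switch are hypothetical.

# Build the FTP URL and make sure it ends in exactly one "/", so that wget
# treats the path as a directory to descend into.
catalogue_url() {
    site=$1
    dir=${2%/}                      # strip one trailing slash if present
    printf 'ftp://%s%s/\n' "$site" "$dir"
}

# Mirror a remote directory tree into a local directory.
#   $1 = FTP host, $2 = remote directory, $3 = local destination
mirror_catalogue() {
    url=$(catalogue_url "$1" "$2")
    dest=$3
    mkdir -p "$dest" || return 1
    # -m               mirror mode (recursion plus timestamping)
    # --passive-ftp    let the client open the data connection
    # --retr-symlinks  fetch what a symlink points to instead of recreating the link
    # -nH              do not create a directory named after the host
    # -P "$dest"       save everything below $dest
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} wget -m --passive-ftp --retr-symlinks -nH -P "$dest" "$url"
}
```

Called as mirror_catalogue "$sitename" "$directory" "$localdirectory", this would stand in for the whole ftp here-document. Note that wget still recreates the remote path (pub/cats/I/255/GSC_ACT) below the destination directory.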

Avatar of dryzone

ASKER

I tried that.
It just exits with one file downloaded.
Funny, it downloaded the entire site when I tried it.
Avatar of dryzone

ASKER

That sure is strange.
Let me try another machine and a different version of Linux.
Did you copy and paste that command?
Here's an example:

jcooke@alpha:~$ wget -m --passive-ftp --retr-symlinks ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/
--12:32:11--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/.listing'
Resolving cdsarc.u-strasbg.fr... 130.79.128.5
Connecting to cdsarc.u-strasbg.fr[130.79.128.5]:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD /pub/cats/I/255/GSC_ACT ... done.
==> PASV ... done.    ==> LIST ... done.

    [ <=>                                 ] 1,974         --.--K/s            

12:32:14 (43.40 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/.listing' saved [1974]

--12:32:14--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/bin-dos
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/bin-dos'
==> CWD not required.
==> PASV ... done.    ==> RETR bin-dos ...
No such file `bin-dos'.

--12:32:15--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/plates.epoch.gz
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/plates.epoch.gz'
==> CWD not required.
==> PASV ... done.    ==> RETR plates.epoch.gz ... done.
Length: 7,510

100%[====================================>] 7,510         --.--K/s            

12:32:15 (180.14 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/plates.epoch.gz' saved [7510]

--12:32:15--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/readme
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/readme'
==> CWD not required.
==> PASV ... done.    ==> RETR readme ... done.
Length: 3,927

100%[====================================>] 3,927         --.--K/s            

12:32:15 (88.76 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/readme' saved [3927]

--12:32:15--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.dat.gz
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.dat.gz'
==> CWD not required.
==> PASV ... done.    ==> RETR regions.dat.gz ... done.
Length: 297,178

100%[====================================>] 297,178     1000.15K/s            

12:32:15 (999.86 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.dat.gz' saved [297178]

--12:32:15--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.ord.gz
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.ord.gz'
==> CWD not required.
==> PASV ... done.    ==> RETR regions.ord.gz ... done.
Length: 20,162

100%[====================================>] 20,162        --.--K/s            

12:32:15 (177.68 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/regions.ord.gz' saved [20162]

--12:32:15--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/src.tar.gz
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/src.tar.gz'
==> CWD not required.
==> PASV ... done.    ==> RETR src.tar.gz ... done.
Length: 75,335

100%[====================================>] 75,335        --.--K/s            

12:32:16 (507.40 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/src.tar.gz' saved [75335]

--12:32:16--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/.listing'
==> CWD /pub/cats/I/255/GSC_ACT/N0000 ... done.
==> PASV ... done.    ==> LIST ... done.

    [  <=>                                ] 37,420       118.63K/s            

12:32:16 (118.31 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/.listing' saved [37420]

--12:32:16--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/%3Ddone%3D
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/=done='
==> CWD not required.
==> PASV ... done.    ==> RETR =done= ... done.
Length: 33

100%[====================================>] 33            --.--K/s            

12:32:16 (13.59 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/=done=' saved [33]

--12:32:16--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0001.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0001.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0001.GSC ... done.
Length: 31,404

100%[====================================>] 31,404        --.--K/s            

12:32:17 (268.06 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0001.GSC' saved [31404]

--12:32:17--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0002.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0002.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0002.GSC ... done.
Length: 38,640

100%[====================================>] 38,640        --.--K/s            

12:32:17 (341.56 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0002.GSC' saved [38640]

--12:32:17--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0003.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0003.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0003.GSC ... done.
Length: 31,044

100%[====================================>] 31,044        --.--K/s            

12:32:17 (284.31 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0003.GSC' saved [31044]

--12:32:17--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0004.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0004.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0004.GSC ... done.
Length: 37,356

100%[====================================>] 37,356        --.--K/s            

12:32:17 (317.27 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0004.GSC' saved [37356]

--12:32:17--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0005.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0005.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0005.GSC ... done.
Length: 35,296

100%[====================================>] 35,296        --.--K/s            

12:32:17 (325.68 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0005.GSC' saved [35296]

--12:32:17--  ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0006.GSC
           => `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0006.GSC'
==> CWD not required.
==> PASV ... done.    ==> RETR 0006.GSC ... done.
Length: 33,436

100%[====================================>] 33,436        --.--K/s            

12:32:18 (227.38 KB/s) - `cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/N0000/0006.GSC' saved [33436]
Avatar of dryzone

ASKER

"paranoidcookie"
I tried the exact same command on Fedora 1 and there it worked. I will let you know once it has finished downloading, but it looks OK.

What I think the problem is here is the broken Perl installation that came with RedHat 8.0. I think wget must be using Perl at some stage.
With all the Perl problems I had with RH8, I became completely negative about using it and will probably never program in Perl again. Above all, the Perl upgrades and patches for 8.0 never worked either and just broke the system further. It forced me to abandon hordes of Perl scripts and rewrite them in really cumbersome bash, but at least bash always works from distribution to distribution.



ASKER CERTIFIED SOLUTION
Avatar of paranoidcookie
paranoidcookie
Avatar of dryzone

ASKER

Thanks for all the help; you deserve the points.
Just one last question:
Do you know if there is any command to resume an FTP download without having to download again the files you already fetched with wget?
Avatar of dryzone

ASKER

I answered it myself: the -N switch works.
Thanks a lot!
Thanks for the points, that made my month's 3000 now :-)

Use -c to continue downloads.
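
The two switches mentioned here do different jobs, as far as the wget manual describes them: -N (timestamping) skips files whose local copy is already up to date, while -c continues a file that was only partly downloaded. Mirror mode (-m) already includes -N. A small sketch of choosing between the two on a re-run; the function name rerun_flags and the mode names are made up for this example:

```shell
#!/bin/sh
# Sketch only: "update" and "resume" are hypothetical mode names.
# Prints the wget flags to use when re-running a recursive download.
rerun_flags() {
    case $1 in
        update) echo "-r -N" ;;   # skip files that are not newer on the server
        resume) echo "-r -c" ;;   # continue files that were cut off part-way
        *)      echo "unknown mode: $1" >&2
                return 1 ;;
    esac
}
```

For example: wget $(rerun_flags update) --passive-ftp --retr-symlinks ftp://cdsarc.u-strasbg.fr/pub/cats/I/255/GSC_ACT/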