Solved

Using WGET

Posted on 2011-02-15
1,023 Views
Last Modified: 2012-05-11
I want to be able to retrieve all *.csv files from a folder on a website like this:
http://www.somedomain.com/my_folder/ and copy all of the files into a folder named somedomain.com on a Windows machine.

The names of the CSV files vary all the time.

I want to do this with wget if possible, but I have not been able to figure it out myself.
Question by:weegiraffe
10 Comments
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 250 total points
ID: 34902180
wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/
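On Windows, point -P at a Windows path instead, e.g. (the C:\data path below is only an example):

wget -r -P C:\data\somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/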

wmp
 
LVL 6

Assisted Solution

by:Bxoz
Bxoz earned 250 total points
ID: 34902252

wget -r -nd -A.csv <url>

-r makes it recursive
-nd means no directories (don't recreate the remote directory structure locally)
-A.csv means accept only .csv files from the page
 
LVL 6

Expert Comment

by:t-max
ID: 34902284
Since you can't do 'wget http://www.somedomain.com/my_folder/*.csv', the approach I would take is something like this (rough sketch below):
* wget an HTML file (e.g. index.html) that contains the csv names in that folder (if you allow directory listing, this shouldn't be hard to do)
* grep that file to get all the lines with .csv names, and parse those lines to leave just the file names
* loop through those names with wget to download them
* copy all the files to the folder somedomain.com, which was previously mounted using a samba client
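In script form it could look roughly like this (only a sketch; the URL and the /mnt path are examples, and the parsing assumes the listing links to the files with simple relative hrefs):

#!/bin/sh
BASE=http://www.somedomain.com/my_folder/

# 1. Fetch the directory listing page
wget -q -O listing.html "$BASE"

# 2. Pull the .csv file names out of the href attributes
grep -o 'href="[^"]*\.csv"' listing.html | sed -e 's/^href="//' -e 's/"$//' > csvfiles.txt

# 3. Download each file into the target folder (e.g. a mounted samba share)
while read -r name; do
    wget -q -P /mnt/somedomain.com "$BASE$name"
done < csvfiles.txt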
Best regards
 

Author Comment

by:weegiraffe
ID: 34904420
Thank you all for your suggestions

wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/

is not working.  

Neither is wget -r -nd -A.csv <url>

t-max's suggestion is far too complicated; the job could be done more efficiently with FTP.

I will attempt to clarify my requirements:
I need to pull down only CSV files from a specific directory on a website to a Windows File Server.

The website directory name containing the files is always the same, so there is no need to search other directories on the website.

The files are always CSVs but the name varies.

These steps have to be repeated on 50 sites, and this number will grow.

Effectively I need to download all CSV files from URLs such as:

http://www.somedomain.com/my_folder/ 
http://www.somedomain1.com/my_folder/ 
http://www.somedomain2.com/my_folder/
etc
to Windows folders such as somedomain.com, somedomain1.com, somedomain2.com, etc.

It would be ideal if wget could also create these target folders if they do not exist.

I know I can do all of this with the Windows command-line (or any other) FTP client, but a bunch of wget calls in a batch file would make it easier.

If wget is not the correct tool, then please just say so.




 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34904428
What do you mean by "is not working"? Error messages? Undesired results? Please clarify!

wmp
 
LVL 6

Expert Comment

by:Bxoz
ID: 34905202
I think you can use the GnuWin32 tools.
There is a wget build for Windows:

http://gnuwin32.sourceforge.net/packages/wget.htm
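Once installed, you can call it from cmd or a batch file by its full path, e.g. (the install path below is just the typical GnuWin32 default, adjust to your setup):

"C:\Program Files\GnuWin32\bin\wget.exe" -r -nd -A.csv -P somedomain.com http://www.somedomain.com/my_folder/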
 
LVL 12

Expert Comment

by:mccracky
ID: 34907077
The other obvious option is to use rsync.  
 

Author Comment

by:weegiraffe
ID: 34908308
This seems to be the closest I can get:

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/



 

Author Closing Comment

by:weegiraffe
ID: 34908331
I have found the solution
 

Author Comment

by:weegiraffe
ID: 34908416
************** CORRECT SOLUTION IS HERE **************

I am not sure what I did wrong, but this is actually the correct solution:

This seems to be the closest I can get:

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/
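For the other sites I will just repeat this line in a batch file, adding -P to send each site's files to its own folder and -nd so they are not buried in the recreated directory structure (wget should create the -P folder if it does not already exist). The domains below are only examples:

wget -r -l1 --no-parent -nd -A.csv -P somedomain.com http://www.somedomain.com/my_folder/
wget -r -l1 --no-parent -nd -A.csv -P somedomain1.com http://www.somedomain1.com/my_folder/
wget -r -l1 --no-parent -nd -A.csv -P somedomain2.com http://www.somedomain2.com/my_folder/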

Not the accepted solution... I appear to have made a mistake, but I am happy to assign the points as shown, as the suggestions set me on the correct path.

************** CORRECT SOLUTION IS HERE **************
