
Solved

Using WGET

Posted on 2011-02-15
Medium Priority
1,031 Views
Last Modified: 2012-05-11
I want to be able to retrieve all *.csv files from a folder on a website like this:
http://www.somedomain.com/my_folder/ and copy all files into a folder named somedomain.com on a Windows machine.

The names of the CSV files vary all the time.

I want to do this, if possible, with wget; I have not been able to figure it out myself.
Question by:weegiraffe
10 Comments
 
LVL 68

Accepted Solution

by: woolmilkporc earned 750 total points
ID: 34902180
wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/

wmp
 
LVL 6

Assisted Solution

by: Bxoz earned 750 total points
ID: 34902252

wget -r -nd -A.csv <url>

-r makes it recursive
-nd creates no directories
-A.csv accepts only the .csv files on the page
 
LVL 6

Expert Comment

by:t-max
ID: 34902284
Since you can't do 'wget http://www.somedomain.com/my_folder/*.csv', the approach I would take is this:
* wget an HTML file (e.g. index.html) that contains the CSV names in that folder (if you allow directory listing, this shouldn't be hard to do)
* grep that file to get all the lines with .csv names, and parse those lines down to the bare file names
* loop through those names with wget to download them
* copy all the files to the folder somedomain.com, which was previously mounted using a Samba client
Best regards
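The steps above can be sketched as a small POSIX-shell script. This is a rough sketch only: the URL and the target folder name are placeholders, and it assumes the server has directory listing enabled so the index page links every CSV.

```shell
#!/bin/sh
# Sketch of the index-then-grep approach. BASE and DEST are placeholders.
BASE="http://www.somedomain.com/my_folder"
DEST="somedomain.com"

# Pull every href ending in .csv out of a saved directory listing.
extract_csv_links() {
    grep -o 'href="[^"]*\.csv"' "$1" | sed 's/^href="//; s/"$//'
}

fetch_all() {
    wget -O index.html "$BASE/"          # step 1: grab the listing
    extract_csv_links index.html |       # step 2: parse out the file names
    while read -r name; do
        wget -P "$DEST" "$BASE/$name"    # step 3: download each file
    done
}
# fetch_all    # uncomment to run against a live server
```

The grep/sed pattern only handles plain `href="name.csv"` links; a fancier index page may need a different pattern.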
 

Author Comment

by:weegiraffe
ID: 34904420
Thank you all for your suggestions

wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/

is not working.  

Neither is wget -r -nd -A.csv <url>

t-max's suggestion is far too complicated; the job could be done more efficiently with FTP.

I will attempt to clarify my requirements:
I need to pull down only CSV files from a specific directory on a website to a Windows File Server.

The website directory name containing the files is always the same, so there is no need to search other directories on the website.

The files are always CSVs but the name varies.

These steps have to be repeated on 50 sites, and this number will grow.

Effectively, I need to download all CSVs from URLs such as:

http://www.somedomain.com/my_folder/ 
http://www.somedomain1.com/my_folder/ 
http://www.somedomain2.com/my_folder/
etc
to Windows folders such as somedomain.com, somedomain1.com, somedomain2.com, etc.

It would be ideal if wget could also create these target folders if they do not exist.

I know I can do all of this with the Windows command-line (or any) FTP client, but a bunch of wget calls in a batch file would make it easier.

If wget is not the correct tool, then please just say so.
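That batch of wget calls can be sketched as a POSIX-shell loop (the site list and folder name below are placeholders; on Windows, each printed command drops straight into a .bat file). Note that `-P` both names the per-domain target folder and creates it if it does not exist.

```shell
#!/bin/sh
# Sketch only: the site list and FOLDER below are placeholders.
FOLDER="my_folder"

# Build the wget call for one site. -P names (and creates, if missing)
# the per-domain target folder; -r -l1 --no-parent keeps the crawl inside
# the one directory; -nd flattens the hierarchy; -A.csv keeps only CSVs.
build_cmd() {
    printf 'wget -r -l1 --no-parent -nd -A.csv -P %s http://%s/%s/\n' \
        "$1" "$1" "$FOLDER"
}

for site in www.somedomain.com www.somedomain1.com www.somedomain2.com; do
    build_cmd "$site"     # prints each command; pipe the loop to sh to run
done
```

Printing the commands rather than running them directly makes the loop easy to review before letting it loose on 50 sites.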




 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34904428
What do you mean by "is not working"? Error messages? Undesired results? Please clarify!

wmp
 
LVL 6

Expert Comment

by:Bxoz
ID: 34905202
I think you can use the GnuWin32 tools; there is a wget build for Windows:

http://gnuwin32.sourceforge.net/packages/wget.htm
 
LVL 12

Expert Comment

by:mccracky
ID: 34907077
The other obvious option is to use rsync.  
 

Author Comment

by:weegiraffe
ID: 34908308
This seems to be the closest I can get:

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/



 

Author Closing Comment

by:weegiraffe
ID: 34908331
I have found the solution.
 

Author Comment

by:weegiraffe
ID: 34908416
************** CORRECT SOLUTION IS HERE **************

I am not sure what I did wrong, but this is actually the correct solution:

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/

It is not the accepted solution; I appear to have made a mistake, but I am happy to assign the points as shown, as the suggestions set me on the correct path.

************** CORRECT SOLUTION IS HERE **************
