Solved

Using WGET

Posted on 2011-02-15
1,029 Views
Last Modified: 2012-05-11
I want to be able to retrieve all *.csv files from a folder on a website like this:
http://www.somedomain.com/my_folder/
and copy all of the files into a folder named somedomain.com on a Windows machine.

The names of the CSV files vary all the time.

I want to do this with wget if possible, but I have not been able to figure it out myself.
Question by:weegiraffe
10 Comments
 
LVL 68

Accepted Solution

by:woolmilkporc
woolmilkporc earned 250 total points
ID: 34902180
wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/

wmp
 
LVL 6

Assisted Solution

by:Bxoz
Bxoz earned 250 total points
ID: 34902252

wget -r -nd -A.csv <url>

-r makes it recursive
-nd means no directories (files are saved flat)
-A.csv accepts only the CSV files on the page
 
LVL 6

Expert Comment

by:t-max
ID: 34902284
Since you can't do 'wget http://www.somedomain.com/my_folder/*.csv', the approach I would take is something like this:
* wget an HTML file (e.g. index.html) that contains the CSV names in that folder (if directory listing is allowed, this shouldn't be hard to do)
* grep that file to get all the lines with .csv names, and parse those lines to leave just the file names
* loop through those names with wget to download them
* copy all the files to the folder somedomain.com, which was previously mounted using a Samba client
Best regards
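
A rough shell sketch of the approach above, assuming directory listing is enabled and the index page links to the CSV files by relative name; the URL and destination path are placeholders:

# Sketch only: fetch the directory listing, extract the .csv names,
# then download each file into the target folder.
base=http://www.somedomain.com/my_folder/
dest=/mnt/somedomain.com   # e.g. the Samba-mounted Windows folder (placeholder)

wget -q -O listing.html "$base"
grep -o 'href="[^"]*\.csv"' listing.html |
  sed -e 's/^href="//' -e 's/"$//' |
  while read -r name; do
    wget -P "$dest" "$base$name"
  done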
 

Author Comment

by:weegiraffe
ID: 34904420
Thank you all for your suggestions

wget -r -P /path/to/somedomain.com -nd -A.csv http://www.somedomain.com/my_folder/

is not working.

Neither is wget -r -nd -A.csv <url>

t-max's suggestion is far too complicated; the job could be done more efficiently with FTP.

I will attempt to clarify my requirements:
I need to pull down only CSV files from a specific directory on a website to a Windows File Server.

The website directory name containing the files is always the same, so there is no need to search other directories on the website.

The files are always CSVs, but the names vary.

These steps have to be repeated on 50 sites, and this number will grow.

Effectively, I need to download all CSV files from URLs such as:

http://www.somedomain.com/my_folder/ 
http://www.somedomain1.com/my_folder/ 
http://www.somedomain2.com/my_folder/
etc.
to Windows folders such as somedomain.com, somedomain1.com, somedomain2.com, etc.

It would be ideal if wget could also create these target folders if they do not exist.

I know I can do all of this with the Windows command line (or any) FTP client, but a bunch of wget calls in a batch file would make it easier.

If wget is not the correct tool, then please just say so.
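
A minimal sketch of the kind of loop described above, assuming GNU Wget is installed (shown as a shell loop here; in a Windows batch file the same wget call would sit inside a FOR loop). The domain list is hypothetical:

# Sketch only: pull the CSV files from my_folder on each site into a
# local folder named after the domain. The domain list is a placeholder.
# -P creates the target folder if needed, -nd keeps wget from recreating
# the remote directory tree, -r -l1 --no-parent limits the crawl to
# my_folder itself, and -A.csv accepts only CSV files.
for domain in somedomain.com somedomain1.com somedomain2.com; do
  wget -r -l1 --no-parent -nd -A.csv -P "$domain" "http://www.$domain/my_folder/"
done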




 
LVL 68

Expert Comment

by:woolmilkporc
ID: 34904428
What do you mean by "is not working"? Error messages? Undesired results? Please clarify!

wmp
 
LVL 6

Expert Comment

by:Bxoz
ID: 34905202
I think you can use the GnuWin32 tools.
There is a wget tool for Windows:

http://gnuwin32.sourceforge.net/packages/wget.htm
 
LVL 12

Expert Comment

by:mccracky
ID: 34907077
The other obvious option is to use rsync.  
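
A minimal sketch of the rsync route, which would only apply if the remote servers expose my_folder through an rsync daemon or over SSH (rsync cannot pull from a plain HTTP URL); the module and paths are placeholders:

# Sketch only: copy just the top-level *.csv files from the (hypothetical)
# rsync module "my_folder" into a local folder named after the domain.
rsync -av --include='*.csv' --exclude='*' \
  rsync://www.somedomain.com/my_folder/ somedomain.com/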
 

Author Comment

by:weegiraffe
ID: 34908308
This seems to be the closest I can get

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/



 

Author Closing Comment

by:weegiraffe
ID: 34908331
I have found the solution
 

Author Comment

by:weegiraffe
ID: 34908416
************** CORRECT SOLUTION IS HERE **************

I am not sure what I did wrong, but the command from my earlier comment is actually the correct solution:

wget -r -l1 --no-parent -A.csv http://somedomain.com/my_folder/

It is not the accepted solution; I appear to have made a mistake, but I am happy to assign the points as shown, as the suggestions set me on the correct path.

************** CORRECT SOLUTION IS HERE **************
