Solved

Extension for downloading all search results images?

Posted on 2014-04-05
Last Modified: 2014-04-12
Is there an extension or application for Windows or Mac that lets a user download all images from a website's store search results, including all product images from within each product listed in those results?

An example would be:

A user plans to purchase a few rugs but doesn't want to open each product from the search results, right-click each of that product's images, and save them locally one at a time. Doing this by hand takes hours and becomes daunting.

I have found quite a few image downloaders for websites (Mac, PC, Firefox, Chrome, and a few others), but the most I have gotten out of any of them is downloading all images from a specific product page, not all of the product images across the full set of search results.
Question by:turbojournal
13 Comments
 
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 39980822
Firefox browser with the DownThemAll extension
 

Author Comment

by:turbojournal
ID: 39980829
I've already tried that, and it doesn't do what I described above.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39981734
I doubt that you can do this with search results, because the URL associated with the search results is typically unrelated to the items that the search found, although that could vary depending on the tool used to build the website. For example, here's a Drupal-based Oriental rug site:
http://www.peerrugs.com/

If you do a search for, let's say, Sarouk, you get this URL:
http://www.peerrugs.com/search/node/sarouk

However, if you use a website download tool, it won't find the content related to the URL above, as it is actually in locations such as:

http://www.peerrugs.com/rug/sarouk-rug
http://www.peerrugs.com/rug/sarouk-feraghan-carpet-0
http://www.peerrugs.com/rug/sarouk-mahajiran-carpet
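
For illustration only (this was not tested in this thread): the product pages are linked from the search results page, so a script could in principle collect those links first and hand them to a downloader. Here is a minimal Python sketch, assuming product pages live under /rug/ as in the URLs above. The sample HTML and the names LinkCollector and product_links are hypothetical, and a real results page may be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

SEARCH_URL = "http://www.peerrugs.com/search/node/sarouk"

# Hypothetical markup standing in for a real search results page.
SAMPLE_HTML = """
<ul>
  <li><a href="/rug/sarouk-rug">Sarouk Rug</a></li>
  <li><a href="/rug/sarouk-feraghan-carpet-0">Sarouk Feraghan Carpet</a></li>
  <li><a href="/rug/sarouk-mahajiran-carpet">Sarouk Mahajiran Carpet</a></li>
  <li><a href="/rug/sarouk-rug">Sarouk Rug (repeated link)</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Collects absolute links whose URL starts with a given prefix."""

    def __init__(self, base_url, path_prefix):
        super().__init__()
        self.base_url = base_url
        # Absolute form of the prefix, e.g. http://www.peerrugs.com/rug/
        self.prefix = urljoin(base_url, path_prefix)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Keep product pages only, and skip links already seen.
        if absolute.startswith(self.prefix) and absolute not in self.links:
            self.links.append(absolute)


def product_links(html, search_url, path_prefix="/rug/"):
    """Return the product page URLs linked from a search results page."""
    collector = LinkCollector(search_url, path_prefix)
    collector.feed(html)
    return collector.links


for url in product_links(SAMPLE_HTML, SEARCH_URL):
    print(url)
```

The resulting list could then be given to a download tool as its starting URLs instead of the search URL itself.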

If you want to get all images (not results of a search), that's doable. I use HTTrack (free!) to download websites:
http://www.httrack.com/

I've never tried to get just images, but I think that would be possible by using its include filter (a plus sign) to include image file types and its exclude filter (a minus sign) to exclude other stuff:

[Image: HTTrack Include-Exclude filters]

As you can see in the Scan Rules tab of the options dialog above, it has a pre-configured check-box for gif, jpg, png, tif, and bmp files; include those.

I recently used HTTrack to download the Oriental rugs site mentioned above. It worked well and I got the rug images (mostly JPGs, some PNGs). I suppose I could have used the include/exclude feature in the Scan Rules tab to get just the rug images, but I didn't try that. I ran with the default, which is this:

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

I didn't realize until just now that the default doesn't include bmp and tif files, but for most websites, png/gif/jpg will get the images, which is no doubt why HTTrack made them the default. Regards, Joe
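
To make the plus/minus rules above more concrete, here is a rough Python model of how such a rule string could be evaluated against a URL. This is a sketch, not HTTrack's actual code: it assumes that rules are checked in order and that the last matching rule decides, it uses Python's fnmatch wildcards as a stand-in for HTTrack's pattern syntax, and it skips the mime: rule. The function name decide is hypothetical.

```python
from fnmatch import fnmatchcase

# The default Scan Rules quoted above.
DEFAULT_RULES = (
    "+*.png +*.gif +*.jpg +*.css +*.js "
    "-ad.doubleclick.net/* -mime:application/foobar"
)


def decide(url, rules=DEFAULT_RULES):
    """Return True (include), False (exclude), or None (no rule matched).

    Assumption: the last matching rule wins.
    """
    verdict = None
    for rule in rules.split():
        sign, pattern = rule[0], rule[1:]
        if pattern.startswith("mime:"):
            # MIME-type rules apply to content types, not URLs; not modeled.
            continue
        if fnmatchcase(url, pattern):
            verdict = sign == "+"
    return verdict


print(decide("www.rugstudio.com/images/rug.jpg"))
print(decide("ad.doubleclick.net/banner.gif"))
print(decide("www.rugstudio.com/images/rug.bmp"))
```

In this model the bmp URL matches no rule at all, which mirrors the observation above that the default rule string does not mention bmp or tif files.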
 

Author Comment

by:turbojournal
ID: 39995912
I tested the app with http://www.rugstudio.com but am quickly getting errors and the process halts without photos. I'm trying to get all images from each item for sale.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39996277
When I saw your last post, I kicked off an HTTrack run on <http://www.rugstudio.com/>. It has been running for 28 minutes. So far, it has downloaded 584 JPGs. Here's just the first page of hits on a search for <*.jpg>:

[Image: HTTrack on RugStudio site]

I set the Scan Rules to:

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

In the Limits tab, I left the "Maximum mirroring depth" blank and set the "Maximum external depth" to 0. Regards, Joe
 
LVL 53

Accepted Solution

by:
Joe Winograd, EE MVE earned 500 total points
ID: 39996287
Update on my last post: I stopped the HTTrack run after 36 minutes. It had downloaded 784 JPGs to my PC. Here's one example:

[Image: tn_spinnaker_aqua_outdoor]

I don't know why your run gets errors and halts without photos. It works perfectly here. Regards, Joe
 

Author Comment

by:turbojournal
ID: 39996448
Bingo, that was it.  Thank you!
 

Author Closing Comment

by:turbojournal
ID: 39996451
All images started downloading.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39996453
You're welcome! Glad to hear that it's working for you. Regards, Joe
 

Author Comment

by:turbojournal
ID: 39996456
Thanks Joe. One last thing. I saw how I can use filters to download only certain files, but what do I type in to "exclude everything but .jpgs over 300x300" as an example?
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39996582
There's no way that I'm aware of to exclude files based on resolution, such as those over 300x300. You may be able to achieve what you want by utilizing the option "Max size of any non-HTML file" in the Limits tab. Regards, Joe
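
Since the download itself can't filter on resolution, one possible workaround is to sort the files after the run finishes. Below is a minimal Python sketch that reads the width and height from a JPEG's header and lists the files larger than 300x300. It uses only the standard library; the names jpeg_size, is_over, and large_jpegs are hypothetical, the folder path is a placeholder, and it has not been run against an actual HTTrack mirror.

```python
import struct
from pathlib import Path

# JPEG start-of-frame markers; these segments carry the image dimensions.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(data):
    """Return (width, height) read from JPEG bytes, or None if not found."""
    if data[:2] != b"\xff\xd8":          # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None
        marker = data[pos + 1]
        if marker == 0xFF:               # padding byte before a marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD9:
            pos += 2                     # standalone marker, no length field
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                return None
            # Layout: marker(2) length(2) precision(1) height(2) width(2)
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        pos += 2 + length                # skip to the next segment
    return None


def is_over(size, min_width=300, min_height=300):
    """True if size is known and exceeds both minimum dimensions."""
    return size is not None and size[0] > min_width and size[1] > min_height


def large_jpegs(folder, min_width=300, min_height=300):
    """List .jpg files under folder that are larger than the given size."""
    return [
        path
        for path in sorted(Path(folder).rglob("*.jpg"))
        if is_over(jpeg_size(path.read_bytes()), min_width, min_height)
    ]
```

For example, large_jpegs("path/to/mirror") would return the files worth keeping; anything not in that list could then be moved or deleted by hand.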
 

Author Comment

by:turbojournal
ID: 39996691
Thanks Joe.
 
LVL 53

Expert Comment

by:Joe Winograd, EE MVE
ID: 39996735
You're welcome.


Question has a verified solution.
