Link to home
Start Free TrialLog in
Avatar of turbojournal
turbojournal

asked on

Extension for downloading all search results images?

Is there an extension and or application for Windows or Mac that allows a user to download all images from a websites store search results and all product images from within each product from the search results?

An example would be:

Planning to purchase a few rugs but the user doesn't want to have to go to each product based on the search result and each image relating to that product right click each image and save to their computer locally.  This takes hours for the user to do and becomes daunting.

I have uncovered quite a few image downloaders for webstites both on Mac, PC, Firefox, Chrome, and a few others but the most I have gotten out of one is getting the downloader to download all images from a specific product page, but not all the products images from the total search results.
Avatar of David Johnson, CD
David Johnson, CD
Flag of Canada image

firefox browser with downThemAll extension
Avatar of turbojournal
turbojournal

ASKER

I've already tried that and it doesn't do specifically what I've stated earlier.
I doubt that you can do this with search results, because the URL associated with the search results is typically unrelated to the items that the search found, although that could vary depending on the tool used to build the website. For example, here's a Drupal-based Oriental rug site:
http://www.peerrugs.com/

If you do a search for, let's say, Sarouk, you get this URL:
http://www.peerrugs.com/search/node/sarouk

However, if you use a website download tool, it won't find the content related to the URL above, as it is actually in locations such as:

http://www.peerrugs.com/rug/sarouk-rug
http://www.peerrugs.com/rug/sarouk-feraghan-carpet-0
http://www.peerrugs.com/rug/sarouk-mahajiran-carpet

If you want to get all images (not results of a search), that's doable. I use HTTrack (free!) to download websites:
http://www.httrack.com/

I've never tried to get just images, but I think that would be possible by using its include filter (a plus sign) to include image file types and its exclude filter (a minus sign) to exclude other stuff:

User generated imageAs you can see in the Scan Rules tab of the options dialog above, it has a pre-configured check-box for gif, jpg, png, tif, and bmp files — include those.

I recently used HTTrack to download the Oriental rugs site mentioned above. It worked well and I got the rug images (mostly JPGs, some PNGs). I suppose I could have used the include/exclude feature in the Scan Rules tab to get the just the rug images, but I didn't try that — I ran with the default, which is this:

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

I didn't realize until just now that the default doesn't include bmp and tif files, but for most websites, png/gif/jpg will get the images, which is no doubt why HTTrack made them the default. Regards, Joe
I tested the app with http://www.rugstudio.com but am quickly getting errors and the process halts without photos. I'm trying to get all images from each item for sale.
When I saw your last post, I kicked off an HTTrack run on <http://www.rugstudio.com/>. It has been running for 28 minutes. So far, it has downloaded 584 JPGs. Here's just the first page of hits on a search for <*.jpg>:

User generated imageI set the Scan Rules to:

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar

In the Limits tab, I left the "Maximum mirroring depth" blank and set the Maximum external depth" to 0. Regards, Joe
ASKER CERTIFIED SOLUTION
Avatar of Joe Winograd
Joe Winograd
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Bingo, that was it.  Thank you!
All images started downloading.
You're welcome! Glad to hear that it's working for you. Regards, Joe
Thanks Joe.  One last thing.  I saw how I can say to just download filters but what command do I type in to "exclude everything but .jpgs over 300x300" as an example?
There's no way that I'm aware of to exclude files based on resolution, such as those over 300x300. You may be able to achieve what you want by utilizing the option "Max size of any non-HTML file" in the Limits tab. Regards, Joe
Thanks Joe.
You're welcome.