Automatically downloading a series of images from a website

hoodie13 Registered User regular
edited August 2009 in Help / Advice Forum
So hopefully the title is clear enough to get you in this thread... What I'm looking for is this. I'd like to create a collection of webcomics, purely for my own amusement and not for distribution or profit, but I'm having difficulty finding an efficient way of gathering the images. Here's exactly what I want to do:

1. Have this process or macro recognize the image I want to save (for easiness' sake, let's just say the PA strip).
2. Save this image to a pre-designated folder.
3. Move to the next comic in the series, basically by clicking the "next comic" button or whatever the website has (all of the comics have an easy image link for clicking, no java screwiness.)
4. Repeat steps 1-3 until there is no "next comic" button, or the button does nothing.
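
The four steps above can be sketched as a small Python script. This is only a rough sketch under stated assumptions, not a tested tool for any particular site: it assumes the comic image is an `<img>` tag with `id="comic"` and that the "next comic" link is an `<a>` tag with `rel="next"`. Both selectors, and the names `ComicPageParser`, `fetch_bytes` and `crawl`, are hypothetical and would need adjusting per comic.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ComicPageParser(HTMLParser):
    """Picks out the comic image and the "next comic" link from one page."""

    def __init__(self, image_id, next_rel):
        super().__init__()
        self.image_id = image_id      # assumed id of the comic <img>
        self.next_rel = next_rel      # assumed rel value of the "next" <a>
        self.image_src = None
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("id") == self.image_id:
            self.image_src = attrs.get("src")       # step 1: recognize the image
        elif tag == "a" and attrs.get("rel") == self.next_rel:
            self.next_href = attrs.get("href")      # step 3: find the next page


def fetch_bytes(url):
    """Download a URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def crawl(start_url, folder, fetch=fetch_bytes, image_id="comic", next_rel="next"):
    """Save each comic image into folder, following "next" links until they run out."""
    os.makedirs(folder, exist_ok=True)
    saved, seen, url = [], set(), start_url
    while url and url not in seen:                  # step 4: stop on no link or a loop
        seen.add(url)
        parser = ComicPageParser(image_id, next_rel)
        parser.feed(fetch(url).decode("utf-8", errors="replace"))
        if parser.image_src:
            image_url = urljoin(url, parser.image_src)
            name = os.path.basename(urlparse(image_url).path) or "image"
            path = os.path.join(folder, f"{len(saved) + 1:04d}_{name}")
            with open(path, "wb") as out:           # step 2: save to the folder
                out.write(fetch(image_url))
            saved.append(path)
        url = urljoin(url, parser.next_href) if parser.next_href else None
    return saved
```

Calling something like `crawl("http://example.com/comic/1", "comics")` (a made-up address) would start at that page and number the saved files in reading order. A "next" link that points back to a page already visited counts as the button doing nothing, so the loop ends there.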

For my own ease, I'd prefer a Mac-friendly way of doing this, but I can work with PC. I just may need some additional instructions for the PC side. I'm not too good with some of the technical aspects of PCs.

To let you know what I've tried, I've attempted to use the Firefox extension DownThemAll, but it's not really doing what I'd like it to do. If it's the only way, that's fine, but I may need a bit of assistance getting the extension to work.

The goal of all this is to eventually put these onto my iPhone or iPod Touch, and fill boring parts of the day. As I said, no profit or wide distribution. Purely my own amusement.

Help?

3DS Friend Code: 4398-9162-1823 ||| PSN: HoodieThirteen ||| XBL: Torn Hoodie ||| @hoodiethirteen

Posts

  • Barrakketh Registered User regular
    edited August 2009
    A combination of DownThemAll and AutoPager will probably do the job just fine. AutoPager is user-extensible so you can create rules for each individual comic, and once you load each page (it's basically appended to the current page) use DTA to download the images.

  • MagicToaster Registered User regular
    edited August 2009
    Wouldn't that eat up a lot of the web page's bandwidth?

  • Barrakketh Registered User regular
    edited August 2009
    MagicToaster wrote: »
    Wouldn't that eat up a lot of the web page's bandwidth?
    The same amount as doing things manually, just over a shorter period of time. You can narrowly specify which sections to load and which to omit via XPath (just like how you select the link). The penny-arcade.com comic page is 9KB, so if you just say that they started in 1998 and have been going at it for about 9 years while maintaining an output of three comics a week, that works out to 9 × 52 × 3 = 1,404 pages, or about 12.3 megabytes of plain HTML (which should be compressed, so in reality that number will be lower for bandwidth purposes).
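
For anyone who wants to redo the estimate with different numbers, the arithmetic is short enough to put in a few lines of Python. The 9KB page size, 9 years and three comics a week come from the post above; the 52 weeks per year and the 1024 KB per megabyte are assumptions made here to reproduce the 12.3 figure.

```python
PAGE_SIZE_KB = 9        # size of one comic page, from the post
YEARS = 9               # rough run length, from the post
WEEKS_PER_YEAR = 52     # assumption
COMICS_PER_WEEK = 3     # from the post

pages = YEARS * WEEKS_PER_YEAR * COMICS_PER_WEEK   # 1404 pages
total_kb = pages * PAGE_SIZE_KB                    # 12636 KB
total_mb = total_kb / 1024                         # assumes 1 MB = 1024 KB

print(f"{pages} pages, about {total_mb:.1f} MB of HTML")
# prints: 1404 pages, about 12.3 MB of HTML
```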

    Then add up all the images. You should really only allow DTA to download one image at a time to be polite. If it has a bandwidth limiter then I'd use that too and just be patient.
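
If you end up scripting the downloads instead of using DTA, the same politeness can be approximated by fetching one file at a time with a pause in between. The sketch below is not a true bandwidth limiter, just a fixed delay between requests; `download_politely`, `fetch_bytes` and the two-second default are all made up for illustration.

```python
import time
from urllib.request import urlopen


def fetch_bytes(url):
    """Download a URL and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def download_politely(urls, delay_seconds=2.0, fetch=fetch_bytes, sleep=time.sleep):
    """Fetch URLs strictly one after another, pausing between requests."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)    # wait before every request except the first
        results[url] = fetch(url)
    return results
```

The `fetch` and `sleep` parameters exist only so the function can be tried out without touching a real server.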

  • Jasconius sword criminal Flo-rida Registered User regular
    edited August 2009
    You scraping a site for images is not going to kill the server unless they are hosted on Tripod or something.

  • ascannerlightly Registered User
    edited August 2009
    Jasconius wrote: »
    You scraping a site for images is not going to kill the server unless they are hosted on Tripod or something.
    i <3 geocities

  • PracticalProblemSolver Registered User
    edited August 2009
    A simple *insert favorite scripting language here* script combined with wget would handle it much better than doing anything by hand. You just need to figure out how the page is written or how the images are named. If you can figure out the image naming scheme, it's best to skip the page loading and get the images directly.

    actually here's a program to do it for you, with 945 supported comics and the ability to define custom ones: http://collector.skumleren.net/supported_comics.php?version=devel

  • kathos Registered User
    edited August 2009
    Yeah downloading all those delicious cake pictures all at once into one folder really helps out a lot ;).

    Kekekekekeke.

  • Awk Registered User regular
    edited August 2009
    after ~10 years they're closing my geocities account! ;(

    the internets are changing!

  • Æthelred Registered User
    edited August 2009
    You're going to need some sort of macro software. I would recommend AutoHotkey, which I know for sure could do what you want with a little scripting, but you're on a Mac. Try QuicKeys, Keyboard Maestro or HotApp, although I haven't used any of them myself.

    Also, if it's a popular webcomic you're after, search for a torrent of it. I found ones for Penny-Arcade and just downloaded those a while ago.

    pokes: 1505 8032 8399
  • JNighthawk Registered User
    edited August 2009
    http://www.httrack.com/ - lets you download a full copy of a website.

    Game programmer
  • Ethea Registered User regular
    edited August 2009
    This is pretty easy using Python/Perl since the majority of webcomics name their images after the day they were posted. So you just keep changing the image request based on the day you want. This lets you grab all the images faster, since you skip loading the pages.
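
As a rough illustration of the date-based approach, the generator below builds one image URL per posting day. Everything specific in it is an assumption: the URL pattern is a placeholder, real comics each have their own naming scheme, and the Monday/Wednesday/Friday schedule is only a guess at what "three comics a week" looks like.

```python
from datetime import date, timedelta


def dated_image_urls(template, start, end, weekdays=(0, 2, 4)):
    """Yield one URL per posting day between start and end, inclusive.

    weekdays uses date.weekday() numbering (0 = Monday); the default
    Monday/Wednesday/Friday schedule is an assumption.
    """
    day = start
    while day <= end:
        if day.weekday() in weekdays:
            yield template.format(date=day)
        day += timedelta(days=1)


# Hypothetical naming scheme: images named YYYYMMDD.jpg
template = "http://example.com/comics/{date:%Y%m%d}.jpg"
for url in dated_image_urls(template, date(2009, 8, 3), date(2009, 8, 9)):
    print(url)
# prints:
# http://example.com/comics/20090803.jpg
# http://example.com/comics/20090805.jpg
# http://example.com/comics/20090807.jpg
```

Each generated URL can then be handed to wget or any downloader. Days where the comic skipped an update will simply return an error, which the downloading step needs to tolerate.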

  • hoodie13 Registered User regular
    edited August 2009
    Barrakketh wrote: »
    A combination of DownThemAll and AutoPager will probably do the job just fine. AutoPager is user-extensible so you can create rules for each individual comic, and once you load each page (it's basically appended to the current page) use DTA to download the images.

    Thanks a ton! This suggestion worked wonders. It took a little bit of effort to get AutoPager to work, but once I did, the process worked like a dream.

    Thanks a lot, guys!
