
I need to download a website. All of it.

powerss Registered User regular
edited September 2007 in Help / Advice Forum
I don't have FTP access, obviously. I'm looking for a Mac OS X application that will simply suck down a website and its links/linked files to a directory.

TIA.

powerss on

Posts

  • Jimmy King Registered User regular
    edited September 2007
    wget is what you need. I would think it comes with OS X, as it's a fairly standard *nix utility. Get to a command prompt and type 'man wget'. On GNU wget you use the -r flag to download recursively. I don't know if the flag will be the same on OS X or not, though. Sometimes they are, sometimes they aren't.
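
    A minimal sketch of the recursive download described above, assuming GNU wget is installed. The function name fetch_site and the example.com address are placeholders for illustration, not something from this thread:

    ```shell
    # Sketch: recursively download a site with GNU wget.
    #   -r   follow links recursively (default depth is 5 levels)
    #   -np  never climb above the starting directory
    #   -p   also fetch the images, CSS and scripts each page needs
    fetch_site() {
        wget -r -np -p "$1"
    }

    # Usage (example.com is a placeholder):
    #   fetch_site http://example.com/
    ```

    By default wget saves what it finds into a directory named after the host, under whatever directory you run it from.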

    Jimmy King on
  • powerss Registered User regular
    edited September 2007
    Jimmy,

    Here's what I get in the OSX Terminal:
    sXXXXo-sXXXXs-computer:~ SXXXXX$ man wget
    No manual entry for wget
    

    powerss on
  • Jimmy King Registered User regular
    edited September 2007
    Bah. Turns out OS X only comes with cURL, which is similar, but I don't think it provides the functionality you need. There are instructions for installing wget on your machine about halfway down the page here. I don't have a Mac handy to test them on, so hopefully these instructions are correct. It looks pretty straightforward.

    Jimmy King on
  • ApexMirage Registered User regular
    edited September 2007
    is there a pc version?

    ApexMirage on
    I'd love to be the one disappoint you when I don't fall down
  • Daedalus Registered User regular
    edited September 2007
    ApexMirage wrote: »
    is there a pc version?

    You mean a Windows version? Yeah, a Windows port of GNU wget is here: http://gnuwin32.sourceforge.net/packages/wget.htm

    (obviously, this type of thing comes preinstalled in most Linux distros)

    Daedalus on
  • Firebrand Registered User regular
    edited September 2007
    If you're a Firefox user, there's an extension called Scrapbook (https://addons.mozilla.org/en-US/firefox/addon/427). I'm not sure if it's Windows-only or if it's cross-platform.

    Firebrand on
  • Pheezer Registered User, ClubPA regular
    edited September 2007
    cURL will do what you need it to do.

    Pheezer on
    IT'S GOT ME REACHING IN MY POCKET IT'S GOT ME FORKING OVER CASH
    CUZ THERE'S SOMETHING IN THE MIDDLE AND IT'S GIVING ME A RASH
  • Jimmy King Registered User regular
    edited September 2007
    DrDizaster wrote: »
    cURL will do what you need it to do.
    Not unless I've missed something on the man page and in my testing. cURL will pull the HTML and whatnot from the URL specified. It doesn't grab images, .js files, or .css, and it doesn't follow links or create the directory structure for those links and files. That is what wget -r does. Obviously one could do all that manually or write a script to do it, but that seems like a bit of a waste when there are truly automated ways to do it already.

    If I've missed something and you've got the flags to make cURL grab all the content, though, post it on in here. It could come in handy.
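
    For comparison, here is a rough sketch of what a plain cURL fetch looks like: one URL in, one file out. The function name fetch_page and the example.com address are placeholders made up for illustration:

    ```shell
    # Sketch: save one URL to one local file with curl.
    #   -L  follow HTTP redirects
    #   -o  write the response body to the named file
    fetch_page() {
        curl -L -o "$2" "$1"
    }

    # Usage (example.com is a placeholder):
    #   fetch_page http://example.com/ index.html
    ```

    Nothing referenced by that page (images, stylesheets, linked pages) comes along with it, which is the gap wget -r fills.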

    Jimmy King on
  • Legoman05 Registered User regular
    edited September 2007
    Does wget grab PHP pages as well? Such as this forum, with all the ?do=newreply&treadcount=20 silliness?

    Legoman05 on
  • Jimmy King Registered User regular
    edited September 2007
    Legoman05 wrote: »
    Does wget grab PHP pages as well? Such as this forum, with all the ?do=newreply&treadcount=20 silliness?
    In theory, yes, but there seems to be something more to it that I'm not immediately grasping.

    For example, I can pull my entire website and get everything. While it's Perl CGI rather than PHP, it functions in the same way: the main script has the same name, and then there are just parameters after the name to tell it what to show. I can also do it on a phpBB board that I run. When I do it on http://forums.penny-arcade.com/index.php, though, I just get the index.php and nothing else. I'm guessing it's something authentication and/or cookie related.

    Jimmy King on
  • Jimmy King Registered User regular
    edited September 2007
    powerss, I realized that you might want to use the -k or -m options with wget, depending on what you need to do. -k (--convert-links), used along with -r, pulls the site down and rewrites the links in the saved pages so that they work locally. -m (--mirror) turns on recursion with unlimited depth plus timestamp checking, so re-running it only fetches what has changed, which is what you want for keeping a mirror copy.
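
    A short sketch of both variants, assuming GNU wget, where -k is --convert-links and -m is --mirror (shorthand for -r -N -l inf --no-remove-listing). The function names and the example.com address are placeholders for illustration:

    ```shell
    # Sketch: copy a site for offline browsing.
    #   -r  recurse through links
    #   -p  fetch images, CSS and scripts needed to display each page
    #   -k  after downloading, rewrite links so they point at the local files
    browse_offline_copy() {
        wget -r -p -k "$1"
    }

    # Sketch: keep a mirror copy.
    #   -m  recursion with unlimited depth plus timestamp checking,
    #       so a re-run only re-downloads files that changed
    mirror_site() {
        wget -m "$1"
    }

    # Usage (example.com is a placeholder):
    #   browse_offline_copy http://example.com/
    #   mirror_site http://example.com/
    ```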

    Jimmy King on
  • Apothe0sis Have you ever questioned the nature of your reality? Registered User regular
    edited September 2007
    wget has options for handling both cookies and authentication.
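
    A hedged sketch of how that might look with GNU wget. The login URL, the form field names, the cookies.txt file name and the function names are all hypothetical; a real forum's login form will have its own field names, and this has not been tested against this forum:

    ```shell
    # Sketch: log in through a web form and keep the session cookie.
    #   --save-cookies          write cookies to a file when wget exits
    #   --keep-session-cookies  include cookies that have no expiry time
    #   --post-data             send the given string as a POST body
    #   -O /dev/null            throw away the login page itself
    login_and_save_cookies() {
        wget --save-cookies cookies.txt --keep-session-cookies \
             --post-data "$2" -O /dev/null "$1"
    }

    # Sketch: recursive download that sends the saved cookies back.
    fetch_with_cookies() {
        wget --load-cookies cookies.txt -r -np -p "$1"
    }

    # Sketch: recursive download from a site protected by HTTP authentication.
    # Note that a password given on the command line is visible to other
    # users of the machine through the process list.
    fetch_with_http_auth() {
        wget --user="$2" --password="$3" -r -np -p "$1"
    }

    # Usage (every value here is a placeholder):
    #   login_and_save_cookies http://example.com/login.php 'user=alice&password=secret'
    #   fetch_with_cookies http://example.com/
    #   fetch_with_http_auth http://example.com/ alice secret
    ```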

    Apothe0sis on