Alright, I've tried googling this. Apparently Google either thinks I'm trying to make AJAX listings with PHP, or it points me to Linux tutorials using a bash shell which I don't have in Windows.
What I'd like to do is save all these pages in order, starting with
The list is finite as Halo 2 is no longer playable online, so no new pages or anything will be showing up here. I need to download them since I'll be changing my gamertag, and the result of doing that may delete these listings.
So short of using Firefox's "Save Page" 267 times or installing Ubuntu again, is there a really nice way to grab all those pages and download them sequentially in Windows?
Dump the first URL into Excel, drag it down to create a list of all 267 URLs, and save it as a text file or HTML file (for the HTML one you'll probably need a find/replace to wrap each URL in an "a href=" tag so the links generate properly).
Then open the list with a download manager like Flashget to download them all.
*Edit*
In fact, here you go - http://dl.dropbox.com/u/4197966/Book1.htm - just drop that list into a download manager
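(Editor's note: if Excel isn't handy, the same list can be generated straight from a .bat instead. A rough sketch, where the URL pattern is a made-up stand-in and not the real Bungie.net query string:

@echo off
rem Write 267 numbered URLs to urls.txt; the caret escapes the & so cmd
rem doesn't treat it as a command separator.
if exist urls.txt del urls.txt
for /L %%i in (1,1,267) do >>urls.txt echo http://www.example.com/playergamehistory.aspx?gamertag=SomeTag^&page=%%i

The resulting urls.txt can be fed to Flashget or any other download manager, same as the Excel-built list.)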
Find a Windows version of curl, run
edit: here's a Windows binary
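(Editor's note: the exact command isn't preserved above, but for reference, curl can fetch a numbered range by itself using URL globbing; something along these lines, again with a placeholder URL rather than the real one:

curl -o "page_#1.html" "http://www.example.com/playergamehistory.aspx?gamertag=SomeTag&page=[1-267]"

The [1-267] range expands into 267 requests, the #1 in the output name is replaced with the current number, and double-quoting the URL keeps cmd from choking on the & and ?.)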
Grabbed that, ran it.
It spits out a bunch of HTML in the command line (which looks like the correct page I'm looking for) and then spits out

- bunch of html that looks correct -
</div>
</body>
</html>
'sg' is not recognized as an internal or external command, operable program or batch file.
'ctl00_mainContent_bnetpgl_recentgamesChangePage' is not recognized as an internal or external command, operable program or batch file.
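(Editor's note on those errors: an unquoted & on the Windows command line is a command separator, so cmd cuts the URL at each & and tries to run the leftover pieces of the query string as if they were programs, which is exactly the "not recognized" output above. A minimal illustration with a dummy URL:

rem Unquoted, cmd splits this at the & and treats what follows as a separate command:
curl http://www.example.com/page.aspx?a=1&b=2
rem Quoting the URL (or escaping the & as ^&) gets it to curl intact:
curl "http://www.example.com/page.aspx?a=1&b=2"
curl http://www.example.com/page.aspx?a=1^&b=2
)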
So I was like "wait, I think the & is messing it up" and converted all the &s and ?s to their percent-encoded equivalents, then saved it as a .bat to run:
Output:
It's now downloading the pages, but apparently I'm an idiot and don't know how to get percent-encoding through the Windows command line, because it's now obviously stripping the % and the first character that comes after it and leaving the rest, resulting in 404s. (I've reduced the number of pages it attempts to pull down while debugging this, so don't worry about that part.)
What am I doing wrong here?
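(Editor's note, since the thread never spells out the final fix: one usual way around both problems in a .bat is to skip the percent-encoding entirely and double-quote the URL, because inside quotes & stops acting as a command separator. The % stripping is a separate batch-file quirk: cmd reads the %2 in %26 as batch parameter number two, which is empty, so only the 6 survives; a literal percent sign in a .bat has to be written %%. A sketch of the loop with a placeholder URL standing in for the real one:

@echo off
rem Double quotes keep & and ? intact, so the URL needs no percent-encoding.
rem %%i is the loop counter; any literal % in a .bat must be doubled to %%.
for /L %%i in (1,1,267) do curl -o page%%i.html "http://www.example.com/playergamehistory.aspx?gamertag=SomeTag&page=%%i"
)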
Seems to be working! Checked the output of a smaller batch and it's matching up with what I see on Bungie.net, so we can consider this solved as I slowly grab 267 pages. Thanks for the help Echo.
Spam, I'll save your strategy for times when I can't download or access curl. Thank you for that suggestion.