
So I want to download Wikipedia...

Ein California Registered User regular
edited June 2007 in Help / Advice Forum
I'm in a class right now in a building on my campus devoid of wireless, and I usually take notes on my laptop. The lack of internet is a real pain for me, because I find myself intensely wishing I could get onto something like wikipedia while I was there - the professor asks a lot of trivia-esque questions about the legal system that nobody knows, and it'd be great if I could produce some sort of answers for him while I was there during the lectures.

To that end, I'm trying to figure out how to download wikipedia. They seem to have the facilities set up for an individual to do so:

http://download.wikimedia.org/

But when I get this far I find myself stumped. I am assuming that the XML data is of no particular use to me, considering that I am fairly computer-illiterate. However, the 'static html dumps' option seems ideal, assuming that it simply downloads a copy of each page to my hard drive.

I get myself to here...

http://static.wikipedia.org/downloads/April_2007/en/

For the English download, and I am confronted with a bunch of .7z files and .lsts. This is where I get confused. Presumably one could (and I have) download the .7z files and unzip them as an archive, but when I tried that I only got as far as articles with the letter B - nothing beyond that seemed to unzip. I also have no idea what these .lst files are there for.

Can you guys help me out? It'd be great to have access to something as expansive as wikipedia offline.

Ein on

Posts

  • AngelHedgie Registered User regular
    edited June 2007
    I'd look for the Wikipedia CD project, myself. While it won't have everything, it'll have most major articles.

    AngelHedgie on
    XBL: Nox Aeternum / PSN: NoxAeternum / NN:NoxAeternum / Steam: noxaeternum
  • anable North Texas Registered User regular
    edited June 2007
    There are also utilities that allow you to download entire sites. Try checking download.com for something like that. I would suggest finding one that allows you to download the html without images because the Wikipedia database is huge, and I'm not sure if you want this thing taking up dozens of gigabytes of your laptop's hard drive.

    anable on
  • Ein California Registered User regular
    edited June 2007
    Well, the laptop is my school machine - it's just a way to take notes and all of that - and I have about 40 gigs free. I've seen the Wikipedia CD project, and it's nice, but 2,500 choice articles probably don't cover the stuff we talk about in the lecture - for example, specific legal cases from the early 1900s. I can find the material I think I need to download, I just can't quite figure out how to operate it once it's been downloaded.

    Ein on
  • Janin Registered User regular
    edited June 2007
    .lst files are merely lists of which files are contained in what archives. .7z files are 7-Zip archives.
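
    If that is right, the lists can tell you which archive to unpack for a given article, so you don't have to extract everything. A rough sketch in Python, assuming each .lst is plain text with one file path per line and that titles use underscores instead of spaces (I haven't confirmed either for this dump). The function names are made up for illustration, and the command assumes the 7z command-line tool is installed:

```python
from pathlib import Path


def find_archives(lst_dir, article):
    """Map each .lst file name to the lines in it that mention the article.

    Assumption: .lst files are plain text, one archived path per line,
    with spaces in titles written as underscores.
    """
    needle = article.replace(" ", "_").lower()
    hits = {}
    for lst in sorted(Path(lst_dir).glob("*.lst")):
        with open(lst, encoding="utf-8", errors="replace") as fh:
            matches = [line.strip() for line in fh if needle in line.lower()]
        if matches:
            hits[lst.name] = matches
    return hits


def extract_command(archive, out_dir):
    """Build the 7-Zip command that unpacks `archive` into `out_dir`.

    `x` extracts with full paths, `-o` sets the output folder,
    `-y` answers yes to any prompts.
    """
    return ["7z", "x", str(archive), f"-o{out_dir}", "-y"]
```

    Call find_archives on the folder holding the .lst files with whatever title you're after, then hand the list from extract_command to subprocess.run for the matching .7z. Which .7z goes with which .lst is something you'd have to check by name on the download page.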

    Janin on
  • EggyToast Jersey City Registered User regular
    edited June 2007
    He asks the questions because he doesn't know the answer and is wondering if the students know? Or he's asking the question to see if anyone is familiar with something he's going to say?

    Lots of professors ask questions without the expectation for students to answer. I'm curious if the professor really wants one kid to read from Wikipedia whenever he asks an obscure legal question.

    EggyToast on
    || Flickr — || PSN: EggyToast
  • Ein California Registered User regular
    edited June 2007
    EggyToast wrote: »
    He asks the questions because he doesn't know the answer and is wondering if the students know? Or he's asking the question to see if anyone is familiar with something he's going to say?

    Lots of professors ask questions without the expectation for students to answer. I'm curious if the professor really wants one kid to read from Wikipedia whenever he asks an obscure legal question.

    There are a lot of conversations that he basically concludes by pointing at me with my laptop, since I am the only one in there with one, and saying "I wish we had the internet so we could find out a bit more about _______, but...". I'm just trying to pleasantly surprise him once or twice. It's not like I'll be lording over my laptop pretending to know everything - I'm just going to let him know that if he wants information on something, I'm handy.

    Ein on
  • onceling Registered User regular
    edited June 2007
    So just out of curiosity, if you downloaded the HTML files, all of them, how much space does it take up?

    onceling on
  • Fristle Registered User regular
    edited June 2007
    anable wrote: »
    There are also utilities that allow you to download entire sites. Try checking download.com for something like that. I would suggest finding one that allows you to download the html without images because the Wikipedia database is huge, and I'm not sure if you want this thing taking up dozens of gigabytes of your laptop's hard drive.

    The site's host probably wouldn't appreciate you downloading everything this way, because it would take much more bandwidth than if you downloaded compressed archives.

    Fristle on
  • Janin Registered User regular
    edited June 2007
    onceling wrote: »
    So just out of curiosity, if you downloaded the HTML files, all of them, how much space does it take up?

    About 7.3 gigs, compressed. I didn't bother to figure out the uncompressed amount, but it's a lot.
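
    If you want to sanity-check it against your free space before unpacking, here's a quick sketch. The 7.3 is the compressed figure above and 40 is the free space mentioned earlier in the thread; the expansion ratio is purely a guess on my part, since I don't know the real one, and the function names are just for illustration:

```python
import shutil


def estimate_fit(compressed_gb, expansion_ratio, free_gb):
    """Return (estimated unpacked size in GB, whether it fits in free_gb).

    expansion_ratio is a guess at how much the archives grow when
    unpacked; it is not a measured value for this dump.
    """
    estimated = compressed_gb * expansion_ratio
    return estimated, estimated <= free_gb


def free_space_gb(path="."):
    """Free space on the drive holding `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


# Try a few guessed ratios against 40 GB free.
for ratio in (3, 5, 10):
    size, ok = estimate_fit(7.3, ratio, 40)
    print(f"ratio {ratio}: about {size:.1f} GB, fits: {ok}")
```

    Swap the 40 for free_space_gb() to use the real number from your own drive. If even the low guesses come out close to the limit, unpack only the archives you actually need.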

    Janin on
  • Lewisham Registered User regular
    edited June 2007
    http://users.tkk.fi/~tkarvine/tero-dump/

    From

    http://en.wikipedia.org/wiki/Wikipedia:Database_download

    Also, the Wikipedia CD is http://en.wikipedia.org/wiki/Wikipedia:Wikipedia-CD/Download

    Seriously, this was all on Wikipedia by just googling "Download wikipedia" :)

    Lewisham on