
Help me relearn statistics

AresProphetAresProphet Registered User regular
edited August 2007 in Help / Advice Forum
I somehow got a 5 on the AP stats test back in high school, and I can't remember a damn thing from the class. Mostly, I just need to know how to solve a specific type of problem (it's not in any way school-related, nor is it gambling-related despite the example), though a site that gives a general overview of stats math would be really nice. The problem is simplified below.

Say I'm playing a shell game, where someone puts a coin under one of three shells at random, shuffles them, then I get to guess which one has the coin under it. The probability of being right is p = 1/3 (about .33), that's easy. I can also figure out the probability of winning/losing a certain number of times in a row (p^n and [1-p]^n). What I can't remember is how to calculate the probability of getting a certain number of guesses right out of a fixed number of tries n. How can I calculate the probability of winning exactly 4 out of 15? Or more than 3 out of 10? Or fewer than five out of twenty?

I should remember this, and I don't, and it's embarrassing.


Posts

  • Options
    GdiguyGdiguy San Diego, CARegistered User regular
    edited August 2007
    The most understandable way: the probability of getting success on 4 specific trials (say, trials 1, 2, 3, and 4) and failure on the others is simply p^4 * (1-p)^(15-4) (which is just the probability of the trial 1 outcome * the probability of the trial 2 outcome * ... and so on, so you'll have 4 p's and 11 (1-p)'s).

    The overall chance of getting 4 successes on any combination of trials, then, is that probability multiplied by the number of 4-element subsets of the 15 trials, which is the n choose k binomial coefficient (n! / ((n-k)! k!)). The link below explains it a bit more, but basically you have 15*14*13*12 possible ordered arrangements if you choose without replacement (for the trials that were successful, the first could be any of them, the second is any but the one you already picked, etc.), and then you have to divide by 4! = 24, the number of ways to order the 4 chosen trials, because in your example you don't care whether trial 1 or trial 2 was chosen for success first (i.e., success on trials 1, 3, 5, and 7 is the same as 7, 5, 3, 1).

    So it's 15! / (11! * 4!) * p^4 * (1-p)^(15-4) = 1365 * p^4 * (1-p)^11

    (http://en.wikipedia.org/wiki/Binomial_coefficient)
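A minimal Python sketch of the formula above, for anyone who would rather check it numerically. The function name `binom_pmf` is made up for this example, and p = 1/3 is taken from the shell game in the original question.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    # comb(n, k) is n! / ((n-k)! k!), the number of ways to choose
    # which k of the n trials are the successes.
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 1 / 3                      # shell game: one coin, three shells
ways = comb(15, 4)             # 1365 ways to pick the 4 winning trials
exactly_4_of_15 = binom_pmf(4, 15, p)

print(ways)
print(exactly_4_of_15)
```

With p = 1/3 this should come out to a bit under 0.2, i.e. 1365 * 2^11 / 3^15.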

    Something like "more than 3 out of 10" is usually a pain in the ass - the only way I really know of calculating it exactly is just to sum up the probabilities of 4, 5, 6, ... up to 10 (or take 1 minus the sum of the probabilities of 0, 1, 2, and 3 if that's easier to calculate, which is the equivalent problem).
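The summing approach is tedious by hand but short in code. This is only a sketch: `binom_pmf` and `binom_cdf` are names invented here, and p = 1/3 again comes from the shell game example.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

p = 1 / 3

# "More than 3 out of 10" is 1 minus "3 or fewer out of 10".
more_than_3_of_10 = 1 - binom_cdf(3, 10, p)

# "Fewer than 5 out of 20" is the same as "4 or fewer out of 20".
fewer_than_5_of_20 = binom_cdf(4, 20, p)

print(more_than_3_of_10)
print(fewer_than_5_of_20)
```

Note the off-by-one traps: "more than 3" starts at 4, and "fewer than 5" stops at 4.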

  • Options
    GoodOmensGoodOmens Registered User regular
    edited August 2007
    You might also want to check http://faculty.vassar.edu/lowry/binomialX.html, which does the calculations for you, if you're just interested in getting an answer quickly.

  • Options
    AresProphetAresProphet Registered User regular
    edited August 2007
    GoodOmens wrote: »
    You might also want to check http://faculty.vassar.edu/lowry/binomialX.html, which does the calculations for you, if you're just interested in getting an answer quickly.

    That page is incredibly helpful, thanks. I didn't think the calculations would be so messy to do by hand. Shows what I remember....

    There's one other thing I know stats can help with, but I can't remember this either. Say I have a set of data where I don't know the probability of something happening, and I want to get a decent guess of it. I'll use some actual data I've collected for this example, with an unknown p:

    6 out of 29 (.207)
    10 out of 42 (.238)
    14 out of 50 (.280)
    8 out of 27 (.296)
    5 out of 15 (.333)

    Total: 43 out of 163 (.264)

    Let's say I want to assume that p = .25 for this data. What are the odds that my data could show a proportion of .264 purely through random chance when the real p = .25? What if I assume p = .26? Or .30?
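One way to put a number on this question, sketched under the assumption that all 163 runs are independent with the same p, is to compute exact binomial tail probabilities for 43 successes out of 163 under each candidate p. The helper names below are invented for the example, and treating the five batches as one pooled sample is itself an assumption.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def lower_tail(k: int, n: int, p: float) -> float:
    """Probability of k or fewer successes in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

n, k = 163, 43  # pooled totals from the five batches

for p0 in (0.25, 0.26, 0.30):
    # If the true p were p0, how often would we see a count at least
    # this high, or at least this low, just by chance?
    print(p0, upper_tail(k, n, p0), lower_tail(k, n, p0))
```

If neither tail is small for a given p0 (a common but arbitrary cutoff is 0.05), the data give no real reason to rule that p0 out.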

  • Options
    senor_xsenor_x Registered User regular
    edited August 2007
    You may try looking more into Normal Distributions. They're a little more complicated to set up, but all the calculations are tabulated and easy once you normalize everything. For your second post, it looks like you're getting into Random Variable territory. You can find some Confidence Intervals and perform some Hypothesis Testing on the means and so on to get a sense of the "goodness" of the data. Considering that I've taken two college Statistics & Probability for Engineers courses and one refresher course for work, I should be able to provide more explicit guidance, but I'm a systems engineer now and haven't calculated anything in a non-academic setting in seven years.
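As a rough sketch of what the normal-distribution route looks like for the 43-out-of-163 data: convert the observed proportion to a z-score under an assumed p0, then look up the tail area. Here the lookup uses `math.erf` instead of a printed table; the function names are made up for this example, and the normal approximation is only reasonable when n*p0 and n*(1-p0) are both fairly large.

```python
from math import erf, sqrt

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def z_test_proportion(k: int, n: int, p0: float) -> tuple[float, float]:
    """Return (z, two-sided p-value) for k successes in n trials,
    testing against an assumed true proportion p0."""
    p_hat = k / n
    # Standard deviation of the sample proportion if p0 were true.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - normal_cdf(abs(z)))
    return z, p_value

z, p_value = z_test_proportion(43, 163, 0.25)
print(z, p_value)
```

For p0 = .25 the z-score is well under 1, so the observed proportion sits comfortably inside ordinary chance variation.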

  • Options
    AresProphetAresProphet Registered User regular
    edited August 2007
    senor_x wrote: »
    You may try looking more into Normal Distributions. They're a little more complicated to set up, but all the calculations are tabulated and easy once you normalize everything. For your second post, it looks like you're getting into Random Variable territory. You can find some Confidence Intervals and perform some Hypothesis Testing on the means and so on to get a sense of the "goodness" of the data. Considering that I've taken two college Statistics & Probability for Engineers courses and one refresher course for work, I should be able to provide more explicit guidance, but I'm a systems engineer now and haven't calculated anything in a non-academic setting in seven years.

    This is what I vaguely remember having to do for this kind of problem: if the true mean (the peak of the curve) is a certain number, I can make assumptions about the shape of the curve (calculate the standard deviation) and then get a confidence interval that includes the mean I got from my data (which has a rather small sample size).

    I just don't remember how to go about doing it. I'm doing a lot of Googling on the subject, and I can't remember what the hell all the variables in stats are supposed to be.

    Edit: whipped out my old TI-83+ to help with this. If anyone knows how to do what I need on this, that'd work too.

    Edit 2: there's another way to think of this problem. The data is from running multiple instances of a binary distribution. I only have data collected from 5 points in this test (after 29, 71, 121, 148, and 163 runs) but each one was done independently. So it's like a coin toss with a weighted coin that lands on one side more often than the other. I'm trying to calculate just how weighted the coin might be, based on my data; I want to be able to assume anything from a 1:9 to a 4:6 ratio of tosses (p between .1 and .4).
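For "how weighted might the coin be", a confidence interval on the proportion is the usual tool. The sketch below computes two common 95% intervals for 43 out of 163: the simple Wald interval (p_hat plus or minus z times the standard error) and the Wilson score interval, which tends to behave better for smaller samples. The function names and the choice of z = 1.96 (a two-sided 95% level) are assumptions made for this example.

```python
from math import sqrt

Z_95 = 1.96  # assumed: two-sided 95% confidence level

def wald_interval(k: int, n: int, z: float = Z_95) -> tuple[float, float]:
    """Simple normal-approximation interval around k/n."""
    p_hat = k / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def wilson_interval(k: int, n: int, z: float = Z_95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    margin = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - margin, center + margin

wald = wald_interval(43, 163)
wilson = wilson_interval(43, 163)
print(wald)
print(wilson)

# Which candidate values of p are consistent with the data?
low, high = wilson
for p0 in (0.10, 0.25, 0.30, 0.40):
    print(p0, low <= p0 <= high)
```

On these numbers both intervals run from roughly .20 to roughly .33, so .25 and .30 are plausible while .10 and .40 are not.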
