Hey folks,
I'm currently working on a custom random loot generator for my DnD group. I want my loot drops to follow a bell-shaped curve, so I'm using an RNG to generate a number from 0 to 50 twice and from 1 to 50 twice, adding all four numbers together, and dividing by 2 (to normalize the distribution). From this I know I can calculate the standard deviation (about 15) and use the 68-95-99.7 percent rule to spit-ball about how often a range of numbers will come up.
However, I'd like to get a better idea of how the probabilities break down more finely (say, how often a 10 will come up). Does anyone know how to do this, or can you point me in the direction of a utility that can help? It doesn't have to be anything super accurate; a rough estimate is good enough for my purposes.
There are programs that can work this out for you but the only ones I am aware of are too expensive to consider for this unless you're dealing with heavy mathematical lifting every day. If you have a friend who's good with math you might be able to convince them to section your bell curve up in 10% increments in exchange for lunch.
edit: It might actually be easier to use the CDF. Depends on how you look at it I guess.
0431-6094-6446-7088
I think I see how this works, but that only works if your random variable is actually or effectively a whole number, yes?
Well, no.
You can measure the height of a graph at any point, whether that point is an integer or not. The X-axis is a number line, and the Y-axis shows the probability of each of those possibilities. The only difference between whole numbers and a continuous spectrum on the X-axis is that you need calculus to confirm that the area under the curve is equal to 1. Measuring the height of the graph is a simple operation either way.
First, if you just want a bell-shaped curve, and don't specifically need it to behave like the sum of four uniformly distributed integers, you can use the NORMDIST function in Excel to estimate the probabilities. You need to set the "cumulative" argument to TRUE. Then, to find out what fraction of the time you will get, say, between 10 and 11, you can do '=NORMDIST(11,mean,stdev,TRUE)-NORMDIST(10,mean,stdev,TRUE)'. This is in Excel 2003.
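For anyone without Excel, here is a rough Python equivalent of that NORMDIST subtraction. It is only a sketch: the function name prob_between is made up for illustration, and the mean of 50.5 and standard deviation of 14.58 are assumed values for the OP's roll, so plug in your own.

```python
from statistics import NormalDist

# Assumed parameters for the OP's (0-50, 0-50, 1-50, 1-50) / 2 roll.
MEAN = 50.5
STDEV = 14.58

def prob_between(lo, hi, mean=MEAN, stdev=STDEV):
    """Normal-curve estimate of the chance of landing between lo and hi.

    Mirrors NORMDIST(hi, mean, stdev, TRUE) - NORMDIST(lo, mean, stdev, TRUE).
    """
    dist = NormalDist(mean, stdev)
    return dist.cdf(hi) - dist.cdf(lo)

# Fraction of the time the bell curve puts you between 10 and 11.
print(round(prob_between(10, 11), 5))
```

This treats the roll as a smooth curve, so it is an approximation of the real four-number sum, not an exact answer.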
Second, if you do want it to behave like integers, as you specified, just simulate it. You can use Excel to approximate this by putting a formula on each row to reflect your sum (i.e. '=(INT(RAND()*51)+INT(RAND()*51)+INT(RAND()*50)+1+INT(RAND()*50)+1)/2'), filling it down for 10,000 rows, then just counting what fraction of the time you have a 10 or 10.5. This kind of thoughtless simulation is generally my solution to all statistical problems.
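The same brute-force simulation can be sketched in Python instead of a spreadsheet. The names roll and simulate are hypothetical, and the 10,000 trials just match the row count suggested above; this is one way to do it, not the only one.

```python
import random
from collections import Counter

def roll(rng):
    """One loot roll: two 0-50 draws plus two 1-50 draws, halved."""
    total = (rng.randint(0, 50) + rng.randint(0, 50)
             + rng.randint(1, 50) + rng.randint(1, 50))
    return total / 2

def simulate(trials=10_000, seed=None):
    """Return {outcome: fraction of trials}, sorted by outcome."""
    rng = random.Random(seed)
    counts = Counter(roll(rng) for _ in range(trials))
    return {value: n / trials for value, n in sorted(counts.items())}

if __name__ == "__main__":
    freq = simulate()
    # Fraction of rolls that came up 10 or 10.5.
    print(freq.get(10.0, 0) + freq.get(10.5, 0))
```

Expect the answer to wobble from run to run, since rare outcomes only show up a handful of times in 10,000 trials. Raising trials smooths that out.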
Though mathematically true, you can still simply measure the height of the graph to get the information the OP wants.
For one thing, the OP is not asking about calculus, just how to get the probability of a given event occurring, and measuring the height of the graph at the point he's looking for will work regardless of whether he's dealing with a discrete or continuous variable.
Secondly, he's dealing with a discrete variable, so even if he wanted to know literally everything there was to know about this particular problem, continuous variables would be irrelevant anyway.
In the long term, a normal distribution will be relatively predictable, but if you only hand out loot fewer than a dozen times you still might get wild results. If you hand out loot at least 25 to 30 times it should be fine, though.
Nintendo ID: Pastalonius
Smite\LoL:Gremlidin \ WoW & Overwatch & Hots: Gremlidin#1734
3ds: 3282-2248-0453
Number Frequency Proportion
4 1 0.0001
5 6 0.0006
6 18 0.0018
7 19 0.0019
8 35 0.0035
9 40 0.004
10 68 0.0068
11 81 0.0081
12 93 0.0093
13 139 0.0139
14 158 0.0158
15 224 0.0224
16 254 0.0254
17 282 0.0282
18 355 0.0355
19 391 0.0391
20 438 0.0438
21 478 0.0478
22 469 0.0469
23 485 0.0485
24 519 0.0519
25 560 0.056
26 535 0.0535
27 506 0.0506
28 508 0.0508
29 469 0.0469
30 439 0.0439
31 401 0.0401
32 388 0.0388
33 306 0.0306
34 279 0.0279
35 249 0.0249
36 193 0.0193
37 161 0.0161
38 120 0.012
39 96 0.0096
40 85 0.0085
41 54 0.0054
42 33 0.0033
43 27 0.0027
44 15 0.0015
45 15 0.0015
46 6 0.0006
47 2 0.0002
SUM 10000 1
So the answer to the question I was asking is yes.
No. Measuring the area under the curve is how you get the probability of an outcome, regardless of whether it's continuous or discrete. The big difference is that a discrete variable has a pre-set width, but a continuous variable has a variable width. Either way, you are still measuring the area under the curve.
Important note: I'm going to give you the number of ways to get sums between 4 and 202 (i.e., the dice go from 1-50 and 1-51 instead of 1-50 and 0-50, and I'm not dividing by 2). This is just to make the formulas cleaner. To get a probability, translate my sum to yours (subtract 2 then divide by 2), then divide by the total number of outcomes, (50^2)(51^2) = 6,502,500.
For x between 4 and 53, the number of outcomes is (x-1)(x-2)(x-3)/6
For x = 54, the number of outcomes is (x-1)(x-2)(x-3)/6 - 2 = 23424
For x = 55 to 103, the number of outcomes is (x-1)(x-2)(x-3)/6 - (x-52)(x-53) - 4*(x-52)*(x-53)*(x-54)/6
To figure out how many outcomes have a 51, just note that if one of the dice is a 51, then the rest of the dice have to sum to x-51. Thus, the question becomes: how many ways are there for 3 dice to sum to x-51? Again, using the stars and bars method, we see that this is (x-52)(x-53)/2. Normally, we would multiply this number by 4 to get the number of outcomes that have a 51 (because there are 4 dice that could get a 51), and that number is included in the positive part of the formula. However, since 2 dice do not have a 51, we need to subtract out that number times 2, or (x-52)(x-53). This is a generalization of the step where we subtracted 2 in the formula where x = 54.
To figure out how many outcomes have a die with a number higher than 51, we again translate the problem. Instead of asking, for example, how many ways there are for 4 positive integers to sum to 58 with exactly one being higher than 51, instead ask how many ways there are for 4 positive integers to sum to 7 (58-51), and multiply this number by 4. These turn out to be the same because for every outcome that sums to 7 (for example, {1,3,2,1}), there are 4 ways to turn it into an outcome that sums to 58 by adding 51 to one of the dice ({52,3,2,1}, {1,54,2,1}, {1,3,53,1}, and {1,3,2,52}). As before, the number of ways for 4 positive integers to sum to 7 is (7-1)(7-2)(7-3)/6, so the number of ways to get 58 with at least 1 being higher than 51 is 4(58-52)(58-53)(58-54)/6.
For x > 103, find the value for (206-x) and it will be the same. For example, the number of ways to get 202 is the same as the number of ways to get (206-202) = 4, or exactly 1 way.
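If you'd rather not trust the algebra, the piecewise formulas above can be checked against a plain count of every combination. This is a sketch under the same setup (two dice numbered 1-50, two numbered 1-51, no dividing by 2); the function names ways_formula and ways_brute_force are made up for illustration.

```python
def ways_formula(x):
    """Number of ways two 1-50 dice and two 1-51 dice sum to x."""
    if not 4 <= x <= 202:
        return 0
    if x > 103:
        x = 206 - x                  # symmetry: x and 206 - x match
    total = (x - 1) * (x - 2) * (x - 3) // 6
    if x >= 54:
        # Outcomes that would need a 51 on one of the two 50-sided dice.
        total -= (x - 52) * (x - 53)
        # Outcomes that would need some die to show more than 51.
        total -= 4 * (x - 52) * (x - 53) * (x - 54) // 6
    return total

def ways_brute_force():
    """Count every sum by adding the four dice one at a time."""
    counts = {0: 1}
    for sides in (50, 50, 51, 51):
        nxt = {}
        for s, n in counts.items():
            for face in range(1, sides + 1):
                nxt[s + face] = nxt.get(s + face, 0) + n
        counts = nxt
    return counts

brute = ways_brute_force()
print(all(ways_formula(x) == brute[x] for x in range(4, 203)))

# Probability of the OP's value v: look up the sum x = 2*v + 2.
total_outcomes = 50**2 * 51**2
print(ways_formula(2 * 10 + 2) / total_outcomes)
```

The brute-force count is also a perfectly good way to get the whole table in one go if you don't care about having a closed formula.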
Based on what your OP has, I'm going to guess that you have 100 items in your table that you want to draw from, and are approximating the normal distribution from the sum of four discrete uniform draws (see: central limit theorem). The distribution of (U[1,50]+U[1,50]+U[0,50]+U[0,50])/2 (where U[a,b] is a uniform draw of an integer between a and b inclusive) has a mean of 50.5 and a standard deviation of 14.58. To estimate the probability of an outcome x (the set of outcomes being the integers from 1 to 100 inclusive and all half values in between; note that this is 199 values), calculate the z-scores for x-.25 and x+.25, evaluate the cumulative distribution function (CDF) of a standard normal distribution at each or look them up in a table, then take the difference between them.
Example: What is the approximate probability that I roll a 40? My z-scores are (39.75-50.5)/14.58 = -0.737 and (40.25-50.5)/14.58 = -0.703. The CDF values are about .2305 and .2410, and the difference, about .0105, is the probability of rolling a 40. This value is going to be a little inaccurate since there's about .0006 in the tails that isn't covered, but it's a fair approximation if you want something quick and dirty.
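The z-score recipe can be run for any outcome with a few lines of Python rather than a lookup table. A minimal sketch: approx_prob is a hypothetical helper, 14.58 is treated as the standard deviation, and the 0.25 half-step comes from the outcomes being 0.5 apart.

```python
from statistics import NormalDist

MEAN, STDEV = 50.5, 14.58   # 14.58 is used as the standard deviation
HALF_STEP = 0.25            # outcomes are 0.5 apart, so each owns +/- 0.25

def approx_prob(x):
    """Normal approximation to the chance of rolling exactly x."""
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    z_lo = (x - HALF_STEP - MEAN) / STDEV
    z_hi = (x + HALF_STEP - MEAN) / STDEV
    return std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

print(f"z-scores for 40: {(39.75 - MEAN) / STDEV:.3f}, {(40.25 - MEAN) / STDEV:.3f}")
print(f"approximate chance of a 40: {approx_prob(40):.4f}")
```

Note this is the chance of exactly 40, not 40 or 40.5. If fractions get dropped, add the two together.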
I'll be happy to elaborate a bit more if necessary.
Since it sounds like he is trying to do something discrete, I'm not sure that's what he would want to do, and maybe he doesn't even need to deal with the normal distribution. A binomial distribution might suffice. Perhaps someone else has a better idea of what he is using this for, to figure out exactly what sort of setup he would want.
What I'm doing is that I have an array of 100 items (labelled 1-100), and my formula spits out a whole number between 1 and 100 (any fractions are just dropped), and then reads that number from the array. I want the numbers around the center of the table (50) to occur much more frequently than the numbers around the outsides of the table (1,100), so I can position common loot in the center and more exotic and rare stuff along the outsides.
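For that setup, the exact chance of hitting each of the 100 slots can be worked out by counting every combination of the four draws and then dropping the fraction. This is a sketch assuming the draws are 0-50 twice and 1-50 twice as in the first post; item_probabilities is a made-up name.

```python
from collections import Counter
from fractions import Fraction

def item_probabilities():
    """Exact chance of landing on each table slot 1..100."""
    sums = Counter({0: 1})
    for lo, hi in ((0, 50), (0, 50), (1, 50), (1, 50)):
        nxt = Counter()
        for s, n in sums.items():
            for face in range(lo, hi + 1):
                nxt[s + face] += n
        sums = nxt
    total = sum(sums.values())
    slots = Counter()
    for s, n in sums.items():
        slots[s // 2] += n          # halve the sum and drop the fraction
    return {slot: Fraction(n, total) for slot, n in sorted(slots.items())}

probs = item_probabilities()
for slot in (1, 25, 50, 75, 100):
    print(slot, float(probs[slot]))
```

One thing this makes visible: dropping fractions means slot 100 only comes up on the single maximum roll, while slot 1 catches both 1 and 1.5, so the two ends of the table are not equally rare.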
I do have access to higher-level statistical software (Intercooled Stata), but the suggestion to just run 10k simulations and use the outcome to guesstimate probabilities for each discrete number worked out well enough for my purposes. Good stuff.