Simple statistics question

CuddlyCuteKitten · August 2013

Hello!

I'm currently doing some quality checks for my clinic and I have serious doubts if the sample sizes where supposed to use are anything more than laughable.
The problem is that even I have some general knowledge of statistics I'm still pretty noob.

Basically I'd like to do a power calculation on sample size for the very simple test I'm doing.

Essentially I'm just checking 20 procedures done 3 years ago and seeing if they are still OK today. (Average expected time for the fillings to last is 11 years if that helps).

20 cases seems reaaaaaaaaaaally low. Unfortunantly dentists doesn't get a ton of stat education in med school here so I can't find the data to show my bosses.

Anyone who could point me in the right direction? I'd like a p value of 0,05-0,10 at least.

Thanks.

SilverEternity · August 2013

How many procedures total were done, ie. what is the population size?

mumbly_pie · August 2013

Hey CCK,

I'll try to point you in a few directions, with the hope that one of them is right (or at least useful). I'll go down a few roads, and talk a little about power analyses when I think they fit in. I'm going to (mostly) limit myself to the null-model-checking mode of doing statistics, even though it isn't really how I like to think about things.

1. The "normal" first thing to do in any stats problem is say what question you're trying to answer. You say that you're looking at 20 procedures done 3 years ago, and counting how many of them are still ok. Presumably, you would like to infer from this information something about how long other procedures will last. For example, you might want to ask "is it likely that my procedures are lasting, on average, at least 11 years?" That might be a fairly hard question to answer with much confidence. An easier question might be "what are the chances that more than 10% of procedures last less than 3 years?" To be really helpful, I think we need some information about this.

2. The "normal" next step is to model the problem. The model you choose depends on what you think about the type of data you have, as well as the question you're answering. If you're looking at the second question above, you can model each procedure as independent, with some probability $p$ of having failed after 3 years. The data you have (whether something has failed after 3 years) tells you everything you need to know about $p$ in this model, and nothing else about the details of the failures really matter. If you're interested in the first question, about failures after 11 years, you need more information about what's going on. For example, you might model each procedure as having a failure time that has Poisson distribution with unknown parameters. In this case, knowing about the distribution of failures after 3 years lets you estimate the distribution of failures after 11 years. Note that the previous model, with iid random variables describing failures after three years, didn't let us say anything about what happens after 11 years. This is why I called the first question "harder" than the second - the answer depends much more on the model, and with a small amount of data, this dependency can be hard to understand sensibly.

3. Next, we do some calculations! This is where power calculations might come in. Lets look again at the second question from part 1 again. Say you're pretty sure that fewer than 5% of procedures last for under 3 years, and you want to show with p<0.1 that less than 10% of procedures last under three years. Of course, there's some chance that you'll get an "unlucky sample" even if your guess is right, so you need to specify some acceptable chance that your experiment won't work, even if your guess is right - say 20%. You would need to look at some number $N$ of procedures. How big should $N$ be? This is the question that power calculations, eg. on wikipedia's page on this subject, answer. You can plug the above numbers into that page and get a sensible answer.
However, its worth pointing out that there isn't a webpage for how to do this type of power calculation for the first question. So, how do you figure out what N should be in that case? There are a lot of answers, so I'll just give my favorite answer. I think its also generally a better answer than looking at power formulae, even when they do exist, but it requires a little bit of familiarity with a computer. You simulate your whole study, and see what is likely! For example, lets say you want to look at $N=50$ procedures, and you think that each one has a 5% chance of failure each year. Just type that into a computer (in the R programming language, this is the single line data = rbinom(50,0.05) ) to get simulated data. Then, run your statistical analysis of the resulting fake data, exactly as if it were real data (in R, if you really like simulations, you might write:
CanIReject<-function(data) {
check = 0
for( i in 1:1000) {
check += (rbinom(50, mean(data)) > 50*0.1)
}
check/1000
}

The result of CanIReject(data) is an approximation of the p-value associated with your check that more than 10% of procedures fail after ten years, given the model and your simulated data.)

To do a power test, then, you just run your CanIReject test many times. (Again, in R:

PowerEstimate<-function(DesiredP) {
check = 0
for(i in 1:1000) {
data = rbinom(50,0.05)
check += (CanIReject(data) > DesiredP)
}
check/1000
}

The result is an estimate of the percentage of the time that the experiment you run will fail to reject at significant DesiredP, if all of your other assumptions are right.
)

As always, no guarantees that code written in 5 minutes in a browser and never run will be correct - this was just meant to be an outline.

I hope some of this is helpful, if only in helping you decide what you really want to do!

Best,
MP

Fuzzy Cumulonimbus Cloud · August 2013

You don't need to do a power test a priori since you already have an n=20.
You need to do a repeated measures ANOVA if your data is normally distributed or a kruskahl-wallis ranked test if your data is abnormally distributed.

Fuzzy Cumulonimbus Cloud · August 2013

Basically, whatever test you ultimately use will tell you whether or not you have enough statistical power to say anything about your data set.

MrTLicious · August 2013

A quick note about power calculations: They only apply to the exact setting where you do them, in the same way that the results of a Monte Carlo experiment only apply to the exact parameters tested. So, in mumbly_pie's example, that calculation only applies if the true rate of failure is 5%. Obviously, other similar rates of failure will result in similar calculated powers, but the premise behind them is that your guess of the failure rate is close to the true one.

The other issue you are going to run into is that, if your failure rates at 3 years are very low, there's an extremely good chance that you're going to get 0 failures out of 20. For example, if the failure rate is 5%, then there's a 36% chance that you'll get 0 failures. It's going to be really hard to say anything about really low probabilities, and if you do end up getting 0 failures, you won't be able to say anything (except that the failure rate is likely below 15% or something). That said, if you get 3 failures, you'd be able to reject the possibility that the failure rate is below 5% at 10% significance.

This means that all of your power is going to come from the upside (in terms of failures).

It's hard to be more specific without knowing exactly what you're trying to ask/answer.

If you're looking for a confidence interval on the reliability of these procedures after 3 years, that's pretty easy. You just need to find the probabilities that make a binomial cdf 5% and 95% with 20 trials and however many failures you had (this is easily done in Excel or various free programming tools that I can help with if you aren't familiar with that distribution/functions).[/quote]

In terms of "statistics" in the way it's normally thought of (i.e. central limit theorem based stats), 20 is really low, especially given that your outcomes are binary (I'm actually not sure that's right - I'm assuming based on your description). That said, the fact that it's binary also means you can model it without relying on the CLT.

Penny Arcade

Quick Links

Simple statistics question

Posts