Hey, I'm trying to use the Mann-Whitney U test to analyse two sets of numbers. However, even though the sets are of equal length, I'm getting a different U value depending on which set of numbers comes first, which, according to my notes, should not happen.
Can anyone help me understand what is happening here? I don't know if my notes are wrong or what, because I have used the Vassar online tool and it still gives a different U number.
The size of the two sets should not matter, because you're testing to see if the two samples you have are from the same population. The number of observations does not matter.
I have no experience with Vassar (SPSS baaaay-beeeh!) but for sanity's sake it is best to start off with the population that appears to have the smallest ranks. How big is the difference between the two U values anyway? Differences can happen depending on which population you start with. Most sane people start with the population that appears to have the lowest ranks, although it doesn't matter when you're not doing it by hand.
The two U values are 56 and 200, so there's quite a difference.
I've worked it out one way by hand and got 56, so I guess I'll try it the other way and just check if I get 200.
I'm not particularly fussed about why this is happening, I just have a 2000 word report to hand in tomorrow and it would be going a lot smoother if these numbers matched up...
If I can't figure out what's up I'll probably just blow past it anyway and take a few lost marks, as it's not really the main point of the essay.
Oh wait, after reading through the wiki some more, I think this example should be of interest to you:
Suppose that Aesop is dissatisfied with his classic experiment in which one tortoise was found to beat one hare in a race, and decides to carry out a significance test to discover whether the results could be extended to tortoises and hares in general. He collects a sample of 6 tortoises and 6 hares, and makes them all run his race. The order in which they reach the finishing post (their rank order) is as follows, writing T for a tortoise and H for a hare:
T H H H H H T T T T T H
What is the value of U?
* Using the direct method, we take each tortoise in turn, and count the number of hares it beats, getting 6, 1, 1, 1, 1, 1. So U = 6 + 1 + 1 + 1 + 1 + 1 = 11. Alternatively, we could take each hare in turn, and count the number of tortoises it beats. In this case, we get 5, 5, 5, 5, 5, 0, which means U = 25. Note that the sum of these two values for "U" is 36, which is 6 × 6.
[...]
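If you want to check the direct method from that example without counting by hand, here is a minimal Python sketch. The function name `direct_u` is just something I made up for illustration, and it assumes there are no ties in the finishing order.

```python
def direct_u(order, group):
    """Direct method: for each member of `group`, count how many members
    of the other group finish after it, and add those counts up."""
    total = 0
    for i, label in enumerate(order):
        if label == group:
            # everyone from the other group placed later in the order was beaten
            total += sum(1 for other in order[i + 1:] if other != group)
    return total

# Finishing order from the quoted example: T = tortoise, H = hare
race = "T H H H H H T T T T T H".split()

u_tortoise = direct_u(race, "T")
u_hare = direct_u(race, "H")
print(u_tortoise, u_hare, u_tortoise + u_hare)  # 11 25 36
```

The two U values differ, but they always add up to the product of the two sample sizes (6 × 6 = 36 here), so once you have one you can get the other.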
The U value tells you how often observations from one set rank higher than those from the other set. So in the Hare/Tortoise example the Hares have the higher score, so they are *usually* faster. In your test the set with the 200 value is scoring higher than the set with the 56 value. Can you cross-check to see if this makes sense?
Also: if both sets have more than 30 observations you could probably just do a t-test and be done with it; provided its assumptions roughly hold, it is usually the more powerful test.
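One quick cross-check on the numbers themselves: the two U values should add up to the product of the two sample sizes, just like 11 + 25 = 6 × 6 in the example. A small sketch, assuming the two sets really are the same size as stated in the first post (the actual sample size isn't given in the thread, so the 16 below is only what the arithmetic implies):

```python
import math

u_a, u_b = 56, 200           # the two U values reported above
total = u_a + u_b            # should equal n1 * n2

# With equal group sizes n1 = n2 = n, the total must be a perfect square.
n = math.isqrt(total)
print(total, n, n * n == total)  # 256 16 True
```

So 56 and 200 are consistent with each other if each set has 16 observations; if your sets are a different size, one of the two values has gone wrong somewhere.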
Right, so my 200 score is for the set which you can tell, just by glancing at it, is clearly superior. In my case that was the experimental group, which suggests that they clearly benefited from the alteration they received compared with the control group.
I think I have this now...
That sounds like a reasonable conclusion. Good luck and glad I could help. Now excuse me while I continue forgetting more details about statistics and start memorizing where I've put my textbook.
I learned Mann-Whitney in a slightly different way than it seems most online places present the U test, but the theory is the same: the U values for the two sets are generally not going to be equal, because they're used to query different hypotheses (they only coincide when both are exactly half of n1 × n2). If you're doing a one-sided test of whether experimental is greater than control, then your significance test is testing P(Uexperimental >= 200), which is equivalent to testing the converse, P(Ucontrol <= 56).
You can do the test with either number, and if it's a two-sided test it doesn't really matter, but the reason you do something like "always take the lower value for a two-sided test" is that the significance test you want is P(Ueither <= smaller value); if you used the larger value you'd need to test P(U >= larger value) instead.
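To see that the two tail probabilities really are the same thing, here is a rough sketch using the tortoise/hare numbers from earlier in the thread (6 vs 6, U = 11 and 25), since brute force is cheap at that size. It assumes no ties and enumerates every possible finishing arrangement under the null hypothesis; `u_from_positions` is a helper name I made up for this.

```python
from itertools import combinations

def u_from_positions(positions, n_total):
    """U for the group occupying `positions` (0 = first place):
    number of (member, non-member) pairs where the member finishes first."""
    pos = set(positions)
    return sum(1 for p in pos for q in range(n_total) if q not in pos and q > p)

n1, n2 = 6, 6
n = n1 + n2

# Under the null hypothesis every way of handing 6 of the 12 finishing
# places to the tortoises is equally likely, so enumerate all of them.
pairs = []
for tortoise_places in combinations(range(n), n1):
    hare_places = [q for q in range(n) if q not in tortoise_places]
    pairs.append((u_from_positions(tortoise_places, n),
                  u_from_positions(hare_places, n)))

# One-sided tail computed from the smaller U, and from the larger U
p_low = sum(u_t <= 11 for u_t, _ in pairs) / len(pairs)    # P(U_tortoise <= 11)
p_high = sum(u_h >= 25 for _, u_h in pairs) / len(pairs)   # P(U_hare >= 25)

print(p_low, p_high, p_low == p_high)
```

The two probabilities come out identical because in every arrangement the two U values add up to n1 × n2, so "tortoise U is at most 11" and "hare U is at least 25" are the same event. For sets as big as the ones in the original question you would use tables or software instead of enumerating.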
If you don't have a textbook at hand you might find the wiki entry useful http://en.wikipedia.org/wiki/Mann-Whitney_U