[personal profile] chhotii
So I was just reading about statistics, and came across the admonishment that if the scale of measurement of your data isn't really a linear numeric scale-- a rating scale, for example-- you shouldn't use the t test, but have to use the Mann-Whitney U test or some such thing.

Then I went to the talk. The experiment involved having the subjects fill out a lot of rating scales ("On a scale of 1 to 5, how anxious do you feel?"). The results were presented as a bunch of bar graphs with error bars, ** on the very significantly different bars, and p values festooned all over the place. I had the sinking feeling that this fellow probably used the t test everywhere. Ideally I would have asked, "What test did you use to get these p values-- t test or Mann-Whitney or what?" But I didn't want to look like a smart-ass, and we are trying to be diplomatic with this fellow, apparently. So that was kind of irritating.

Those of you who really know stats must run into this kind of irritation all the time.

statistical smart-assery

Date: 2007-06-26 07:49 pm (UTC)
From: [identity profile] happyfunpaul.livejournal.com
Personally, I feel that you could make a good argument either way. The statisticians will indeed say that a rating scale is merely an ordinal scale (1 is smaller than 2, 2 is smaller than 3) but not an interval scale (you can't say that the distance from 1-to-2 is the same as the distance from 2-to-3). And I see their point. But I also think that if you present the scale and the task to the subject in the proper way, it's reasonable to treat the data as being on an interval scale. I used both in different studies in grad school.

Though, to be honest, I can't recall which way I did it on my Ph.D. thesis. :-)

Actually, I think the only time I really got cranky on the topic was when someone from a customer survey firm was scoffing at a survey redesign that ratatosk and I came up with at Akamai. I was much more concerned with (1) sampling bias and (2) violation of Gricean conversational rules in the questions (just because a question's wording is "standard practice" doesn't mean you're getting useful results from it).

Date: 2007-06-26 09:17 pm (UTC)
From: [identity profile] quietann.livejournal.com
Basically I agree with Paul. No, it's not 100% correct, but for all practical purposes it should not make much of a difference, so long as the ratings are roughly normally distributed. (Counterexample: if you have a 5-point scale and 80% of your people rate it 5, that would not be a place where a t-test should be used. What I would do is turn it into a binary variable, rated 5 versus not rated 5, and then use a chi-square test.)
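For anyone who wants to see the "rated 5 versus not rated 5" idea spelled out, here is a minimal sketch using only the Python standard library. The ratings are made up for illustration, the helper names are hypothetical, and it computes the plain Pearson chi-square for a 2x2 table with no continuity correction, so treat it as a rough illustration rather than a recommended analysis.

```python
import math

def top_box_table(group_a, group_b, top=5):
    """Collapse ratings into 'rated top' vs 'not rated top' counts per group."""
    a_top = sum(1 for r in group_a if r == top)
    b_top = sum(1 for r in group_b if r == top)
    return a_top, len(group_a) - a_top, b_top, len(group_b) - b_top

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Made-up ratings: most people in both groups answer 5.
group_a = [5] * 40 + [4] * 6 + [3] * 4
group_b = [5] * 30 + [4] * 12 + [3] * 8

table = top_box_table(group_a, group_b)
chi2, p_value = chi_square_2x2(*table)
print(table, round(chi2, 3), round(p_value, 4))
```

With small expected counts in any cell, Fisher's exact test would be the more usual choice than this approximation.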

Also, multiple-item tests tend to "smooth out" such that the distribution of scores is more like a bell-curve.
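A quick simulation can illustrate the smoothing. This is only a sketch under strong assumptions: the response probabilities are invented, and the ten items are drawn independently, which real questionnaire items usually are not. It compares the skewness of a single heavily lopsided item with the skewness of a ten-item total.

```python
import random
import statistics

def skewness(xs):
    """Population skewness: third central moment divided by sd cubed."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - mean) ** 3 for x in xs) / (len(xs) * sd ** 3)

def draw_item(rng):
    """One 1-to-5 rating where 80% of answers are 5 (invented weights)."""
    return rng.choices([1, 2, 3, 4, 5], weights=[2, 3, 5, 10, 80])[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_subjects = 5000
n_items = 10

single_item = [draw_item(rng) for _ in range(n_subjects)]
totals = [sum(draw_item(rng) for _ in range(n_items)) for _ in range(n_subjects)]

skew_single = skewness(single_item)
skew_total = skewness(totals)
print("single item skewness:", round(skew_single, 2))
print("ten-item total skewness:", round(skew_total, 2))
```

The total is still skewed, just much less so than any one item, which is the sense in which the distribution gets closer to a bell curve.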

Mann-Whitney U and other non-parametric tests (such as chi-square, Fisher's exact test, Kruskal-Wallis ANOVA etc.) are advantageous because they do not assume that the population data have any particular underlying distribution; that is what "non-parametric" means. But if one has, say, data that actually are normally distributed, they are less powerful than a test that assumes an underlying normal distribution.
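To show what Mann-Whitney U actually computes on rating data, here is a minimal standard-library sketch. The function names and the ratings are hypothetical. It ranks the pooled data (tied values share the average rank) and gets a two-sided p value from the normal approximation with a tie correction and no continuity correction, so the p value is only rough for samples this small.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p value from the normal approximation)."""
    n1, n2 = len(a), len(b)
    pooled = list(a) + list(b)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction: each group of t tied values shrinks the variance.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # every value identical, so no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))

# Made-up anxiety ratings on a 1-to-5 scale for two groups.
control = [2, 3, 3, 4, 2, 3, 1, 3, 2, 4]
treated = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]

u_stat, p_value = mann_whitney_u(control, treated)
print(u_stat, round(p_value, 4))
```

Only the ordering of the ratings enters the calculation, which is why the test does not care whether the distance from 1 to 2 equals the distance from 2 to 3.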

Wikipedia is pretty good on this topic: http://en.wikipedia.org/wiki/Non-parametric_statistics
