Wednesday, November 28, 2007

Practical Issues in Power Analysis

Below, I've added a new chart, based on things we discussed in class. William Trochim's Research Methods Knowledge Base, in discussing statistical power, sample size, effect size, and significance level, notes that, "Given values for any three of these components, it is possible to compute the value of the fourth." The chart I've created attempts to convey this fact in graphical form.


You'll notice the (*) notation by "S, M, L" in the chart. Those, of course, stand for small, medium, and large effect sizes. As we discussed in class, Jacob Cohen developed criteria for what magnitude of result constitutes small, medium, and large, both for correlational studies and for studies comparing the means of two groups (t-test-type studies, although t itself is not an indicator of effect size).

When planning a new study, naturally you cannot know what your effect size will be ahead of time. However, based on your reading of the research literature in your area of study, you should be able to get an idea of whether findings have tended to be small, medium, or large, which you can convert to the relevant values for r or Cohen's d. These, in turn, can be submitted to power-analysis computer programs and online calculators.

I try to err on the side of expecting a small effect size. This requires me to obtain a large sample in order to detect a small effect, which seems like good practice anyway.
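For readers who want to try this themselves, here's a minimal sketch of an a priori power analysis in Python, using the statsmodels package (the d values are simply Cohen's conventional benchmarks, not estimates from any particular literature). The solve_power call will solve for whichever of the four components (sample size, effect size, alpha, or power) you leave out, which is Trochim's point in code form.

```python
# Minimal a priori power-analysis sketch for a two-group (t-test type) design.
# Cohen's conventional benchmarks: d of roughly .2/.5/.8 (for r, roughly .10/.30/.50).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Given effect size (Cohen's d), alpha, and desired power, solve for n per group.
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    n_per_group = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(f"{label:6s} effect (d = {d}): about {n_per_group:.0f} participants per group")
```

With alpha = .05 (two-tailed) and power = .80, a small d of .2 calls for roughly 400 participants per group, whereas a large d of .8 needs only about 26, which is why planning for a small effect pushes you toward a large sample.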

UPDATE: Westfall and Henning (2013) argue that post hoc power analysis, which is what the pink column depicts in the above table, is "useless and counterproductive" (p. 508).

Tuesday, November 27, 2007

Illustration of a "Miss" in Hypothesis Testing (and Relevance for Power Analysis)

My previous blog notes on this topic are pretty extensive, so I'll just add a few more pieces of information (including a couple of songs).

As we've discussed, the conceptual framework underlying statistical power involves two different kinds of errors: rejecting the null (thus claiming a significant result) when the null hypothesis is really true in the population (a "false alarm," or Type I error); and failing to reject the null when the true population correlation (rho) is actually different from zero (a "miss," or Type II error). The latter is illustrated in the simulation sketch and the figure below:
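First, a rough Python simulation of a "miss" (the rho of .30, the sample size of 20, and the number of replications are arbitrary values I've chosen for illustration, not figures from class):

```python
# Rough simulation of a "miss" (Type II error): the true population correlation
# is nonzero, yet with a small sample the test often fails to reject the null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho, n, alpha, n_sims = 0.30, 20, 0.05, 5000
cov = [[1.0, rho], [rho, 1.0]]

misses = 0
for _ in range(n_sims):
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    r, p = stats.pearsonr(x, y)
    if p >= alpha:          # failed to reject H0, even though rho != 0
        misses += 1

print(f"Miss rate: {misses / n_sims:.2f} (empirical power: {1 - misses / n_sims:.2f})")
```

With these particular settings, the power of the correlation test is only around .25, so roughly three-quarters of the simulated studies should fail to reach significance even though the null is false in the population. That is exactly the kind of "miss" a power analysis is meant to guard against.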



And here are my two new power-related songs...

Everything’s Coming Up Asterisks
Lyrics by Alan Reifman (updated 11/18/2014)
(May be sung to the tune of “Everything’s Coming Up Roses,” from Gypsy, Styne/Sondheim)

We've got a scheme, to find p-values, baby.
Something you can use, baby.
But, is it a ruse? Maybe...

State the null! (SLOWLY),
Run the test!
See if H-oh should be, put to rest,

If p’s less, than oh-five,
Then H-oh cannot be, kept alive (SLOWLY),

With small n,
There’s a catch,
There could be findings, you will not snatch,

There’s a chance, you could miss,
Rejecting the null hypothesis… (SLOWLY),

(Bridge)
H-oh testing, how it’s always been done,
Some resisting, will anyone be desisting?

With large n,
You will find,
A problem, of the opposite kind,

Nearly all, you present,
Will be sig-nif-i-cant,

You must start to look more at effect size (SLOWLY),
’Cause, everything’s coming up asterisks, oh-one and oh-five! (SLOWLY)

Believe it or not, the above song has been cited in the social-scientific literature:




Find the Power
Lyrics by Alan Reifman
(May be sung or rapped to the tune of “Fight the Power,” Chuck D/Sadler/Shocklee/Shocklee, for Public Enemy)

People do their studies, without consideration,
If H-oh, can receive obliteration,
Is your sample large enough?
To find interesting stuff,

P-level and tails, for when H-oh fails,
Point-eight-oh’s the way to go,
That’s the kind of power,
That you’ve got to show,

Look into your mind,
For the effect, you think you’ll find,

Got to put this all together,
Got to build your study right,
Got to give yourself enough, statistical might,

You’ve got to get the sample you need,
Got to learn the way,
Find the power!

Find the power!
Get the sample you need!

Find the power!
Get the sample you need!

Monday, November 12, 2007

Non-Parametric/Assumption-Free Statistics

(Updated November 11, 2013)

This week, we'll be covering non-parametric (or assumption-free) statistical tests (brief overview). Parametric techniques, which include the correlation r and the t-test, refer to the use of sample statistics to estimate population parameters (e.g., rho, mu).* Thus far, we've come across a number of assumptions that technically are required to be met for doing parametric analyses, although in practice there's some leeway in meeting the assumptions.

Assumptions for parametric analyses are as follows (for further information, see here):

o Data for a given variable are normally distributed in the population.

o The variable is measured on an equal-interval scale.

o Cases are randomly sampled from the population.

o Variances are homogeneous across groups (for the t-test).

One would generally opt for a non-parametric test when there's violation of one or more of the above assumptions and sample size is small. According to King, Rosopa, and Minium (2011), "...the problem of violation of assumptions is of great concern when sample size is small (< 25)" (p. 382).

In other words, if assumptions are violated but sample size is large, you still may be able to use parametric techniques. We'll be doing a neat demonstration of how large samples can salvage analyses when the data violate the normal-distribution assumption, thanks to what is known as the Central Limit Theorem (see further examples from this website).
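Here's a small sketch of the kind of simulation I have in mind, written in Python (the exponential population and the particular sample sizes are arbitrary choices of mine, not the exact demonstration from class or from the linked website):

```python
# Central Limit Theorem sketch: sample means from a skewed population
# become approximately normal as the sample size grows.
import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=1.0, size=100_000)  # heavily right-skewed

for n in (2, 5, 30, 100):
    # Draw many samples of size n and keep each sample's mean.
    sample_means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:3d}: mean of sample means = {sample_means.mean():.3f}, "
          f"SD of sample means = {sample_means.std():.3f} "
          f"(theory: {population.std() / np.sqrt(n):.3f})")
```

Even though the exponential population is strongly skewed, a histogram of the sample means looks more and more like a normal curve as n grows, and the standard deviation of the sample means shrinks toward the population SD divided by the square root of n.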

To a large extent, the different non-parametric techniques represent analogues to parametric techniques. For example, the non-parametric Mann-Whitney U test is analogous to the parametric t-test when comparing data from two independent groups, and the non-parametric Wilcoxon signed-ranks test is analogous to a repeated-measures t-test. This PowerPoint slideshow demonstrates these two non-parametric techniques (there's also a brief code sketch after the song below). I've even written a song for the occasion...

Mann-Whitney U
Lyrics by Alan Reifman
(May be sung to the tune of “Suzie Q.,” Hawkins/Lewis, covered by John Fogerty)

Mann-Whitney U,
When your groups are two,
If your scaling’s suspect, and your cases are few,
Mann-Whitney U,

The cases are laid out,
Converted to rank scores,
You then add these up, done within each group,
Mann-Whitney U,

(Instrumental)

There is a formula,
That uses the summed ranks,
A distribution’s what you, compare the answer to,
Mann-Whitney U
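For anyone who wants to run these two tests in software, here's a minimal Python sketch using scipy's implementations (the scores are made up purely for illustration):

```python
# Minimal sketch of the two non-parametric tests discussed above, using scipy.
from scipy import stats

# Two independent groups (non-parametric analogue of the independent-samples t-test).
group_a = [3, 5, 4, 6, 2, 7, 5]
group_b = [8, 6, 9, 7, 10, 6, 8]
u_stat, u_p = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.3f}")

# Paired scores (non-parametric analogue of the repeated-measures t-test).
before = [10, 12, 9, 15, 11, 13, 14]
after = [11, 14, 12, 19, 16, 19, 21]
w_stat, w_p = stats.wilcoxon(before, after)
print(f"Wilcoxon signed-ranks W = {w_stat:.1f}, p = {w_p:.3f}")
```

Both functions rank the scores behind the scenes and compare the resulting statistic to the appropriate reference distribution, much as the song describes for the Mann-Whitney U.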

UPDATE 11/10/12: Here are some notes I wrote on the board regarding the Central Limit Theorem (thanks to HB for photo)...


---
*According to King, Rosopa, and Minium (2011), "Many people call chi-square a nonparametric test, but it does in fact assume the central limit theorem..." (p. 382).