Monday, November 12, 2007

Non-Parametric/Assumption-Free Statistics

(Updated November 11, 2013)

This week, we'll be covering non-parametric (or assumption-free) statistical tests (brief overview). Parametric techniques, which include the correlation r and the t-test, use sample statistics to estimate population parameters (e.g., rho, mu).* Thus far, we've come across a number of assumptions that technically must be met for parametric analyses, although in practice there's some leeway in meeting them.

Assumptions for parametric analyses are as follows (for further information, see here):

o Data for a given variable are normally distributed in the population.

o Variables are measured on an equal-interval scale.

o Random sampling is used.

o Homogeneity of variance between groups (for t-test).

One would generally opt for a non-parametric test when one or more of the above assumptions is violated and the sample size is small. According to King, Rosopa, and Minium (2011), "...the problem of violation of assumptions is of great concern when sample size is small (< 25)" (p. 382).

In other words, if assumptions are violated but sample size is large, you still may be able to use parametric techniques. We'll be doing a neat demonstration that conveys the role of large samples in rescuing analyses from violations of the normal-distribution assumption, under what is known as the Central Limit Theorem: as sample size grows, the distribution of sample means approaches a normal shape, even when the individual scores are not normally distributed (see further examples from this website).
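For anyone who wants to try the idea outside of class, here is a minimal sketch of a Central Limit Theorem simulation. It is not the in-class demonstration. The skewed (exponential) population, the sample sizes of 2 and 50, the number of replications, and the function names are all assumptions chosen just for illustration.

```python
import random
import statistics


def skewness(values):
    """Skewness as the average cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def sample_means(n, reps=5000, seed=0):
    """Draw `reps` samples of size n from a skewed population; return each sample's mean.

    Assumption for illustration: the population is exponential with mean 1,
    which is strongly right-skewed (clearly not normal).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# Compare the shape of the distribution of sample means for small vs. large n.
means_small = sample_means(n=2)
means_large = sample_means(n=50)
skew_small = skewness(means_small)
skew_large = skewness(means_large)

print(f"Skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"Skewness of sample means, n = 50: {skew_large:.2f}")
```

With the larger sample size, the skewness of the sample means should come out much closer to zero, which is the sense in which large samples offer some protection when the raw scores are not normal.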

To a large extent, the different non-parametric techniques represent analogues to parametric techniques. For example, the non-parametric Mann-Whitney U test is analogous to the parametric independent-samples t-test for comparing two independent groups, and the non-parametric Wilcoxon signed-ranks test is analogous to a repeated-measures t-test. This PowerPoint slideshow demonstrates these two non-parametric techniques. I've even written a song for the occasion...

Mann-Whitney U
Lyrics by Alan Reifman
(May be sung to the tune of “Suzie Q.,” Hawkins/Lewis, covered by John Fogerty)

Mann-Whitney U,
When your groups are two,
If your scaling’s suspect, and your cases are few,
Mann-Whitney U,

The cases are laid out,
Converted to rank scores,
You then add these up, done within each group,
Mann-Whitney U,

(Instrumental)

There is a formula,
That uses the summed ranks,
A distribution’s what you, compare the answer to,
Mann-Whitney U
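
The steps in the lyrics (convert the pooled cases to ranks, add up the ranks within each group, then apply a formula) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure from the slideshow: the function names and the made-up scores are assumptions, tied scores are given the average of their ranks, and the smaller of the two U values is the one returned (textbooks differ on which U they report). The resulting U would then be compared to a table of critical values, which is not included here.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values that starts at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (rank sum for A, rank sum for B, smaller U) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))  # rank the pooled cases
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    # Formula that uses the summed ranks; u_a + u_b always equals n_a * n_b.
    u_a = n_a * n_b + n_a * (n_a + 1) / 2 - rank_sum_a
    u_b = n_a * n_b + n_b * (n_b + 1) / 2 - rank_sum_b
    return rank_sum_a, rank_sum_b, min(u_a, u_b)


# Hypothetical scores, made up purely for illustration.
group_a = [12, 15, 18, 24]
group_b = [10, 11, 14, 16, 30]
rank_sum_a, rank_sum_b, u = mann_whitney_u(group_a, group_b)
print(rank_sum_a, rank_sum_b, u)  # 23.0 22.0 7.0
```

In this made-up example, the pooled ranks run from 1 to 9, group A's ranks (3, 5, 7, 8) sum to 23, group B's (1, 2, 4, 6, 9) sum to 22, and the smaller U is 7.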

UPDATE 11/10/12: Here are some notes I wrote on the board regarding the Central Limit Theorem (thanks to HB for photo)...


---
*According to King, Rosopa, and Minium (2011), "Many people call chi-square a nonparametric test, but it does in fact assume the central limit theorem..." (p. 382).