Wednesday, October 22, 2008

t-Test Overview

(Updated October 11, 2025)

For the next week or so, we will be covering t-tests, which compare the means of two groups. As we'll discuss, there are two ways to design studies for a t-test:

INDEPENDENT SAMPLES, where a participant in one group (e.g., Trump voters in the 2024 election) cannot be in the other group (Harris voters). The technical term is that the groups are "mutually exclusive." The Trump and Harris voters could be compared, for example, on their average income.

PAIRED/CORRELATED GROUPS, where the same (or matched) person(s) can serve in both groups. For example, the same participant could be asked to complete math problems both during a period where loud hard-rock music is played and during a period where quiet, soothing music is played. Or, if you were comparing men and women on some attitude measure and your participants were heterosexual married couples, that would be considered a correlated design.

The Naked Statistics book briefly discusses the formula for an independent-samples t-test on pp. 164-165. Here's a simplified graphic I found on the web (original source):

Notice from the "Xbar1 - Xbar2" portion that the t statistic is gauging the amount of difference between the two means, in the context of the respective groups' standard deviations (s) and sample sizes (n). Your obtained t value will be compared to the t distribution (which is similar to the normal z distribution) to see if the t value is extreme enough to be unlikely to stem from chance. You will also need to take account of "degrees of freedom" (df), which for an independent-samples t-test are closely based on total sample size (in the standard equal-variances version of the test, df = n1 + n2 - 2, i.e., the total sample size minus 2).
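For anyone who would like to see the arithmetic spelled out, here is a minimal Python sketch of the calculation. I'm assuming the standard equal-variances (pooled) version of the formula, which may be arranged a bit differently from the graphic. The function name independent_t and the scores are made up purely for illustration.

```python
import math
import statistics

def independent_t(group1, group2):
    """Return (t, df) for an independent-samples t-test.

    Assumption: the classic equal-variances (pooled) version of the test.
    """
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance gives the sample variance (s squared, n - 1 denominator).
    var1, var2 = statistics.variance(group1), statistics.variance(group2)

    df = n1 + n2 - 2
    # Pooled variance: the two groups' variances, weighted by their df.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df

# Hypothetical scores for two mutually exclusive groups.
group_a = [2, 4, 6, 8]
group_b = [1, 3, 5, 7]

t, df = independent_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df}")
```

The numerator is the "Xbar1 - Xbar2" difference; the denominator shrinks (making t larger) when the standard deviations are small or the sample sizes are large.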

There's an online graphic that visually illustrates the difference between z (normal) and t distributions (link). As noted on this page from Columbia University, "tails of the t-distribution are thicker and extend out further than those of the Z distribution. This indicates that for a given confidence level, t-scores [needed for significance] are larger than Z scores." (Remember our earlier term for distributions' tendency to have thick tails and generate outliers... kurtosis.)

More technically, as Westfall and Henning (2013) point out, "Compared to the standard normal distribution, the t-distribution has the same median (0.0) but with variance df/(df-2), which is larger than the standard normal's variance of 1.0" (p. 423). Remember that the variance is just the standard deviation squared.
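To get a feel for the df/(df-2) formula in the quote, here is a small Python sketch that plugs in a few df values. The function name t_variance is my own, and note that the formula only applies when df is greater than 2.

```python
def t_variance(df):
    """Variance of the t distribution, using the df / (df - 2) formula."""
    if df <= 2:
        raise ValueError("the df / (df - 2) formula only applies when df > 2")
    return df / (df - 2)

for df in (3, 5, 10, 30, 100):
    variance = t_variance(df)
    sd = variance ** 0.5  # the standard deviation is the square root of the variance
    print(f"df = {df:>3}: variance = {variance:.3f}, SD = {sd:.3f}")
```

As df grows, the variance shrinks toward the standard normal's 1.0, which is another way of saying the t distribution's thick tails matter most in small samples.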

This table shows the values your obtained t statistic needs to exceed (known as "critical values") for statistical significance, depending on your df and target significance level (typically p < .05, two-tailed).
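If you're curious where the numbers in such a table come from, the following is a rough Python sketch that approximates two-tailed critical values using only the standard library. The helper names (t_pdf, t_cdf, t_critical) are hypothetical, and the numerical integration (Simpson's rule) and bisection search are just one simple way to do it; statistical software uses more refined routines.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density (curve height) of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(df, alpha=0.05):
    """Two-tailed critical value: the t that leaves alpha / 2 in each tail."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1000.0  # assumes the critical value lies below 1000
    for _ in range(60):   # bisection: halve the search interval each pass
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (5, 10, 30, 120):
    print(f"df = {df:>3}: critical t = {t_critical(df):.3f}")

# The corresponding z value, for comparison.
print(f"normal (z): critical z = {NormalDist().inv_cdf(0.975):.3f}")
```

The critical t should come out larger than the familiar z of about 1.96 at every df, with the gap narrowing as df increases (roughly 2.23 at df = 10 versus 2.04 at df = 30).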

This website provides a nice overview of one- and two-tailed tests. One-tailed tests are appropriate when there is a directional hypothesis (e.g., among students with no prior calculus instruction, those who receive calculus instruction during a summer workshop will score higher, on average, on a calculus post-test than will students who did not receive a summer calculus workshop, with the opposite prediction making no sense). Even though one-tailed tests seem to be the best choice in some situations, two-tailed tests are nearly always used, presumably because they are more conservative (i.e., harder to obtain significance with). This 2024 article argues for greater use of one-tailed tests.
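To see concretely why two-tailed tests are the more conservative choice, here is a small Python sketch comparing the one- and two-tailed p-values for the same t. The helper names and the example values (t = 2.0 with df = 10) are made up for illustration, and the p-values are approximated by simple numerical integration rather than taken from statistical software.

```python
import math

def t_pdf(x, df):
    """Density (curve height) of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def p_values(t, df):
    """Return (one_tailed, two_tailed) p-values for an obtained t.

    The one-tailed value assumes the hypothesis predicted a positive t
    (e.g., workshop group minus no-workshop group).
    """
    one_tailed = 1 - t_cdf(t, df)             # area in the upper tail only
    two_tailed = 2 * (1 - t_cdf(abs(t), df))  # area in both tails
    return one_tailed, two_tailed

one, two = p_values(2.0, 10)
print(f"one-tailed p = {one:.4f}")
print(f"two-tailed p = {two:.4f}")
```

With these particular numbers, the one-tailed p falls below .05 while the two-tailed p (exactly twice as large) does not, so the same data would be "significant" under one approach but not the other.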

I have created a little tutorial on how to interpret SPSS output for independent-samples t-tests.

Later, we will take up the paired/correlated/dependent samples t-test at this link.

To end this part of the lesson, let's have a song! I have not written any lyrics for t-tests but, fortunately, Dr. Jeff Witmer of Oberlin College did in 2005 ("Use a t," which may be sung to the tune of "Let It Be," Lennon-McCartney). I attended the first-ever U.S. Conference on Teaching Statistics (USCOTS) in 2005 at The Ohio State University. When I walked into the opening reception of this meeting, Jeff was on stage singing "Use a t." It was my first exposure to statistical lyrics, which inspired me to write a few of my own over the years.

The Consortium for the Advancement of Undergraduate Statistics Education (CAUSE), which sponsors the USCOTS meetings, several years ago commissioned musicians to record many statistical songs written by CAUSE/USCOTS participants, including "Use a t." Let's have a class sing-along! (click here and then search on Witmer). The references in the song to William Gosset and "Student" are explained in this article. When I saw Dr. Witmer at this past summer's (2025) 20th anniversary USCOTS meeting at Iowa State University, I absolutely had to ask for a selfie with him! Here it is...