As we finish up descriptive statistics (i.e., ways to describe distributions of data), I wanted to provide a conceptual scheme for those of you who want more structure to organize the information.
In my little example of how often people go to the movies, we saw how two distributions could have the same mean, but the distributions still had pretty different shapes (both were symmetrical, but one was more spread out than the other). The means don't tell the distributions apart, so we have to look at a second feature, the standard deviation, to tell them apart.
You can also have two distributions that have the same mean and same standard deviation, yet are still not identical-looking. The following example from the financial world illustrates just such a case:
The blue and green distributions are clearly different. The distributions are not distinguished by their means (they're the same), nor are the distributions distinguished by their standard deviations (they're also the same). We have to look at a third feature of the distributions, their skewness, to tell them apart.
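To make this concrete, here is a small Python sketch. The numbers are made up for illustration (they are not the data behind the financial example), and the `skewness` helper is my own name for one common population-style measure of skewness. The second list is just the mirror image of the first around their shared mean, which keeps the mean and standard deviation identical while flipping the direction of the skew.

```python
import statistics


def skewness(data):
    """Third moment about the mean, divided by the SD cubed (population form)."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    third_moment = sum((x - mean) ** 3 for x in data) / len(data)
    return third_moment / sd ** 3


# Hypothetical yearly returns (in percent); NOT taken from the example above.
right_skewed = [1, 2, 2, 3, 7]
# Reflect every value around the shared mean of 3 to get the mirror image.
left_skewed = [2 * 3 - x for x in right_skewed]  # [5, 4, 4, 3, -1]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "SD:", round(statistics.pstdev(data), 3),
        "skewness:", round(skewness(data), 3),
    )
```

The two printed lines should show the same mean and the same SD, with skewness values that are equal in size but opposite in sign.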
The logic that I hope you've seized upon has to do with the mathematical concept of moments:
The mean is the first moment.
The variance (the standard deviation squared) is the second moment, taken about the mean.
The skewness is based on the third moment about the mean.
The kurtosis is based on the fourth moment about the mean.
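For those who like to see the arithmetic, the list above can be sketched in a few lines of Python. This is only one common convention, assumed here for illustration: moments are taken about the mean, the population (divide-by-n) form is used, skewness and kurtosis are divided by the appropriate power of the SD, and 3 is subtracted from the kurtosis so that a normal distribution would score zero. The function names `central_moment` and `describe` and the data are made up for this sketch.

```python
import math


def central_moment(data, k):
    """Average of (x - mean)**k over the data (population form)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def describe(data):
    """Return the four summary features discussed above."""
    mean = sum(data) / len(data)           # first moment (about zero)
    variance = central_moment(data, 2)     # second moment about the mean
    sd = math.sqrt(variance)
    skewness = central_moment(data, 3) / sd ** 3      # standardized third moment
    kurtosis = central_moment(data, 4) / sd ** 4 - 3  # standardized fourth moment, minus 3
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Made-up scores, chosen so the arithmetic comes out cleanly.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(describe(sample))
# mean 5.0, variance 4.0, sd 2.0, skewness 0.65625, kurtosis -0.21875
```

Note that the mean is computed directly rather than through `central_moment`, because the first moment about the mean is always zero.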
The textbook from when I took first-year graduate statistics in the psychology department at Michigan during the 1984-85 academic year (William L. Hays, Statistics, 3rd ed.) nicely summarizes moments, in my view:
Just as the mean describes the "location" of the distribution on the X axis, and the variance describes its dispersion, so do the higher moments reflect other features of the distribution. For example, the third moment about the mean is used in certain measures of degree of skewness... The fourth moment indicates the degree of "peakedness" or kurtosis of the distribution, and so on. These higher moments have relatively little use in elementary applications of statistics, but they are important for mathematical statisticians... The entire set of moments for a distribution will ordinarily determine the distribution exactly... (p. 167)
UPDATE (2014): Peter Westfall (TTU business professor) and Kevin Henning, in their book Understanding Advanced Statistical Methods, argue that the key feature of kurtosis is not a distribution's peakedness, but rather its proneness to outliers. What matters is whether the kurtosis, measured relative to the normal distribution, is positive or negative. Westfall and Henning note that "if the kurtosis is greater than zero, then the distribution is more outlier-prone than the normal distribution; and if the kurtosis is less than zero, then the distribution is less outlier-prone than the normal distribution" (pp. 250-251).
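For reference, a widely used definition of kurtosis that is consistent with comparing against zero is the "excess" form below. I am assuming this convention for illustration; it is not copied from Westfall and Henning's book. Here μ is the mean and σ is the standard deviation of X. The fourth moment about the mean, divided by σ to the fourth power, equals 3 for a normal distribution, so subtracting 3 makes the normal distribution's kurtosis exactly zero.

```latex
\text{kurtosis} = \frac{E\left[(X - \mu)^4\right]}{\sigma^4} - 3
```

Because deviations from the mean are raised to the fourth power, observations far out in the tails contribute far more to this quantity than observations near the center, which fits the outlier-proneness reading.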