Sunday, August 18, 2024

Fall 2024 HDFS 5349 Quantitative Methods I

Welcome to QM I, the department's introductory statistics class. I'm a bit unusual in using a blog to organize class materials, but I think it's worked well over the years. Also, for those of you who haven't been in school here, welcome to Texas Tech! You'll be visiting this welcoming page a lot, as it contains the links for our lecture notes.

I'll do my best to provide a lot of practical, real-world exercises in analyzing data, and I'll try to keep things fun. This passage from a book I read several years ago, Coincidences, Chaos, and All That Math Jazz, by Edward B. Burger and Michael Starbird, provides a concise overview of what statistics can offer:

Statistics can help us understand the world. It is a powerful and effective tool for placing economic, social welfare, sports, and health issues into perspective. It molds data into digestible morsels and shows us a measured way to look at situations that have either random or unknown features. But we must use common sense when applying statistics or other tools that draw on our experience of the world to shape data into meaningful conclusions (p. 60).

In addition, the following article sets forth some goals for what you should learn in this class (and other classes). You can access this article via the Texas Tech Library website or Google Scholar.

Utts, J. (2003). What educated citizens should know about statistics and probability. American Statistician, 57(2), 74-79.

LECTURE NOTES (asterisked [*] pages are from my undergraduate research-methods class).

Units of analysis*

Sampling*

Types of Measures*

Visual depictions of a data distribution:

  • Histograms (overview). Your textbook authors King and colleagues (2018) offer some advice on interval widths and the appearance of histograms, noting that "it is customary to make the height of the distribution about three-quarters of the width" (pp. 32-33). To adjust the width of the columns, click on the histogram in your output, then go to "Options," "Un-bin Element," and "Bin Element." Then, in the small square in the upper-right of the screen, under "X Axis," select "Custom" and enter the desired interval width.
  • Frequency tables contain similar information to histograms. The cumulative percentages also are roughly similar to percentiles (for a given score, you can see what percent of the sample falls below it).
  • Shapes of distributions
  • As a class exercise, we will attempt to reproduce via SPSS this histogram of U.S. Presidents' ages upon assuming office (note that Grover Cleveland, who served two non-consecutive terms, is counted as being "two presidents," the 22nd and 24th).
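
For anyone who wants to see how the cumulative-percentage column of a frequency table is built, here is a minimal Python sketch (we use SPSS in class, so this is only an illustration). The scores are made up, and the function name frequency_table is my own, not something from SPSS. Note that the cumulative percent here counts cases at or below each score.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (value, count, percent, cumulative percent), sorted by value."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    running = 0  # running count of cases at or below the current value
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], 100 * counts[value] / n, 100 * running / n))
    return rows


# Hypothetical quiz scores, invented purely for illustration
scores = [3, 5, 5, 6, 6, 6, 7, 7, 8, 10]

print(f"{'Score':>5} {'Freq':>5} {'Pct':>7} {'CumPct':>7}")
for value, count, pct, cum_pct in frequency_table(scores):
    print(f"{value:>5} {count:>5} {pct:>7.1f} {cum_pct:>7.1f}")
```

Reading across a row, the last column tells you roughly where that score sits as a percentile in the sample.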

Descriptive statistics:* Central tendency (mean, median, and mode) and spread (standard deviation); moments of a distribution; and z-scores (here, here, here, and here)
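
As a quick illustration of these descriptive statistics, the following Python sketch computes them for a small, made-up set of scores using only the standard library. It assumes the sample standard deviation (dividing by n - 1), and each z-score is simply a score's distance from the mean in standard-deviation units.

```python
import statistics

# Hypothetical scores, invented purely for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)

# z-score: how many standard deviations each score lies above or below the mean
z_scores = [(x - mean) / sd for x in scores]

print("mean:", mean, "median:", median, "mode:", mode, "SD:", round(sd, 3))
for x, z in zip(scores, z_scores):
    print(f"score {x} -> z = {z:+.2f}")
```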

Probability (here and here)

Correlation and significance-testing

t-tests

Chi-square

Non-parametric statistics

Statistical power

Confidence intervals

Writing Up Statistical Results in APA Style

Thursday, March 14, 2024

2024 Summer Stats & Methods Courses

My annual list of courses appears below. The list is a work in progress -- I will add programs as I learn of them. For each program, I list the organization (with web link), dates, topic(s), and modality (e.g., online, in person). Programs are listed roughly in the chronological order in which they will take place.

Applied Analyses (Andy Supple), May 21-23, latent class/profile analysis, live online (i.e., synchronous)

CenterStat (Curran-Bauer), 2- to 5-day workshops in May and June, broad offerings, online

University of Michigan (ICPSR), short workshops and four-week courses from May through August, broad offerings, modalities vary

Smart Workshops (formerly Pittsburgh Summer Methodology Series), mostly 2- to 4-day workshops from May to August, broad offerings, online

Global School in Empirical Research Methods (GSERM; University of St. Gallen, Switzerland), May, June, and September, broad offerings, modalities vary

University of Michigan (Survey Research), workshops of varying lengths in June and July, broad offerings, remote instruction

Stats Camp, June 3-7 and 10-14, broad offerings, in Albuquerque, New Mexico

Summer Institute in Computational Social Science -- Usually around two weeks, offered in many locations around the world

Modern Modeling Methods conference (UConn) -- Preconference day, June 24, in person

RECSM (Research Expertise Center for Survey Methodology/Universitat Pompeu Fabra, Barcelona), 4- and 5-day courses in June and July, broad offerings, in person and online

University of Utrecht (Netherlands), numerous statistical, methodological, and substantive courses throughout the social sciences, modalities vary

University of Ljubljana (Slovenia), 1-week courses between July 8-19, broad offerings, online

University of Michigan (Summer Session in Epidemiology), 1- and 3-week workshops from July 8-26, broad offerings, online

Canadian Centre for Research Analysis and Methods (Calgary), 2-day workshops in July, broad offerings, in person (with self-paced online option available)

YEAR-ROUND OFFERINGS

Artisan Analytics

CARMA (Texas Tech)

CILVR (University of Maryland)

Data Orbit

Figure It Out (UK)

InStats

Mplus

QuantFish

Statistical Horizons

Wednesday, December 20, 2023

Formula for a Paired t-Test

A paired t-test incorporates the possibility that the two variables whose means are being compared may also be correlated. In the following example, participating back-pain patients receive actual medication for one stretch of, say, six weeks and placebo (sugar pills) for a different stretch of six weeks. Ideally, the patients and medical staff who directly provide the pills to the patients would not know what kind of pill was being provided (i.e., a double-blind design) and the order of delivery -- medication then placebo or placebo then medication -- would be varied at random.

The focus of the paired t-test is, of course, whether participants' average reported pain while taking actual medication differs from their average reported pain under placebo. However, because patients with the most severe initial pain might report relatively high pain under medication and even higher pain under placebo, whereas patients with the mildest initial pain might report relatively low pain under placebo and even lower pain while receiving medication, patients' pain reports under medication and placebo could be positively correlated (see following graphic).

As shown in the following screenshot from this University of Georgia webpage, the correlation r between the two variables X and Y (highlighted) enters the paired t-test formula for comparing means.


Another document goes into additional depth regarding the paired/correlated t-test, including implications of the correlation "r" being included in the formulation. As it notes, a larger correlation between the two variables will increase the size of the (absolute) t.
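
To make the role of r concrete, here is a minimal Python sketch using one common textbook form of the paired t, in which the standard error is the square root of (s_X^2 + s_Y^2 - 2 r s_X s_Y) / n. The notation in the linked documents may differ. The pain ratings are invented for illustration, and the function names are my own. The sketch also computes t the more familiar way, from the difference scores, to show that the two routes agree.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t_from_r(x, y):
    """Paired t using the two SDs and the correlation r between x and y."""
    n = len(x)
    r = pearson_r(x, y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    # A larger positive r shrinks this standard error, which enlarges |t|
    se = math.sqrt((sx ** 2 + sy ** 2 - 2 * r * sx * sy) / n)
    return (statistics.mean(x) - statistics.mean(y)) / se


def paired_t_from_differences(x, y):
    """Paired t computed directly from the difference scores."""
    d = [a - b for a, b in zip(x, y)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


# Hypothetical 0-10 pain ratings for eight patients (not real data)
placebo = [7, 5, 8, 4, 6, 9, 3, 6]
medication = [6, 4, 6, 4, 5, 7, 2, 5]

print("r =", round(pearson_r(placebo, medication), 3))
print("t (via r) =", round(paired_t_from_r(placebo, medication), 3))
print("t (via differences) =", round(paired_t_from_differences(placebo, medication), 3))
print("df =", len(placebo) - 1)
```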

I've also created a graphic to interpret the SPSS output of a paired t-test, emphasizing the t-test comparison of means but also showing where the correlation between the two variables appears.

Friday, March 03, 2023

2023 Summer Stats & Methods Courses

My annual list of courses appears below. The list is a work in progress -- I will add programs as I learn of them. For each program, I list the organization (with web link), dates, topic(s), and modality (e.g., online, in person). Programs are listed roughly in the chronological order in which they will take place.

Oklahoma State University, May 15-16, dyadic analysis, in person

CenterStat (Curran-Bauer), three- and five-day workshops in May and June, broad offerings, online

University of Michigan (ICPSR), short workshops and three-week courses from May through August, broad offerings, modalities vary

Data Orbit, Marketing analytics, data handling and visualization in R, May 25-27; Machine learning for prediction and causal inference in R, June 1-3; both online

Pittsburgh Summer Methodology Series, 2- to 4-day workshops between June 1-August 9, broad offerings, online

Global School in Empirical Research Methods (GSERM; University of St. Gallen, Switzerland), June 5-23, broad offerings, modalities vary

University of Michigan (Survey Research), 5-day (or longer) workshops between June 5-July 28, broad offerings, modalities vary

Stats Camp, June 5-9 and 12-16, broad offerings, in Albuquerque, New Mexico

Summer Institute in Computational Social Science -- South Florida, June 19-30, in person

Modern Modeling Methods conference (UConn) -- Preconference day, June 26, Mplus and applications, in person

RECSM (Research Expertise Center for Survey Methodology/Universitat Pompeu Fabra, Barcelona), 5-day courses from June 26-July 14, courses to learn statistical packages, analytic techniques, and methodologies; in person and online

University of Utrecht (Netherlands), July 3-14, SEM-related, modalities vary (separate links for each course: here, here, and here)

University of Ljubljana (Slovenia), July 10-21, broad offerings, online

University of Michigan (Summer Session in Epidemiology), 1- and 3-week workshops from July 10-28, broad offerings, modalities vary

Canadian Centre for Research Analysis and Methods, 2-day workshops during latter half of July, broad offerings, in person (with self-paced online option available)

Charité University of Medicine Berlin/Gender in Medicine, 3-, 5-, and 8-day workshops from July 31-August 9, intensive longitudinal methods and dyadic analysis, hybrid

YEAR-ROUND OFFERINGS

Artisan Analytics

CARMA (Texas Tech)

CILVR (University of Maryland)

Figure It Out (UK)

InStats

Mplus

QuantFish

Statistical Horizons

Friday, October 10, 2014

Writing Up Statistical Results in APA Style

Our colleague Dr. Shera Jackson has compiled the following list of web resources for how to write up statistical results in APA style:

Reporting Statistics in APA Style (Matthew Hesson-McInnis, Illinois State University)

Reporting Results of Common Statistical Tests in APA Format (Psychology Writing Center, University of Washington)

Statistics in APA Style (Craig Wendorf, University of Wisconsin-Stevens Point)

Tuesday, November 16, 2010

Statistical Power (Overview)

This week we'll be covering statistical power (also known as power analysis). Power is not a statistical technique like correlation, t-test, and chi-square. Rather, power involves designing your study (particularly getting a large enough sample size) so that you can use correlations, t-tests, etc., more effectively. The core concept of power, like so much else, goes back to the distinction between the population and a sample. When there truly is a basis in the population for rejecting the null hypothesis (e.g., a non-zero correlation, a non-zero difference between means), we want to increase the likelihood that we reject the null from the analysis of our sample. In other words, we want to be able to pronounce a result significant, when warranted. Here are links to my previous entries on statistical power.

Introductory lecture

Why a powerful design is needed: The population may truly have a non-zero correlation, for example, but due to random sampling error, your sample may not; plus, some songs on statistical power!

Remember that there's also the opposite kind of error: The population truly has absolutely no correlation, but again due to random sampling error, you draw a sample that gives the impression of a non-zero correlation.

How to plan a study using power considerations
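
To give a feel for how sample size drives power, here is a rough Python sketch for the case of a correlation. It uses the Fisher z approximation (a standard shortcut, not necessarily the method used in the linked entries or in dedicated power software), and the function names and the example population correlation of .30 are my own assumptions for illustration.

```python
import math
from statistics import NormalDist


def correlation_power(rho, n, alpha=0.05):
    """Approximate two-tailed power to detect a population correlation rho with n cases."""
    z = NormalDist()  # standard normal distribution
    crit = z.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = .05
    shift = math.atanh(rho) * math.sqrt(n - 3)  # Fisher z of rho over its standard error
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def sample_size_for_power(rho, target=0.80, alpha=0.05):
    """Smallest n whose approximate power reaches the target (rho must be non-zero)."""
    if rho == 0:
        raise ValueError("With rho = 0 there is no true effect to detect.")
    n = 4
    while correlation_power(rho, n, alpha) < target:
        n += 1
    return n


# Assumed population correlation of .30, chosen only as an example
for n in (20, 50, 100, 200):
    print(f"n = {n:>3}: approximate power = {correlation_power(0.30, n):.2f}")
print("n needed for 80% power:", sample_size_for_power(0.30))
```

Notice that when the population correlation is truly zero, the same function returns alpha: that is the opposite kind of error, a significant result with no real relationship behind it.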

Wednesday, November 03, 2010

Chi-Square

My introductory stat notes for methods class include some basic information on chi-square.

Here are direct links to some old chi-square blog postings. This one discusses the reversibility error and how properly to read an SPSS printout of a chi-square analysis. The other one illustrates the null hypothesis for chi-square analyses in terms of equal pie-charts.
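
As a small illustration of the null hypothesis, the following Python sketch takes a made-up 2-by-2 table, works out the counts we would expect if the groups had identical proportions (the "equal pie-charts" idea), and then sums (observed - expected)^2 / expected across the cells. The function name and the counts are hypothetical, and no continuity correction is applied.

```python
def chi_square(observed):
    """Return (chi-square statistic, expected counts) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected count under the null: (row total x column total) / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return stat, expected


# Hypothetical counts: rows are two groups, columns are yes/no answers
observed = [[30, 20],
            [20, 30]]

stat, expected = chi_square(observed)
print("expected counts:", expected)
print("chi-square:", stat)
```

The further the observed counts stray from the expected ones, the larger the chi-square value becomes.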

The following photo of the board, containing chi-square tips, was added on November 15, 2011 (thanks to Selen).


Plus a song (added November 1, 2011):

One Degree is Free
Lyrics by Alan Reifman
(May be sung to the tune of “Rock and Roll is Free,” Ben Harper)

Look at your, chi-square table,
If it is, 2-by-2,
One cell can be filled freely,
While the others take their cue,

The formula that you can use,
Come on, from the columns, lose one,
And one, as well, from the rows,
Multiply the two, isn’t this fun?

One degree is free, in your table,
With con-tin-gen-cy, in your table,
One degree is free, in your table,
…free in your table,
…free in your table,

Say, your table is larger,
Maybe it’s 2-by-4,
Multiply one by three,
3 df are in store,

The df’s are essential,
To check significance,
Go to your chi-square table,
And find the right instance,

Three degrees are free, in your table,
With con-tin-gen-cy, in your table,
Three degrees are free, in your table,
…free in your table,
…free in your table,

(Guitar Solo)
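
For those who prefer code to lyrics, the degrees-of-freedom rule in the song (lose one from the rows, lose one from the columns, and multiply) can be sketched in a couple of lines of Python. The function name is just my own label.

```python
def chi_square_df(n_rows, n_cols):
    """Degrees of freedom for a contingency table: (rows - 1) x (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


print(chi_square_df(2, 2))  # the 2-by-2 table from the first verse: 1
print(chi_square_df(2, 4))  # the 2-by-4 table from the later verse: 3
```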