Lesson 9: Two-Way Tables and Causation
September 27, 2017
- Design of Experiments
- Census and Sampling
- Samples and Surveys
- Oops, I went out of order
- Late submissions for Lesson 1-5 have been graded
- Colorado Honda market size
- Two-Way Tables
- Categorical data/variables
- Nominal (e.g., gender, eye color, race)
- Ordinal (i.e., ranking or sequence; e.g., 1st, 2nd, 3rd or freshman, sophomore, junior, senior)
- Marginal distribution = (row or column total)/(grand total)
- Joint distribution = (cell entry)/(grand total)
- Conditional distribution = (cell entry)/(row or column total)
- Example Two-Way Tables (student paid/unpaid work hours per week):
- Marginal distribution example: % of all students who work more than 30 hours = 6/50 = 12.0%
- Joint distribution example: % of all students who are female and don’t work = 10/50 = 20.0%
- Conditional distribution example: % of male students who work 11-20 hours = 7/21 = 33.3%
- The Question of Causation
- Correlation does not imply causation
- Spurious Correlations
- Lurking variables = explanatory variables not (yet) included in the analysis
- Retrospective study = looking back to find possible causes for an established outcome among a sample population
- Prospective study = following a sample population over time and studying behaviors possibly linked to the likelihood of an outcome
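The marginal, joint, and conditional definitions from the Two-Way Tables notes can be checked with a short Python sketch. The table in the code is hypothetical: the hour categories and individual cell counts are invented for illustration and chosen only so that the three worked figures in the notes (6/50, 10/50, 7/21) come out the same. The names `counts`, `hours`, and `pct` are likewise assumptions made for this example.

```python
# Hypothetical two-way table: rows = sex, columns = weekly work hours.
# Cell counts and category labels are invented for illustration; they are
# chosen only to reproduce the three example figures in the notes.
hours = ["0", "1-10", "11-20", "21-30", "30+"]
counts = {
    "male":   [5, 4, 7, 3, 2],
    "female": [10, 6, 6, 3, 4],
}

# Marginal totals: one per row, one per column, plus the grand total.
row_totals = {sex: sum(cells) for sex, cells in counts.items()}
col_totals = [sum(cells[j] for cells in counts.values()) for j in range(len(hours))]
grand_total = sum(row_totals.values())


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Marginal distribution = (row or column total) / (grand total)
marginal_30_plus = pct(col_totals[hours.index("30+")], grand_total)

# Joint distribution = (cell entry) / (grand total)
joint_female_no_work = pct(counts["female"][hours.index("0")], grand_total)

# Conditional distribution = (cell entry) / (row or column total)
cond_male_11_20 = pct(counts["male"][hours.index("11-20")], row_totals["male"])

print(marginal_30_plus, joint_female_no_work, cond_male_11_20)
```

The three definitions differ only in the denominator. Marginal and joint percentages both divide by the grand total, while a conditional percentage divides by the total of the row (or column) being conditioned on.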
Refer to the table below for questions 1-3.
1. Add a row to the bottom and a column to the right end of your table. Compute the marginal totals and enter them into your table.
2. What percentage of the students who answered both questions were male? Female? Show your calculations.
3. What percentage of the students rated their intelligence as Above Average?
4. Members of a high school language club believe that study of a foreign language improves a student’s command of English. From school records, they obtain the scores on an English achievement test given to all seniors. The average score of seniors who had studied a foreign language for at least two years is much higher than the average score of seniors who studied no foreign language. The club’s advisor says that these data are not good evidence that language study strengthens English skills. Explain what lurking variables prevent the conclusion that language study improves students’ English scores.
5. Recent studies have shown that earlier reports seriously underestimated the health risks (such as heart disease) associated with being overweight. The error was caused by overlooking important lurking variables. In particular, smoking tends both to reduce weight and to lead to health problems such as heart disease. (a) Describe how you would do a retrospective study of the link between being overweight and having heart problems. (b) Describe how you would do a prospective study of this same link.
All questions copied for convenience from the Against All Odds Unit 13 & 14 student guides.