Lesson 2: Correlation
January 26, 2024
Review:
- Course introduction and syllabus
- Visualize bivariate relationships with scatterplots
- Characterize correlation with visual scatterplot (data cloud) patterns
- Did you identify a dataset to share?
Presentation:
- Calculate Sum of Squares statistics
- Pearson Correlation Coefficient
- r = SSxy/√(SSxx*SSyy)
- -1 ≤ r ≤ 1
- Demonstrate calculation with “Beer Party” data
- {(20,10), (24,12), (28,20), (32,40)}
- Demonstrate how to do the same calculation in Sheets
- Create Scatterplot
- Demonstrate Trendline and r^2 features
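- Take the square root of r^2 and compare with the Pearson Correlation Coefficient (the square root gives |r|; the sign of r matches the sign of the trendline slope)
- Demonstrate use of the XLMiner Analysis ToolPak add-on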
- Generate a matrix of correlation coefficients for multiple variables
- Use the Pueblo Real Estate Data Sample
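The "Beer Party" calculation above can be sketched in a few lines of Python, as a cross-check on the by-hand and Sheets results. This is a minimal sketch, not part of the original lesson materials; the helper name `sums_of_squares` is an assumption chosen for illustration.

```python
from math import sqrt

# "Beer Party" data from the lesson, as (x, y) pairs.
data = [(20, 10), (24, 12), (28, 20), (32, 40)]

def sums_of_squares(pairs):
    """Return (SSxx, SSyy, SSxy) for a list of (x, y) pairs.

    Hypothetical helper for illustration: each sum of squares is
    computed from deviations about the mean.
    """
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    ss_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return ss_xx, ss_yy, ss_xy

ss_xx, ss_yy, ss_xy = sums_of_squares(data)

# Pearson Correlation Coefficient: r = SSxy / sqrt(SSxx * SSyy)
r = ss_xy / sqrt(ss_xx * ss_yy)

print(f"SSxx={ss_xx}, SSyy={ss_yy}, SSxy={ss_xy}, r={r:.4f}")
```

For this data the means are 26 and 20.5, giving SSxx = 80, SSyy = 563, SSxy = 196, and r ≈ 0.92.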
Activity:
- Calculate Pearson Correlation Coefficient (use the Femur and Humerus data below)
- By hand
- Using Sheets
- Plot data on a scatterplot
- Use trendline feature to check calculation
Femur (cm) | Humerus (cm)
38 | 41
56 | 63
59 | 70
64 | 72
74 | 84
- Repeat the Sheets steps (skip the by-hand calculation) using the US County Demographic data
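The trendline check in the activity can be mirrored in Python. The sketch below assumes the Sheets linear trendline is an ordinary least-squares fit and that its reported r^2 is the usual coefficient of determination (1 - SSE/SSyy); the variable names are illustrative. It uses the Femur and Humerus data from the table above.

```python
from math import copysign, sqrt

# Femur and Humerus lengths (cm) from the activity table.
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

n = len(femur)
mean_x = sum(femur) / n
mean_y = sum(humerus) / n

ss_xx = sum((x - mean_x) ** 2 for x in femur)
ss_yy = sum((y - mean_y) ** 2 for y in humerus)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(femur, humerus))

# Least-squares trendline: y = intercept + slope * x
slope = ss_xy / ss_xx
intercept = mean_y - slope * mean_x

# r^2 as a trendline would report it: 1 - (residual SS / total SS)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(femur, humerus))
r_squared = 1 - sse / ss_yy

# The square root alone gives |r|; the slope supplies the sign.
r_from_trendline = copysign(sqrt(r_squared), slope)

# Direct Pearson calculation for comparison.
r_direct = ss_xy / sqrt(ss_xx * ss_yy)

print(f"slope={slope:.4f}, r^2={r_squared:.4f}")
print(f"r from trendline={r_from_trendline:.4f}, r direct={r_direct:.4f}")
```

The two values of r should agree up to floating-point rounding; for this data both are approximately 0.994.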
Assignment:
- Use your dataset to run a correlation analysis (generate a correlation matrix)
- Which relationships between variables are the most compelling?
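A correlation matrix like the one the ToolPak produces can also be sketched in plain Python, which may help when checking the assignment output. This is an illustrative sketch only: the function names `pearson` and `correlation_matrix` are assumptions, and the third column (humerus-to-femur ratio) is derived here purely to give the matrix more than two variables; it is not part of the lesson data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson r = SSxy / sqrt(SSxx * SSyy) for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xx = sum((x - mean_x) ** 2 for x in xs)
    ss_yy = sum((y - mean_y) ** 2 for y in ys)
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return ss_xy / sqrt(ss_xx * ss_yy)

def correlation_matrix(columns):
    """Return (names, matrix) where matrix[i][j] is r between columns i and j."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

columns = {
    "Femur": femur,
    "Humerus": humerus,
    # Derived column, for illustration only (not in the lesson data).
    "Ratio": [h / f for f, h in zip(femur, humerus)],
}

names, matrix = correlation_matrix(columns)

# Print the matrix with row and column labels.
print(" " * 8 + "".join(f"{name:>9}" for name in names))
for name, row in zip(names, matrix):
    print(f"{name:<8}" + "".join(f"{value:9.3f}" for value in row))
```

The diagonal entries are 1 (each variable correlates perfectly with itself) and the matrix is symmetric, so only the entries below the diagonal need to be read when looking for the most compelling relationships.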