Lesson 19: Pearson Correlation Coefficient and R-Squared
April 8, 2026
Review:
- Linear Regression
Presentation:
- Pearson Correlation Coefficient
- r = SSxy/√(SSxx*SSyy)
- -1 ≤ r ≤ 1
- R-Squared
- Calculate r and then square the result, i.e., =r^2
- 0 ≤ r^2 ≤ 1
- Let’s solve a new problem from beginning to end.
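For reference, the quantities in the formula for r can be written out as follows. This is a sketch that assumes the usual deviation-from-the-mean definitions carried over from the linear regression review; the symbols b_0 and b_1 (intercept and slope) and the bar notation for sample means are notational assumptions, not taken from these notes.

```latex
SS_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
SS_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
SS_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})

\hat{y} = b_0 + b_1 x, \qquad
b_1 = \frac{SS_{xy}}{SS_{xx}}, \qquad
b_0 = \bar{y} - b_1 \bar{x}

r = \frac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}}, \qquad
r^2 = \frac{SS_{xy}^2}{SS_{xx}\, SS_{yy}}
```

Because SSxx and SSyy are never negative, r always takes the sign of SSxy, which is also the sign of the slope.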
Demonstration Problem. A street vendor sells hot chocolates on an outdoor pedestrian mall. Sales tend to be strong on cold days but weak on warmer days. Calculate the sums of squares (SSxx, SSyy, SSxy), the linear regression equation, the Pearson correlation coefficient, and R-squared.
| Day | Temp (x) | Hot Chocolates Sold (y) |
|---|---|---|
| 1 | 30 | 85 |
| 2 | 35 | 78 |
| 3 | 40 | 82 |
| 4 | 45 | 70 |
| 5 | 50 | 74 |
| 6 | 55 | 66 |
| 7 | 60 | 63 |
| 8 | 65 | 68 |
Using the resulting ŷ equation, estimate hot chocolate sales at 32 degrees and at 64 degrees.
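One way to check the hand calculation for the demonstration problem is a short Python script. This is only a sketch: the variable and function names (`temps`, `sold`, `sums_of_squares`, `predict`) are made up for illustration, and it assumes the least-squares formulas slope = SSxy/SSxx and intercept = ȳ − slope·x̄ from the linear regression review.

```python
from math import sqrt

# Data copied from the demonstration table.
temps = [30, 35, 40, 45, 50, 55, 60, 65]  # x: temperature
sold = [85, 78, 82, 70, 74, 66, 63, 68]   # y: hot chocolates sold


def sums_of_squares(x, y):
    """Return (SSxx, SSyy, SSxy) using deviations from the sample means."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xx, ss_yy, ss_xy


ss_xx, ss_yy, ss_xy = sums_of_squares(temps, sold)

# Least-squares line: y-hat = intercept + slope * x
slope = ss_xy / ss_xx
intercept = sum(sold) / len(sold) - slope * sum(temps) / len(temps)

# Pearson correlation coefficient and R-squared
r = ss_xy / sqrt(ss_xx * ss_yy)
r_squared = r ** 2


def predict(temp):
    """Estimate hot chocolates sold at the given temperature."""
    return intercept + slope * temp


print(f"SSxx = {ss_xx:.2f}, SSyy = {ss_yy:.2f}, SSxy = {ss_xy:.2f}")
print(f"y-hat = {intercept:.3f} + ({slope:.4f}) x")
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
print(f"Estimate at 32 degrees: {predict(32):.1f}")
print(f"Estimate at 64 degrees: {predict(64):.1f}")
```

If the arithmetic is right, the script should report SSxx = 1050, SSyy = 433.5, and SSxy = -595, giving r of roughly -0.88 and r^2 of roughly 0.78, with estimates of about 82 hot chocolates at 32 degrees and about 64 at 64 degrees. The negative r matches the story: sales fall as the temperature rises.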
How does this compare to the linear model for estimating Iced Coffee sales? Which would be more accurate for sales forecasting? Why?
Activity:
Now, here’s some real data, gathered from the USDA National Water and Climate Center and the USGS Water Data Center. The snowpack data (snow-water equivalent, in inches) is for the Fremont Pass location. The streamflow data (ft^3/sec) is for the Arkansas River station in Pueblo.
| Water Year | Snowpack (x) | Streamflow (y) |
|---|---|---|
| 2018 | 20.4 | 577.5 |
| 2019 | 28.1 | 1599.3 |
| 2020 | 20.2 | 901.0 |
| 2021 | 15.5 | 659.4 |
| 2022 | 17.6 | 730.4 |
| 2023 | 17.1 | 1016.0 |
| 2024 | 22.2 | 1455.7 |
| 2025 | 18.2 | 817.9 |
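Calculate the sums of squares, the linear regression equation, the Pearson correlation coefficient, and R^2. If it helps, you may divide the streamflow numbers by 100 so the squared values don't get unmanageably large. Rescaling y this way does not change r or R^2, but the slope and intercept will also be divided by 100, so multiply any estimate by 100 to get back to ft^3/sec.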
Using the regression equation, estimate streamflow for water year 2026, assuming snowpack is only 8.6 inches (this year is the worst year for snowpack in Colorado history).
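After working the activity by hand, a script along these lines can be used to check the result and to see what dividing streamflow by 100 does to the answer. It is a sketch rather than a required method: the names `snowpack`, `streamflow`, and `fit` are hypothetical, and it assumes the same least-squares formulas used in the demonstration problem.

```python
from math import sqrt

# Data copied from the activity table.
snowpack = [20.4, 28.1, 20.2, 15.5, 17.6, 17.1, 22.2, 18.2]  # x: inches
streamflow = [577.5, 1599.3, 901.0, 659.4,
              730.4, 1016.0, 1455.7, 817.9]                   # y: ft^3/sec


def fit(x, y):
    """Return (slope, intercept, r) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    r = ss_xy / sqrt(ss_xx * ss_yy)
    return slope, intercept, r


# Fit once on the raw streamflow and once on streamflow divided by 100.
slope, intercept, r = fit(snowpack, streamflow)
scaled = [v / 100 for v in streamflow]
slope_s, intercept_s, r_s = fit(snowpack, scaled)

# Estimate for a snowpack of 8.6 inches, computed both ways.
estimate = intercept + slope * 8.6
estimate_from_scaled = (intercept_s + slope_s * 8.6) * 100  # convert back

print(f"Raw:    y-hat = {intercept:.2f} + {slope:.2f} x, r = {r:.4f}")
print(f"Scaled: y-hat = {intercept_s:.4f} + {slope_s:.4f} x, r = {r_s:.4f}")
print(f"r^2 = {r ** 2:.4f}")
print(f"Estimated streamflow at 8.6 inches: {estimate:.1f} ft^3/sec")
print(f"Same estimate via scaled fit:       {estimate_from_scaled:.1f} ft^3/sec")
```

Both fits should give the same r and the same final estimate once the scaled prediction is multiplied by 100. Note that 8.6 inches lies well below the smallest snowpack in the table (15.5), so the estimate extends the line beyond the observed data and should be read with that in mind.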