Lesson 3: Measures of Center, Percentiles and Boxplots
August 27, 2024
Review:
- Attendance
- Visualizing the shape of the distribution
- Stemplots
- Histograms
- Against All Odds Video – Unit 3
- Demonstrate spreadsheet histogram construction
- Histogram ≠ Column Chart
- UN health and economy (Sheets)
- Cryptogram with frequency distribution – anyone solve?
Presentation:
- Measures of Center
- Mean
- Same as Average
- Sum of values divided by count, or ∑x/n
- n = total number of observations/measurements/values
- Example
- Median
- Center value within ordered sequence of values
- Same as 50th Percentile
- Position = 0.50*n
- n = total number of observations
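- round the position to the nearest whole number
- alternative for small (n < 30) data sets: Position = (n+1)/2; if this falls halfway between two positions, average the two values on either side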
- Example
- Video: Measures of Center
- Mean
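The mean and median rules above can be sketched in Python. This is a minimal illustration, not part of the lesson materials: the function names and the sample values are made up, and the median uses the (n+1)/2 position rule, averaging the two neighboring values when the position falls halfway between them.

```python
import math


def mean(values):
    """Sum of the values divided by the count (sum of x over n)."""
    return sum(values) / len(values)


def median(values):
    """Center value of the ordered data, using the (n+1)/2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the ordered data
    # For odd n the position is a whole number, so both lookups hit the same value.
    # For even n it ends in .5, so we average the values on either side.
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2


# Hypothetical values, for illustration only.
even_sample = [4, 8, 15, 16, 23, 42]
odd_sample = [3, 1, 4, 1, 5]

print(mean(even_sample), median(even_sample))
print(mean(odd_sample), median(odd_sample))
```

With an even count the median lands between the 3rd and 4th ordered values, which is why the sketch averages them rather than rounding the position.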
- Percentiles and Boxplots
- Percentiles
- To find the xth percentile
- calculate x/100 * n (round to the nearest whole number, or average the two values on either side of that position)
- result is the position of the value in an ordered (smallest to largest) data set
- 25th percentile = 25/100 * n
- 50th percentile = 50/100 * n = Median
- 75th percentile = 75/100 * n
- Example with Female height data
- Boxplots
- Five-number summary
- 25th Percentile = Q1
- 50th Percentile = Q2 = Median
- 75th Percentile = Q3
- Minimum value
- Maximum value
- Range = Max – Min
- Interquartile Range (IQR) = Q3 – Q1
- Example with Male height data
- Video: Boxplots
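The percentile position rule and the five-number summary above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the lesson's official method: the function names and sample values are hypothetical, and it uses the "round to the nearest whole number" variant of the rule (halves rounded up, position kept between 1 and n). Spreadsheet and library functions often interpolate between values instead, so their answers can differ slightly.

```python
import math


def percentile_value(values, x):
    """Value at the xth percentile using position = x/100 * n, rounded."""
    ordered = sorted(values)  # smallest to largest
    n = len(ordered)
    position = x / 100 * n
    rank = math.floor(position + 0.5)  # assumption: round halves up
    rank = min(max(rank, 1), n)        # keep the 1-based rank inside the data
    return ordered[rank - 1]


def five_number_summary(values):
    """Minimum, Q1, median (Q2), Q3 and maximum."""
    return {
        "min": min(values),
        "q1": percentile_value(values, 25),
        "median": percentile_value(values, 50),
        "q3": percentile_value(values, 75),
        "max": max(values),
    }


# Hypothetical values, for illustration only.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15, 18, 20]

summary = five_number_summary(sample)
data_range = summary["max"] - summary["min"]  # Range = Max - Min
iqr = summary["q3"] - summary["q1"]           # IQR = Q3 - Q1

print(summary)
print("Range:", data_range, "IQR:", iqr)
```

The five values in `summary` are exactly what a boxplot draws: the box spans Q1 to Q3 with a line at the median, and the whiskers reach the minimum and maximum.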
Activity:
Problem 1.
Here are the starting salaries, in thousands of dollars, offered to 20 students who earned bachelor’s degrees in computer science in 2011.
63 56 66 77 50 53 78 55 90 65 64 69 59 76 48 54 49 68 51 50
a. Make a stemplot.
b. Find the median, mean and mode.
c. Find the five-number summary.
d. Make a boxplot.
e. Compute the range and interquartile range (IQR).
Problem 2.
A consumer testing lab measured calories per hot dog in 20 brands of beef hot dogs. Here are the results:
186 181 176 149 184 190 158 139 175 148 152 111 141 153 190 157 131 149 135 132
a. Make a stemplot.
b. Find the median, mean and mode.
c. Find the five-number summary.
d. Make a boxplot.
e. Compute the range and interquartile range (IQR).
These problem descriptions (#1 and #2) are from “Against All Odds”, modified slightly and copied here for convenience.
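For checking a hand-drawn stemplot in Problems 1 and 2, a text stemplot can be sketched in Python. This is an illustrative sketch only: the function name `stemplot` and the sample values are hypothetical, and it assumes non-negative whole numbers with the last digit as the leaf and the remaining digits as the stem.

```python
def stemplot(values):
    """Return the lines of a text stemplot (stem = tens and up, leaf = last digit)."""
    stems = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems.setdefault(stem, []).append(leaf)
    lines = []
    # Include stems with no leaves so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Hypothetical values, for illustration only.
sample = [12, 15, 21, 21, 34, 38, 50]

for line in stemplot(sample):
    print(line)
```

To check an answer, replace `sample` with the list of values from the problem.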
Problem 3.
Use the MPG data for top selling midsize cars in the US and European markets from Lesson 1.
a. Find the five-number summaries.
b. Produce 2 boxplots, one for each market, and put them both on the same chart to facilitate comparison.
Assignment:
- For class on Thursday
- Use the UN health and economy data (filter to display North and South America only).
- Use Google Sheets (or Excel) to create a separate histogram for each of Life Expectancy, Infant Mortality and GDP Per Capita (3 histograms in total).
- Compute the 5-number summary for Life Expectancy, Infant Mortality and GDP Per Capita (one 5-number summary per variable).
- Be prepared to share your histograms and 5-number summaries.