Lesson 16: The t-distribution for Small Sample Inference
October 30, 2017
Review:
- Hypothesis Testing
- Exam 3 on Wed, Nov 15
Presentation:
- t-distribution
- For larger samples (n ≥ 30), use the Z-table
- For small samples (n < 30), use the t-distribution
- The choice of table only changes how the following are calculated:
- margin of error for confidence intervals
- test statistic and p-value for significance testing
- t distribution critical values
- print the t-table of critical values for the next exam and highlight/use only 2 columns
- use the “0.05” column for 1-tail tests at α = 0.05
- use the “0.025” column for 2-tail tests at α = 0.05
- select the row corresponding to df = n – 1
- df = degrees of freedom
- n = number of records in sample data
- t-statistic formulas
- for confidence intervals
- x-bar ± t*·(s/√n)
- instead of the Z equivalent: x-bar ± z*·(σ/√n)
- for significance testing
- t = (x-bar − µ)/(s/√n)
- instead of the Z equivalent: Z = (x-bar − µ)/(σ/√n)
- Video
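The table lookup and the two t formulas above can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the function names and the tiny T_TABLE dictionary are made up for this sketch, and the table holds just the df = 15 row (critical values 1.753 and 2.131 from a standard t-table).

```python
import math

# Hypothetical mini t-table: only the df = 15 row, with the two columns
# used in this lesson ("0.05" for 1-tail, "0.025" for 2-tail at alpha = 0.05).
T_TABLE = {15: {"0.05": 1.753, "0.025": 2.131}}


def critical_value(n, tails, table=T_TABLE):
    """Look up t* at alpha = 0.05 for a sample of size n (df = n - 1)."""
    df = n - 1
    column = "0.05" if tails == 1 else "0.025"
    return table[df][column]


def margin_of_error(t_star, s, n):
    """Margin of error for a confidence interval: t* times s / sqrt(n)."""
    return t_star * s / math.sqrt(n)


def t_statistic(x_bar, mu0, s, n):
    """Test statistic for significance testing: (x-bar - mu) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / math.sqrt(n))


# Usage with n = 16, so df = 15
print(critical_value(16, tails=1))  # 1-tail column
print(critical_value(16, tails=2))  # 2-tail column
```

A 95% confidence interval uses the same column as a 2-tail test at α = 0.05, since both leave 2.5% in each tail.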
Confidence Interval Example
The weights (in lbs) from a random sample of n=16 four-year-old children who took part in a study on childhood obesity are as follows: {37.1, 26.7, 36.1, 36.2, 40.3, 43.9, 36.2, 40.7, 42.5, 34.8, 37.9, 34.5, 31.1, 36.4, 35.7, 33.4}
From these data, we can compute the sample mean and standard deviation: x-bar = 36.47 lbs and s = 4.23 lbs.
The population standard deviation σ is unknown, so we estimate it with the sample standard deviation s and use the t-distribution instead of Z to calculate a confidence interval for µ, the mean weight of four-year-olds. For our sample of n=16 observations, df = n – 1 = 15. The value of t* for a 95% confidence interval comes from the “0.025” column of the t-table (a 95% interval leaves 2.5% in each tail): t* = 2.131.
We now have everything that we need to calculate a 95% confidence interval for µ, the mean weight of 4-year-olds:
36.47 ± (2.131)*(4.23/√16) = 36.47 ± 2.25 or (34.22, 38.72) lbs
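The hand calculation above can be checked from the raw data. The following is a minimal sketch using only Python's standard library; the variable names are chosen for illustration, and t* = 2.131 is typed in from the t-table rather than computed.

```python
import math
import statistics

# Weights (lbs) of the n = 16 four-year-olds from the example
weights = [37.1, 26.7, 36.1, 36.2, 40.3, 43.9, 36.2, 40.7,
           42.5, 34.8, 37.9, 34.5, 31.1, 36.4, 35.7, 33.4]

n = len(weights)
x_bar = statistics.mean(weights)
s = statistics.stdev(weights)      # sample standard deviation (divides by n - 1)

t_star = 2.131                     # t-table value: df = 15, "0.025" column
margin = t_star * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, df = {n - 1}")
print(f"x-bar = {x_bar:.2f}, s = {s:.2f}")
print(f"margin of error = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Because the script carries unrounded values of x-bar and s through the calculation, its results can differ from a hand calculation in the last decimal place for other data sets; here they agree to two decimals.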
Significance Testing Example
The heights of the same sample of n=16 4-year-olds are given as follows: {39.9, 37.4, 40.3, 39.6, 39.2, 43.2, 40.5, 40.6, 41.8, 39.5, 40.9, 39.8, 40.3, 39.4, 40.7, 39.5}
A height chart lists the average height for 4-year-olds as 39 inches, but we suspect children’s heights have increased since the height chart was created, due in part to better nutrition. To test our supposition, we let µ represent the mean height of 4-year-olds and set up the null and alternative hypotheses as follows:
- Ho: μ = 39
- Ha: μ > 39
The sample mean and standard deviation for the height data are: x-bar = 40.163 inches and s = 1.255 inches. Now we can calculate the t-test statistic, substituting sample values for x-bar and s.
t = (40.163 – 39)/(1.255/√16) = 3.71
For a 1-tail test at α=0.05 and df=15, the critical value is 1.753. Since 3.71 is far greater than 1.753, we reject Ho at α=0.05 and conclude that the mean height of 4-year-olds has increased since the chart was created.
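The same test can be reproduced from the raw heights. This is a sketch using only Python's standard library, with names chosen for illustration. The critical value 1.753 is typed in from the t-table. The p-value step is an addition to the worked example: it integrates the t density numerically with Simpson's rule, which is one assumed way to get a tail area without a statistics package.

```python
import math
import statistics

# Heights (inches) of the n = 16 four-year-olds from the example
heights = [39.9, 37.4, 40.3, 39.6, 39.2, 43.2, 40.5, 40.6,
           41.8, 39.5, 40.9, 39.8, 40.3, 39.4, 40.7, 39.5]

mu0 = 39                           # null hypothesis value, Ho: mu = 39
n = len(heights)
df = n - 1
x_bar = statistics.mean(heights)
s = statistics.stdev(heights)      # sample standard deviation (divides by n - 1)

t = (x_bar - mu0) / (s / math.sqrt(n))
t_crit = 1.753                     # t-table value: df = 15, "0.05" column (1-tail)
reject_h0 = t > t_crit


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t_value, df, steps=2000):
    """P(T > t_value) for t_value >= 0, using Simpson's rule on [0, t_value]."""
    h = t_value / steps            # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3     # half the area lies above 0 by symmetry


p_value = upper_tail(t, df)        # 1-tail p-value for Ha: mu > 39

print(f"t = {t:.2f}, critical value = {t_crit}, reject Ho: {reject_h0}")
print(f"1-tail p-value = {p_value:.4f}")
```

As a sanity check on the integration, upper_tail(1.753, 15) should come out very close to 0.05, matching the column heading of the t-table.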
Assignment:
Problem 1. Given a simple random sample of size n, you want to compute a confidence interval for µ of the form: x-bar ± t*·(s/√n). Find the value of t* for each of the following confidence levels and sample sizes.
- a. 95% confidence interval for a sample size n = 12.
- b. 95% confidence interval for a sample size n = 25.
- c. 95% confidence interval for a sample size n = 40.
Problem 2. To conduct a significance test for a small sample, you need to find the appropriate critical value in the t-distribution table. Find the critical value t* for each of the following test scenarios.
- a. 1-tail test of the mean at α = 0.05 (95% confidence) with n=20
- b. 2-tail test of the mean at α = 0.05 (95% confidence) with n=30
Problem 3. Supermarket rotisserie chickens have become very popular with the American public. A study was conducted to compare the nutrient composition of commercially-prepared rotisserie chicken to that of roasted chicken, which is listed in the USDA National Nutrient Database for Standard Reference (SR).
- a. The SR listed the mean protein content of roasted chicken breast as 31 grams. In the sample of n=9 rotisserie chickens, x-bar = 29.86 grams and s = 1.95 grams. Conduct a t-test to see if the mean protein content in rotisserie chicken breasts differs from the SR. Report the value of the test statistic, the critical value, and your conclusion.
- b. The SR listed the mean cholesterol level in roasted chicken thighs as 95 milligrams. In a sample of n=9 rotisserie chicken thighs, x-bar = 134 milligrams and s = 2.43 milligrams. Conduct a t-test to see if the mean cholesterol level in rotisserie chicken thighs is significantly higher than the SR. Report the value of the test statistic, the critical value, and your conclusion.
Problem 4. After purchasing a new fuel-efficient vehicle, the owner calculates miles per gallon (mpg) the first 12 times he fills the fuel tank. Here are the results of his calculations: {42, 43, 37, 36, 34, 45, 48, 43, 38, 42, 43, 46}
- a. Create a stemplot to check the shape of the distribution. Do these data appear to be normally distributed?
- b. Find the mean and standard deviation for the mpg sample data.
- c. Calculate the margin of error for a 95% confidence interval to estimate fuel efficiency.
- d. The manufacturer claims the vehicle gets 38 mpg. Are these results consistent with the manufacturer’s claim? Conduct a hypothesis test to determine whether these results indicate the vehicle gets significantly higher mpg at α=0.05.
These examples and problems are from the Against All Odds video series and Introduction to the Practice of Statistics (6th Ed.) by Moore et al.