Lesson 14: Tests of Significance

October 23, 2017

Review:

Estimation with Confidence Intervals

Presentation:

Significance Testing
- aka Hypothesis Testing
- Purpose: to assess whether the data provide significant evidence against a stated claim (the null hypothesis)

Step 1. Set up the null hypothesis (Ho) and alternative hypothesis (Ha)
Step 2. Calculate the appropriate test statistic
Step 3. Find the P-value (the probability, assuming Ho is true, of obtaining a result at least as extreme as the one observed)
Step 4. Interpret the results by comparing the P-value to the significance level α = 0.05
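
The four steps can be sketched in Python for the case used in this lesson: a one-sided z-test with a known population standard deviation. This is only an illustrative sketch. The function name `upper_tail_z_test` and its parameter names are assumptions made for this example, and it handles only the alternative Ha: μ > μ0.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-sided z-test of Ho: mu = mu0 against Ha: mu > mu0 (sigma known)."""
    # Step 1 is encoded in the arguments: mu0 is the value claimed by Ho.
    # Step 2: test statistic z = (x-bar - mu0) / (sigma / sqrt(n))
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 3: P-value = P(Z >= z) under the standard normal distribution
    p_value = 1 - NormalDist().cdf(z)
    # Step 4: compare the P-value to the significance level alpha
    decision = "Reject Ho" if p_value < alpha else "Fail to reject Ho"
    return z, p_value, decision
```

For instance, `upper_tail_z_test(509, 505, 62, 1000)` returns a z of about 2.04 and a P-value of about 0.021, so the decision is "Reject Ho".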

Example A: Do Math SAT scores improve significantly with coaching?
- National Math SAT scores are normally distributed with mean score = 505 and std. dev = 62
- Sampled 1,000 students who received coaching
- Sample mean score was 509
- Are these results significantly better than the national average?
- Step 1: Set up the hypothesis test
  - μ = 505, σ = 62
    x̅ = 509, n = 1,000
    Ho: μ = 505
    Ha: μ > 505
- Step 2: Calculate the appropriate test statistic
  - z = (x̅ – μ)/(σ/√n)
  - z = (509 – 505)/(62/√1000) = 4/1.96 = 2.04
- Step 3: Find P-value
  - P = 1 – P(Z<2.04) = 1 – 0.9793 = 0.0207
- Step 4: Interpret results
  - 0.0207 ≈ 2.07% probability of getting a sample mean this high or higher by chance if Ho is true
  - 0.0207 < 0.05
  - Reject Ho
  - Coaching seems to improve scores significantly
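
As a check on the table lookup, the arithmetic in Example A can be reproduced with Python's standard library. This is a minimal sketch with variable names chosen for illustration. Because it does not round z to two decimals before finding the probability, its P-value may differ from the table value in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 505, 62        # national mean and standard deviation
x_bar, n = 509, 1000        # coached sample mean and sample size

standard_error = sigma / sqrt(n)        # about 1.96
z = (x_bar - mu0) / standard_error      # about 2.04
p_value = 1 - NormalDist().cdf(z)       # upper-tail area under the normal curve

print(f"z = {z:.2f}, P = {p_value:.4f}")
print("Reject Ho" if p_value < 0.05 else "Fail to reject Ho")
```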

Example B: Has a student paper been plagiarized?
- Previous student papers contain 7 unique vocabulary words on average with std dev of 2.6
- Submitted paper contains 10 unique words
- Does the submitted paper contain significantly more unique vocabulary words than usual?
- Step 1: Set up the hypothesis test
  - μ = 7, σ = 2.6
    x̅ = 10, n = 1 (a single paper)
    Ho: μ = 7
    Ha: μ > 7
- Step 2: Calculate the appropriate test statistic
  - z = (x̅ – μ)/(σ/√n)
  - z = (10 – 7)/(2.6/√1) = 3/2.6 = 1.15
- Step 3: Find P-value
  - P = 1 – P(Z<1.15) = 1 – 0.8749 = 0.1251
- Step 4: Interpret results
  - 0.1251 ≈ 12.51% probability of getting a count this high or higher by chance if Ho is true
  - 0.1251 > 0.05
  - Fail to Reject Ho
  - Unique vocabulary is within normal range, no evidence of plagiarism
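
Example B can be checked the same way. The sketch below assumes n = 1, since the worked calculation divides by σ alone, and uses illustrative variable names. It also reports, as an extra not in the lesson, the approximate word count at which the test would begin to reject Ho at α = 0.05. The unrounded z gives a P-value near 0.124, slightly below the table-based 0.1251.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 7, 2.6     # typical unique-word count and standard deviation
x_bar, n = 10, 1        # one submitted paper (n = 1 is an assumption)
alpha = 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))   # about 1.15
p_value = 1 - NormalDist().cdf(z)       # upper-tail area, about 0.124

# Count above which a single paper would be significant at alpha:
# mu0 + z_critical * sigma, with z_critical the 95th percentile of Z.
z_critical = NormalDist().inv_cdf(1 - alpha)
threshold = mu0 + z_critical * sigma / sqrt(n)

print(f"z = {z:.2f}, P = {p_value:.3f}")
print("Reject Ho" if p_value < alpha else "Fail to reject Ho")
print(f"Reject Ho only above about {threshold:.1f} unique words")
```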

Significance Testing video

Significance testing is like paternity testing.
- When you check father-child DNA for a match you can prove one person is or is not the father.
- The same test does not prove another person is the father.
- You’re evaluating only one possibility at a time.
- So, when interpreting results, we either “Reject Ho” or “Fail to Reject Ho”; we never “accept Ho”.

Assignment:

Problem 1.1. More than 200,000 people worldwide take the GMAT examination each year as they apply for MBA programs. Their scores vary Normally with mean about μ = 525 and standard deviation about σ = 100. One hundred students, n = 100, go through a rigorous training program designed to raise their GMAT scores. The students who go through the program have an average score of x̅ = 541.4. Is there evidence to suggest the training program significantly improves GMAT scores?
Problem 1.2. A newly installed rooftop solar system has been producing energy for n = 100 days. Average energy production is 41.8 kWh per day with a standard deviation of 13.9 kWh. The solar panel manufacturer claims the panels typically produce 40 kWh per day. Is the newly installed system producing significantly more energy than estimated by the manufacturer?

* Most examples and activity problems presented above are derived from Moore, D.S., McCabe, G.P., and Craig, B.A., 2009. Introduction to the Practice of Statistics, 6th Edition. New York: W.H. Freeman and Company.


Justin
