## Lesson 8: Design of Experiments, Sampling and Surveys

September 25, 2017

Review:

- Linear Regression
- Correlation

Presentation:

**Design of Experiments**

- *Explanatory* variables
- *Response* variable
- *Subjects*: objects or participants in the experiment
- *Treatments*: experimental conditions
- Example
  - Explanatory variables
    - Exam 1 Part 1 begin day/time
    - Hours of sleep night before exam
    - Study hours
    - Homework completion
  - Response variable
    - Exam 1 Score
  - Subjects
    - Students
  - Treatments
    - Open book vs Closed book
    - Written vs Multiple Choice
    - Morning vs Afternoon
- Experimental Design Principles
  - Comparison of Treatments
    - Placebo effect
    - Control group
  - Randomize assignments
    - Bias
    - Double-blind
  - Repetition
    - Limit impact of individual observations
    - Confirm hypotheses
- Design of Experiments video
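
The "Randomize assignments" principle can be sketched in a few lines of Python. This is only an illustration: the roster of 20 students, the helper name `assign_treatments`, and the seed are all hypothetical, and the two treatments are borrowed from the exam example above.

```python
import random


def assign_treatments(subjects, treatments, seed=None):
    """Randomly assign each subject to one treatment, keeping group sizes as even as possible."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # Deal the shuffled subjects out to the treatments in turn
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


# Hypothetical roster of 20 students
students = [f"student_{i:02d}" for i in range(1, 21)]
groups = assign_treatments(students, ["open book", "closed book"], seed=8)

for treatment, members in groups.items():
    print(treatment, len(members), members[:3])
```

Shuffling before dealing means no student characteristic (study hours, sleep, and so on) can systematically line up with one treatment, which is the source of bias that randomization is meant to remove.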

**Census and Sampling**

- Census = attempt to count entire population
  - US Census every 10 years
  - Most costly non-military federal government operation (except maybe bailing out banks)
- Sample = gather info from a portion of the population
  - Sample Types
    - Voluntary response sample
      - Complete an optional survey
      - Inherent bias
      - Example: Google Survey
    - Simple random sample
      - select from a population
      - each individual has equal chance of being selected
      - Example: random selection of subset of enrolled students
    - Stratified random sample
      - divide population into groups or *strata*
      - random sampling, equal chance of selection within each group
      - Example: random selection within major (30% CIS, 35% Mgmt, 35% Econ)
- Census and Sampling Video
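
A minimal sketch of the difference between a simple random sample and a stratified random sample. The roster of 200 students is made up, and the stratified version assumes each major is sampled in proportion to its share of enrollment (30% CIS, 35% Mgmt, 35% Econ); the function names are illustrative only.

```python
import random
from collections import Counter

rng = random.Random(2017)

# Hypothetical roster: 200 students with the major mix from the example
majors = ["CIS"] * 60 + ["Mgmt"] * 70 + ["Econ"] * 70
population = [(f"student_{i:03d}", major) for i, major in enumerate(majors)]


def simple_random_sample(pop, n):
    """Every individual has the same chance of being selected."""
    return rng.sample(pop, n)


def stratified_sample(pop, n):
    """Sample within each major (stratum), in proportion to the stratum's size."""
    strata = {}
    for person in pop:
        strata.setdefault(person[1], []).append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))  # this stratum's share of the sample
        sample.extend(rng.sample(members, k))
    return sample


srs = simple_random_sample(population, 20)
strat = stratified_sample(population, 20)

print("simple random:", Counter(major for _, major in srs))
print("stratified:   ", Counter(major for _, major in strat))
```

The simple random sample's major mix varies from draw to draw, while the stratified sample always contains 6 CIS, 7 Mgmt, and 7 Econ students under these assumptions.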

**Samples and Surveys**

- Toward Statistical Inference
- Population parameters
- Sample statistics
- Sampling distribution
- Bias and Variability
- to reduce bias, use random sampling
- to reduce variability, use larger sample sizes (p. 215)
- variability determines margin of error
- see illustration on p. 218

- video
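
The link between sample size and variability can be seen in a small simulation. This is a sketch under stated assumptions: the "true" population proportion of 10% is invented, and each survey is modeled as independent random draws, which is what random sampling aims to approximate.

```python
import random
import statistics


def sample_proportions(true_p, n, trials, rng):
    """Simulate `trials` surveys of size n and return each survey's sample proportion."""
    return [
        sum(rng.random() < true_p for _ in range(n)) / n
        for _ in range(trials)
    ]


rng = random.Random(25)
true_p = 0.10  # hypothetical population parameter

spread = {}
for n in (50, 500, 5000):
    props = sample_proportions(true_p, n, trials=300, rng=rng)
    spread[n] = statistics.stdev(props)  # variability of the sample statistic
    print(f"n={n:5d}  mean={statistics.mean(props):.3f}  std dev={spread[n]:.4f}")
```

The sample proportions center on the population parameter at every sample size (random sampling keeps bias low), but their spread, and with it the margin of error, shrinks as n grows.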

Assignment:

An automotive parts manufacturer is planning to launch a new product but it only works on Honda vehicles. The company has hired you to estimate the size of the market for their new product in Colorado. You are provided the following data, including the population (in thousands) and the total number of registered vehicles (in thousands) for the 10 largest counties in the State.

| County | Population (thousands) | Vehicles (thousands) |
| --- | --- | --- |
| Denver | 693 | 579 |
| El Paso | 688 | 599 |
| Arapahoe | 637 | 561 |
| Jefferson | 572 | 529 |
| Adams | 498 | 444 |
| Larimer | 340 | 303 |
| Douglas | 329 | 288 |
| Boulder | 322 | 261 |
| Weld | 295 | 279 |
| Pueblo | 165 | 148 |

**Part A.** Use linear regression to estimate the total vehicle population in Colorado by generating a regression equation to estimate Vehicles (y) using Population (x). What additional information will you need to estimate the vehicle population for all of Colorado?

**Part B.** Design a survey to estimate the percentage of Honda vehicles in Colorado. What are some possible sources of bias and variability in estimating % Honda ownership?

**Part C.** Use your results in Part A and Part B to generate a numerical estimate of the size of the Colorado market. Explain how you calculated your estimate and what, if any, assumptions were made.
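
As a starting point for Part A, the least-squares line can be computed directly from the county table. This is one possible sketch, not a required method: it uses the textbook formulas slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x), and the helper `predict_vehicles` is a hypothetical name introduced here.

```python
# County data from the assignment table: (population, vehicles), both in thousands
counties = {
    "Denver": (693, 579),
    "El Paso": (688, 599),
    "Arapahoe": (637, 561),
    "Jefferson": (572, 529),
    "Adams": (498, 444),
    "Larimer": (340, 303),
    "Douglas": (329, 288),
    "Boulder": (322, 261),
    "Weld": (295, 279),
    "Pueblo": (165, 148),
}

xs = [pop for pop, _ in counties.values()]
ys = [veh for _, veh in counties.values()]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Least-squares estimates for Vehicles = intercept + slope * Population
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict_vehicles(population_thousands):
    """Estimated registered vehicles (thousands) for a given population (thousands)."""
    return intercept + slope * population_thousands


print(f"Vehicles = {intercept:.2f} + {slope:.4f} * Population")
```

`predict_vehicles` accepts any population figure in thousands. The fit is based only on the 10 largest counties, so applying it to a much larger or much smaller population is an extrapolation and should be stated as an assumption.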

Study: