Lesson 8: Categorical Data and Indicator Variables
February 27, 2024
Review:
- Multicollinearity
- Interaction Variables
- Used vehicle multiple regression model
Presentation:
- Pivot tables to aggregate by category
- Indicator Variables (aka “dummy” variables)
- incorporate categorical variables
- binary – value is either 1 or 0
- if the indicator is 0, its coefficient has no effect on y-hat
- if the indicator is 1, its coefficient is added to y-hat (a negative coefficient lowers the estimate)
- Demonstration with Sample Real Estate data
- Use a Pivot table to summarize price by neighborhood
- create a dummy variable for a neighborhood
- create a dummy variable for multiple neighborhoods
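The demonstration steps above can be sketched in Python with pandas instead of a spreadsheet. This is a minimal illustration, not the lesson's actual workbook: the `sales` table, its neighborhood names, and its prices are invented, and the column names (`is_west`, the `nbhd_` prefix) are assumptions.

```python
import pandas as pd

# Hypothetical sample data; neighborhoods and prices are invented for illustration.
sales = pd.DataFrame({
    "neighborhood": ["North", "North", "East", "East", "West", "West"],
    "sqft": [1500, 1800, 1200, 1600, 2000, 2200],
    "price": [300000, 350000, 200000, 260000, 450000, 480000],
})

# Pivot table: average selling price per neighborhood.
summary = sales.pivot_table(index="neighborhood", values="price", aggfunc="mean")
print(summary)

# Dummy variable for a single neighborhood: 1 if the sale is in West, else 0.
sales["is_west"] = (sales["neighborhood"] == "West").astype(int)

# Dummy variables for multiple neighborhoods. drop_first=True omits one
# category so that it serves as the baseline the others are compared against.
dummies = pd.get_dummies(
    sales["neighborhood"], prefix="nbhd", drop_first=True, dtype=int
)

# Combine the numeric predictor, the response, and the dummy columns.
model_data = pd.concat([sales[["sqft", "price"]], dummies], axis=1)
print(model_data)
```

With `drop_first=True`, one neighborhood (here East, the first alphabetically) becomes the baseline and gets no column. Keeping a dummy for every neighborhood alongside an intercept would make the columns perfectly collinear, which ties back to the multicollinearity review.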
Activity:
- Start from your used car model (from last lesson)
- Incorporate an indicator variable into that model
- Can you improve R^2 and/or F with an indicator variable?
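One way to check the question above is to fit the model with and without the indicator and compare the two statistics. The sketch below assumes NumPy and uses invented used-car data; the predictor `age`, the indicator `is_awd`, and the helper `fit_stats` are hypothetical names, not part of the lesson's data set.

```python
import numpy as np

def fit_stats(X, y):
    """Fit least squares with an intercept; return (R^2, F statistic)."""
    n, k = X.shape                              # n observations, k predictors
    A = np.column_stack([np.ones(n), X])        # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)                  # unexplained variation
    sst = float(((y - y.mean()) ** 2).sum())    # total variation
    r2 = 1 - sse / sst
    f = (r2 / k) / ((1 - r2) / (n - k - 1))     # overall F for the regression
    return r2, f

# Hypothetical data: price in thousands, age in years, is_awd as a 0/1 indicator.
age = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
is_awd = np.array([0, 1, 0, 1, 1, 0, 0, 1], dtype=float)
price = np.array([28.5, 28.6, 24.3, 24.8, 23.1, 17.5, 16.4, 16.7])

# Base model: price explained by age only.
r2_base, f_base = fit_stats(age.reshape(-1, 1), price)

# Extended model: age plus the indicator variable.
r2_ind, f_ind = fit_stats(np.column_stack([age, is_awd]), price)

print(f"age only:        R^2 = {r2_base:.3f}, F = {f_base:.1f}")
print(f"age + indicator: R^2 = {r2_ind:.3f}, F = {f_ind:.1f}")
```

R^2 cannot decrease when a variable is added to a least-squares model, so the F statistic is the more informative of the two comparisons.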
Assignment:
- Download recent Singapore real estate transaction data
- Build a multiple regression model to estimate selling price
- Incorporate one or more Indicator variables to include categorical data elements