STA 506 2.0 Linear Regression Analysis

Lecture 12-ii: Indicator Variables

Dr Thiyanga S. Talagala


Introduction

  • All the independent variables we have considered up to this point have been measured on a continuous scale.

  • Regression analysis can be generalized to incorporate qualitative variables.

  • Examples of qualitative variables:

    • Gender (male, female)
    • Smoking status (smoker, nonsmoker)
    • Employment status (full-time, part-time, unemployed)
    • BMI category (underweight, normal, overweight, obese)
  • Categorical variables can be incorporated into regression through indicator variables.

  • Indicator variables are sometimes called dummy variables.


Creating dummy variables

    IQ  Gender  BMI
1   10  Male    20.2
2   20  Male    20.5
3  100  Male    18.5
4   98  Male    25.0
5  100  Female  24.9
6   11  Female  31.0
7   50  Female  18.5
8   70  Female  20.0

Indicator variable for Gender

$$D_i = \begin{cases} 1 & \text{if male} \\ 0 & \text{if female} \end{cases}$$

The choice of 0 and 1 to identify the levels of a qualitative variable is arbitrary.
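
The coding rule above can be sketched in a few lines of plain Python. This is only an illustration: the list name `gender` and the output name `D` are chosen here for convenience and do not come from the lecture.

```python
# Gender column from the slide's data (rows 1-8).
gender = ["Male", "Male", "Male", "Male",
          "Female", "Female", "Female", "Female"]

# Indicator (dummy) variable: D_i = 1 if male, 0 if female.
D = [1 if g == "Male" else 0 for g in gender]

# Show each gender label next to its indicator value.
for g, d in zip(gender, D):
    print(f"{g:<6} {d}")
```

Swapping the roles of 0 and 1 would give an equally valid indicator; only the sign of its coefficient in the fitted model would change.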


Regression equation

    IQ  Gender  BMI   D
1   10  Male    20.2  1
2   20  Male    20.5  1
3  100  Male    18.5  1
4   98  Male    25.0  1
5  100  Female  24.9  0
6   11  Female  31.0  0
7   50  Female  18.5  0
8   70  Female  20.0  0

Indicator variable for Gender

$$D_i = \begin{cases} 1 & \text{if male} \\ 0 & \text{if female} \end{cases}$$

Regression equation

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 D_i + \epsilon_i,$$

where $y_i$ is the IQ score and $x_i$ is the BMI of the $i$th observation.


Regression equation (cont.)

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 D_i + \epsilon_i,$$

where $x$ represents the variable BMI.

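Regression equation for males, $D_i = 1$

$$y_i = \beta_0 + \beta_1 x_i + \beta_2 + \epsilon_i,$$

$$y_i = (\beta_0 + \beta_2) + \beta_1 x_i + \epsilon_i,$$

  • Thus the relationship between IQ score and BMI for males is a straight line with intercept $\beta_0 + \beta_2$ and slope $\beta_1$.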

Regression equation for females, $D_i = 0$

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i,$$

  • Thus the relationship between IQ score and BMI for females is a straight line with intercept $\beta_0$ and slope $\beta_1$.
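
As a rough numerical illustration of the two equations, the model can be fitted to the eight observations from the slides by ordinary least squares. The sketch below assumes NumPy is available and uses `numpy.linalg.lstsq`; the variable names are illustrative, and the lecture itself does not prescribe this approach.

```python
import numpy as np

# Data from the slides: y = IQ, x = BMI, D = 1 for male and 0 for female.
y = np.array([10, 20, 100, 98, 100, 11, 50, 70], dtype=float)
x = np.array([20.2, 20.5, 18.5, 25.0, 24.9, 31.0, 18.5, 20.0])
D = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)

# Design matrix with columns [1, x, D] for y = b0 + b1*x + b2*D + error.
X = np.column_stack([np.ones_like(x), x, D])

# Ordinary least squares estimates of (b0, b1, b2).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = beta

# Both groups share the slope b1; only the intercepts differ.
female_intercept = b0
male_intercept = b0 + b2

print(f"females: y = {female_intercept:.2f} + {b1:.2f} * BMI")
print(f"males:   y = {male_intercept:.2f} + {b1:.2f} * BMI")
```

With only eight observations the estimates are purely illustrative; the point is that a single fit yields both lines, which share the slope and differ only in the intercept.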

Regression equation (cont.)

Regression equation for males, $D_i = 1$

$$y_i = (\beta_0 + \beta_2) + \beta_1 x_i + \epsilon_i,$$

Regression equation for females, $D_i = 0$

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i,$$



  • Two parallel lines (a common slope and different intercepts).

  • $\beta_2$ is the vertical distance between the two regression lines.

Interpretations

  • $\beta_1$ is the change in the mean response, $\mu_Y$, for each one-unit increase in BMI when the other variables are held constant.

  • $\beta_2$ is a measure of how much higher (or lower) the mean response of the male group is than that of the female group, at the same BMI.

($\beta_2$ is the difference in mean IQ score resulting from changing from the female group to the male group.)
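
The interpretation of $\beta_2$ can be checked by subtracting the two mean responses at the same BMI value $x$. The short derivation below assumes the usual condition $E(\epsilon_i) = 0$, which the slides do not state explicitly.

```latex
\begin{aligned}
E(y \mid x, D = 1) &= (\beta_0 + \beta_2) + \beta_1 x \\
E(y \mid x, D = 0) &= \beta_0 + \beta_1 x \\
E(y \mid x, D = 1) - E(y \mid x, D = 0) &= \beta_2
\end{aligned}
```

The difference does not depend on $x$, which is why the two lines are parallel.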


Qualitative variable with more than 2 levels

In general, a qualitative variable with $k$ levels is represented by $k - 1$ indicator variables, each taking the values 0 and 1.

    IQ  BMI          headcir  D1  D2
1   10  Normal       50.2      0   1
2   20  Normal       50.5      0   1
3  100  Obese        58.5      0   0
4   98  Obese        55.0      0   0
5  100  Underweight  54.9      1   0
6   11  Underweight  40.0      1   0
7   50  Underweight  48.5      1   0
8   70  Underweight  50.0      1   0

D1  D2  Description
 1   0  observation is from the underweight group
 0   1  observation is from the normal group
 0   0  observation is from the obese group

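$$D_{1i} = \begin{cases} 1 & \text{if underweight} \\ 0 & \text{otherwise} \end{cases}$$

$$D_{2i} = \begin{cases} 1 & \text{if normal} \\ 0 & \text{otherwise} \end{cases}$$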
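
The same coding can be sketched in plain Python for the three-level BMI variable. The names `bmi`, `D1`, and `D2` are illustrative; the obese group is the baseline because it is the level for which both indicators are 0 in the table above.

```python
# BMI category column from the slide's data (rows 1-8).
bmi = ["Normal", "Normal", "Obese", "Obese",
       "Underweight", "Underweight", "Underweight", "Underweight"]

# k = 3 levels, so k - 1 = 2 indicators; "Obese" is the baseline (D1 = D2 = 0).
D1 = [1 if b == "Underweight" else 0 for b in bmi]
D2 = [1 if b == "Normal" else 0 for b in bmi]

# Show each category next to its pair of indicator values.
for b, d1, d2 in zip(bmi, D1, D2):
    print(f"{b:<12} {d1} {d2}")
```

A third indicator for the obese group is not needed: it would equal 1 - D1 - D2 for every observation and so would be perfectly collinear with the intercept.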


Your turn

Write the regression equations for the three levels.

    IQ  headcir  D1  D2
1   10  50.2      0   1
2   20  50.5      0   1
3  100  58.5      0   0
4   98  55.0      0   0
5  100  54.9      1   0
6   11  40.0      1   0
7   50  48.5      1   0
8   70  50.0      1   0

Acknowledgement

Introduction to Linear Regression Analysis, Douglas C. Montgomery, Elizabeth A. Peck, G. Geoffrey Vining

All rights reserved by

Dr. Thiyanga S. Talagala

