Linear Probability
A linear probability model is a linear regression in which the dependent variable is an indicator variable. The model is estimated by least squares.
E[yi] = 1 × Pr(yi = 1) + 0 × Pr(yi = 0) = πi

Thus, the mean of a binary random variable can be interpreted as a probability; it is the probability that yi = 1. When the regression E[yi|Xi2, Xi3, …, XiK] is linear, then E[yi] = β1 + β2Xi2 + … + βKXiK and the mean (probability) is modeled linearly.

E[yi|Xi2, Xi3, …, XiK] = πi = β1 + β2Xi2 + … + βKXiK    (7.7)
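The fact that the mean of an indicator variable estimates Pr(y = 1) is easy to verify by simulation. A minimal sketch (the true probability 0.3 and sample size are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(42)
p = 0.3                                # assumed true Pr(y = 1)
y = rng.binomial(1, p, size=100_000)   # draws of a 0/1 indicator variable

# The sample mean of a binary variable is the sample proportion of ones,
# so it estimates the probability that y = 1.
print(y.mean())  # close to 0.3
```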
The variance of a binary random variable is

var[yi] = πi(1 − πi)    (7.8)
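Equation (7.8) can also be checked by simulation; the sample variance of an indicator variable tracks π(1 − π), and is largest when π = 0.5. The probabilities below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
for p in (0.1, 0.5, 0.9):
    y = rng.binomial(1, p, size=200_000)
    # Sample variance of the 0/1 draws should be close to p * (1 - p)
    print(p, round(y.var(), 4), p * (1 - p))
```

Because π differs across individuals once it depends on regressors, so does this variance, which is why the model is inherently heteroskedastic.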
which means that it will be different for each individual. Replacing the unobserved probability, E[yi], with the observed indicator variable requires adding an error term to the model, which we can then estimate by least squares. In the following example we have 1140 observations on individuals who purchased Coke or Pepsi. The dependent variable, coke, takes the value 1 if the person buys Coke and 0 if Pepsi. It depends on the ratio of the prices, pratio, and two indicator variables, disp_coke and disp_pepsi, which indicate whether the store selling the drinks had a promotional display for Coke or Pepsi at the time of purchase.
OLS, using observations 1–1140
Dependent variable: coke
Heteroskedasticity-robust standard errors, variant HC3
              Coefficient   Std. Error   t-ratio    p-value
  const          0.8902       0.0656      13.56     5.88e-039
  pratio        −0.4009       0.0607      −6.60     6.26e-011
  disp_coke      0.0772       0.0340       2.27     0.0235
  disp_pepsi    −0.1657       0.0345      −4.81     1.74e-006