Qualitative and Limited Dependent Variable Models
There are many things in economics that cannot be meaningfully quantified. How you vote in an election, whether you go to graduate school, whether you work for pay, or what college major you choose has no natural way of being quantified. Each of these expresses a quality or condition that you possess. Models of how these decisions are determined by other variables are called qualitative choice or qualitative variable models.
In a binary choice model, the decision you wish to model has only two possible outcomes. You assign artificial numbers to each outcome so that you can do further analysis. In a binary choice model it is conventional to assign ‘1’ to the variable if it possesses a particular quality or if a condition exists and ‘0’ otherwise. Thus, your dependent variable is
yi = 1 if individual i has the quality, and 0 if not.
The probit statistical model expresses the probability p that yi = 1 as a function of your independent variables.
P[yi = 1 | xi2, xi3] = Φ(β1 + β2xi2 + β3xi3)   (16.1)
where Φ is the cumulative normal probability distribution (cdf). The argument inside Φ is linear in the parameters and is called the index function. Φ maps values of the index function into the (0, 1) interval. Estimating this model using maximum likelihood is very simple since the MLE of the probit model is already programmed into gretl.
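To see concretely how Φ maps the index into a probability, here is a short Python sketch (Python rather than gretl, purely for illustration; the coefficients and regressor values below are made up, not estimates):

```python
from math import erf, sqrt

def Phi(z):
    # standard normal cdf, computed from the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# illustrative (made-up) coefficients and regressor values
b1, b2, b3 = -0.5, 0.8, 0.3
x2, x3 = 1.0, 2.0

index = b1 + b2 * x2 + b3 * x3  # the linear index function
p = Phi(index)                  # P(y = 1), always between 0 and 1
print(p)
```

However large or small the index becomes, Φ returns a value strictly between 0 and 1, which is what makes it suitable for modeling a probability.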
The syntax for a script is the same as for linear regression, except that you use the probit command in place of ols. The following script estimates how the difference in travel time between bus and automobile affects the probability of driving (auto = 1):
1 open "@gretldir\data\poe\transport.gdt"
2 probit auto const dtime
3 scalar i_20 = $coeff(const)+$coeff(dtime)*20
4 scalar p_30 = cnorm($coeff(const)+$coeff(dtime)*30)
The third line computes the predicted value of the index (β1 + β2·dtime) when dtime = 20, using the estimates from the probit MLE. The last line computes the estimated probability of driving, given that it takes 30 minutes longer to ride the bus. This computation requires cnorm, which computes the standard normal cdf, Φ(·).
The results are:
Probit estimates using the 21 observations 1-21
Dependent variable: auto
Coefficient Std. Error z-stat Slope*
const -0.0644342 0.399244 -0.1614 .
dtime 0.0299989 0.0102867 2.9163 0.0119068
Mean dependent var 0.476190 S. D. dependent var 0.396907
McFadden R2 0.575761 Adjusted R2 0.438136
Log-likelihood -6.165158 Akaike criterion 16.33032
Schwarz criterion 18.41936 Hannan-Quinn 16.78369
* Evaluated at the mean
Number of cases ‘correctly predicted’ = 19 (90.5 percent)
Likelihood ratio test: χ²(1) = 16.734 [0.0000]
i_20 = 0.535545
p_30 = 0.798292
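As a check, the two scalars can be reproduced outside gretl. The following Python sketch uses the coefficient estimates reported in the output above and a standard normal cdf built from math.erf (the analogue of gretl's cnorm):

```python
from math import erf, sqrt

def cnorm(z):
    # standard normal cdf, mimicking gretl's cnorm()
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# probit estimates reported in the output above
b_const, b_dtime = -0.0644342, 0.0299989

i_20 = b_const + b_dtime * 20         # index at dtime = 20
p_30 = cnorm(b_const + b_dtime * 30)  # P(auto = 1) at dtime = 30
print(i_20, p_30)
```

Up to rounding of the reported coefficients, this reproduces i_20 = 0.535545 and p_30 = 0.798292.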
Of course, you can also access the probit estimator from the pull-down menus using Model>Nonlinear models>Probit>Binary. The dialog box (Figure 16.1) looks very similar to the one for linear regression, except it gives you some additional options, e.g., to view the details of the iterations.
Several other statistics are computed. They include a measure of fit (McFadden’s pseudo-R²), the value of the standard normal pdf φ(βᵀx) evaluated at the means of the independent variables, and a test statistic for the null hypothesis that the coefficients on the independent variables (but not the constant) are jointly zero; this corresponds to the overall F-statistic of regression significance in chapter 6.
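The likelihood ratio statistic reported above is tied to McFadden’s pseudo-R². A short Python sketch using the numbers from the output (the restricted, constant-only log-likelihood is backed out from the pseudo-R² definition here rather than re-estimated):

```python
# values reported in the gretl output above
LL_ur = -6.165158       # log-likelihood of the fitted probit
mcfadden_r2 = 0.575761  # McFadden's pseudo-R2 = 1 - LL_ur/LL_0

# back out the restricted (constant-only) log-likelihood
LL_0 = LL_ur / (1.0 - mcfadden_r2)

# LR statistic for H0: all slope coefficients are jointly zero
LR = 2.0 * (LL_ur - LL_0)
print(LR)
```

The result matches the reported χ²(1) = 16.734, which far exceeds the 5% critical value of 3.84, so the null that dtime has no effect is soundly rejected.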