Qualitative and Limited Dependent Variable Models

16.1 Probit

There are many things in economics that cannot be meaningfully quantified. How you vote in an election, whether you go to graduate school, whether you work for pay, or what college major you choose has no natural way of being quantified. Each of these expresses a quality or condition that you possess. Models of how these decisions are determined by other variables are called qualitative choice or qualitative variable models.

In a binary choice model, the decision you wish to model has only two possible outcomes. You assign artificial numbers to each outcome so that you can do further analysis. In a binary choice model it is conventional to assign ‘1’ to the variable if it possesses a particular quality or if a condition exists and ‘0’ otherwise. Thus, your dependent variable is

$$y_i = \begin{cases} 1 & \text{if individual } i \text{ has the quality} \\ 0 & \text{if not.} \end{cases}$$

The probit statistical model expresses the probability $p$ that $y_i = 1$ as a function of your independent variables.

$$P[(y_i \mid x_{i2}, x_{i3}) = 1] = \Phi(\beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3}) \qquad (16.1)$$

where $\Phi$ is the standard normal cumulative distribution function (cdf). The argument inside $\Phi$ is linear in the parameters and is called the index function. $\Phi$ maps values of the index function to the interval from 0 to 1. Estimating this model by maximum likelihood is straightforward because the probit MLE is already programmed into gretl.
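To see how the standard normal cdf squeezes any value of the index into the unit interval, the short gretl sketch below evaluates cnorm at a handful of index values. The values stored in z are arbitrary illustrations, not estimates from any model, and the matrix names are made up for this example.

```hansl
# Illustrative index values (arbitrary, chosen only to span a wide range)
matrix z = {-3; -1; 0; 1; 3}
# cnorm() is the standard normal cdf; applied to a matrix it works element by element
matrix p = cnorm(z)
# Put the index values and the implied probabilities side by side
matrix zp = z ~ p
print zp
```

Every entry in the second column lies strictly between 0 and 1, and an index of 0 maps to a probability of 0.5.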

The syntax for a script is the same as for linear regression, except that you use the probit command in place of ols. The following script estimates how the difference in travel time between bus and auto affects the probability of driving a car. The dependent variable (auto) is equal to 1 if travel is by car, and dtime is the difference in travel time (bus time minus auto time).

    open "@gretldir\data\poe\transport.gdt"
    probit auto const dtime
    scalar i_20 = $coeff(const)+$coeff(dtime)*20
    scalar p_30 = cnorm($coeff(const)+$coeff(dtime)*30)

The third line computes the predicted value of the index $(\beta_1 + \beta_2\,dtime)$ when dtime = 20, using the estimates from the probit MLE. The last line computes the estimated probability of driving, given that it takes 30 minutes longer to ride the bus. This computation requires cnorm, which evaluates the standard normal cdf, $\Phi(\cdot)$.

The results are:

Probit estimates using the 21 observations 1-21
Dependent variable: auto

             Coefficient    Std. Error    z-stat     Slope*
  const      -0.0644342     0.399244      -0.1614
  dtime       0.0299989     0.0102867      2.9163    0.0119068

  Mean dependent var    0.476190    S.D. dependent var    0.396907
  McFadden R²           0.575761    Adjusted R²           0.438136
  Log-likelihood       -6.165158    Akaike criterion      16.33032
  Schwarz criterion     18.41936    Hannan-Quinn          16.78369

  * Evaluated at the mean

Number of cases ‘correctly predicted’ = 19 (90.5 percent)
Likelihood ratio test: χ²(1) = 16.734 [0.0000]

i_20 = 0.535545
p_30 = 0.798292
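The Slope column reports the marginal effect of dtime on the probability of driving, evaluated at the sample mean of dtime. In a probit model this effect is the standard normal pdf at the index, multiplied by the coefficient. A minimal sketch of that calculation follows. It assumes the same data file and model as the script above (the file path may differ on your installation), and the scalar names me_mean and me_20 are made up for this illustration. dnorm is gretl's standard normal density function.

```hansl
# Assumed path: same data file as the script above
open "@gretldir\data\poe\transport.gdt"
probit auto const dtime --quiet

# Marginal effect of dtime at its sample mean: pdf(index) * coefficient
scalar me_mean = dnorm($coeff(const) + $coeff(dtime)*mean(dtime)) * $coeff(dtime)

# Same calculation at dtime = 20 (an illustrative evaluation point)
scalar me_20 = dnorm($coeff(const) + $coeff(dtime)*20) * $coeff(dtime)

printf "Marginal effect at mean dtime: %.7f\n", me_mean
printf "Marginal effect at dtime = 20: %.7f\n", me_20
```

If the estimates match those reported above, me_mean should agree with the Slope entry for dtime. Because the pdf changes with the index, the marginal effect at dtime = 20 will generally differ from the one at the mean.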

Of course, you can also access the probit estimator from the pull-down menus using Model>Nonlinear models>Probit>Binary. The dialog box (Figure 16.1) looks very similar to the one for linear regression, except that it gives you some additional options, e.g., to view the details of the iterations.

Figure 16.1: Use Model>Nonlinear models>Probit>Binary to open the Probit model’s dialog box.

Several other statistics are computed. They include a measure of fit (McFadden’s pseudo-R²), the value of the standard normal pdf $\phi(\beta^{T}x)$ at the mean of the independent variables, and a test statistic for the null hypothesis that the coefficients on the independent variables (but not the constant) are jointly zero; this corresponds to the overall F-statistic of regression significance in chapter 6.
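Both the likelihood ratio statistic and McFadden's pseudo-R² can be rebuilt from two log-likelihoods: that of the estimated model and that of a model containing only a constant. The sketch below is one way to do this by hand. It assumes the same data file as before and uses the closed-form log-likelihood of a constant-only binary model; the scalar names are hypothetical, chosen for this example.

```hansl
# Assumed path: same data file as the script above
open "@gretldir\data\poe\transport.gdt"
probit auto const dtime --quiet

scalar lnl_u = $lnl           # log-likelihood of the estimated model
scalar n = $nobs              # number of observations
scalar pbar = mean(auto)      # share of observations with auto = 1

# Constant-only model: the fitted probability is pbar for every observation
scalar lnl_r = n * (pbar*log(pbar) + (1 - pbar)*log(1 - pbar))

scalar LR = 2 * (lnl_u - lnl_r)      # likelihood ratio statistic
scalar R2_mcf = 1 - lnl_u/lnl_r      # McFadden's pseudo-R-squared
scalar pv = pvalue(X, 1, LR)         # chi-square p-value, 1 restriction

printf "LR = %.3f, p-value = %.4f\n", LR, pv
printf "McFadden R-squared = %.6f\n", R2_mcf
```

With the log-likelihood reported above, LR and R2_mcf should come out close to the likelihood ratio test and McFadden R² values in the output. The single degree of freedom reflects the one slope coefficient being restricted to zero.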
