# Binary Model

We formally define the univariate binary model by

(13.5.1) $P(y_i = 1) = F(x_i'\beta), \quad i = 1, 2, \ldots, n,$

where we assume that $y_i$ takes the values 1 or 0, $F$ is a known distribution function, $x_i$ is a known nonstochastic vector, and $\beta$ is a vector of unknown parameters.

If, for example, we apply the model to study whether or not a person buys a car in a given year, $y_i = 1$ represents the fact that the $i$th person buys a car, and the vector $x_i$ will include, among other factors, the price of the car and the person's income. As in a regression model, however, the $x_i$ need not be the original variables such as price and income; they could be functions of the original variables. The assumption that $y_i$ takes the values 1 or 0 is made for mathematical convenience. The essential features of the model are unchanged if we choose any other pair of distinct real numbers.

Model (13.5.1) can be derived from the principle of utility maximization as follows. Let $U_{1i}$ and $U_{0i}$ be the $i$th person's utilities associated with the alternatives 1 and 0, respectively. We assume that

(13.5.2) $U_{1i} = x_{1i}'\beta + u_{1i}$ and $U_{0i} = x_{0i}'\beta + u_{0i},$

where $x_{1i}$ and $x_{0i}$ are nonstochastic and known, and $(u_{1i}, u_{0i})$ are bivariate i.i.d. random variables, which may be regarded as the omitted independent variables known to the decision maker but unobservable to the statistician. We assume that the $i$th person chooses alternative 1 if and only if $U_{1i} > U_{0i}$. Thus we have

(13.5.3) $P(y_i = 1) = P(U_{1i} > U_{0i}) = P[u_{0i} - u_{1i} < (x_{1i} - x_{0i})'\beta].$
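The step from utilities to a choice probability can be checked by simulation. The sketch below is only an illustration under stated assumptions: the two error terms are drawn as independent standard normal variables, so their difference is normal with variance 2 and the implied probability is the standard normal distribution function evaluated at the index difference divided by $\sqrt{2}$. The value `index_diff = 0.8`, standing for $(x_{1i} - x_{0i})'\beta$, and the function names are hypothetical.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_choice_frequency(index_diff, n_draws=100_000, seed=0):
    """Share of simulated decision makers who choose alternative 1.

    index_diff stands for (x_1i - x_0i)'beta. The errors u_1i and u_0i are
    ASSUMED independent standard normal here, purely for illustration.
    """
    rng = random.Random(seed)
    chosen = 0
    for _ in range(n_draws):
        u1 = rng.gauss(0.0, 1.0)
        u0 = rng.gauss(0.0, 1.0)
        # U_1i > U_0i holds exactly when u_0i - u_1i < (x_1i - x_0i)'beta.
        if u0 - u1 < index_diff:
            chosen += 1
    return chosen / n_draws

index_diff = 0.8  # hypothetical value of (x_1i - x_0i)'beta
simulated = simulate_choice_frequency(index_diff)
# u_0i - u_1i is N(0, 2) under the assumption above, hence the sqrt(2) scaling.
implied = std_normal_cdf(index_diff / math.sqrt(2.0))
print("simulated frequency:", simulated)
print("implied probability:", implied)
```

The factor $\sqrt{2}$ is a consequence of the error distribution assumed in this sketch. In the model itself such a scale factor is absorbed into $\beta$ once a distribution function is fixed for the difference of the errors.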

We obtain model (13.5.1) if we assume that the distribution function of $u_{0i} - u_{1i}$ is $F$ and define $x_i = x_{1i} - x_{0i}$.

The following two distribution functions are most frequently used: standard normal $\Phi$ and logistic $\Lambda$. The standard normal distribution function (see Section 5.2) is defined by

(13.5.4) $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt,$

and the logistic distribution function is defined by

(13.5.5) $\Lambda(x) = \frac{e^x}{1 + e^x}.$

When $\Phi$ is used in (13.5.1), the model is called probit; when $\Lambda$ is used, the model is called logit. The two distribution functions have similar shapes, except that the logistic has a slightly fatter tail than the standard normal.
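A short numerical comparison makes the remark about the tails concrete. This is a sketch rather than part of the original exposition: it rescales the logistic distribution to unit variance (the standard logistic has variance $\pi^2/3$) so that the two distributions are compared on a common scale. The helper names are illustrative.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logistic_cdf(x):
    """Logistic distribution function, e^x / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(-x))

# The standard logistic has variance pi^2 / 3, so multiplying the argument
# by pi / sqrt(3) yields a logistic distribution function with unit variance.
SCALE = math.pi / math.sqrt(3.0)

def unit_variance_logistic_cdf(x):
    return logistic_cdf(SCALE * x)

for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    normal_tail = 1.0 - std_normal_cdf(x)
    logistic_tail = 1.0 - unit_variance_logistic_cdf(x)
    print(f"x = {x:.1f}  normal tail = {normal_tail:.6f}  logistic tail = {logistic_tail:.6f}")
```

Both functions equal 0.5 at zero, and far from zero the logistic upper-tail probability exceeds the normal one, which is the sense in which the logistic tail is fatter.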

To the extent that the econometrician experiments with various transformations of the original independent variables, as he normally would in a regression model, the choice of $F$ is not crucial. To see this, suppose that the true distribution function is $G$, but the econometrician assumed it to be $F$. Then, by choosing a function $h(\cdot)$ appropriately, he can always satisfy $G(x'\beta) = F[h(x'\beta)]$.

It is important to remember that in model (13.5.1) the regression coefficients $\beta$ do not have any intrinsic meaning. The important quantity is, rather, the vector $\partial F/\partial x_i$. If one researcher fits a given set of data using a probit model and another researcher fits the same data using a logit model, it would be meaningless to compare the two estimates of $\beta$. We must instead compare $\partial\Phi/\partial x_i$ with $\partial\Lambda/\partial x_i$. In most cases these two derivatives will take very similar values.
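The comparison of derivatives can be sketched numerically. By the chain rule, $\partial F(x_i'\beta)/\partial x_i = f(x_i'\beta)\beta$, where $f$ is the density of $F$; for the logistic case the density is $\Lambda(1 - \Lambda)$. The coefficient values below are hypothetical and are not estimates from any data. The logit coefficients are set to 1.6 times the probit ones, a commonly quoted rule of thumb that is adopted here only as an assumption for illustration.

```python
import math

def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logistic_cdf(z):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def index(x, beta):
    """The linear index x'beta."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def probit_derivative(x, beta):
    """Vector dPhi(x'beta)/dx = phi(x'beta) * beta."""
    z = index(x, beta)
    return [std_normal_pdf(z) * bj for bj in beta]

def logit_derivative(x, beta):
    """Vector dLambda(x'beta)/dx = Lambda(x'beta) * (1 - Lambda(x'beta)) * beta."""
    p = logistic_cdf(index(x, beta))
    return [p * (1.0 - p) * bj for bj in beta]

# Hypothetical inputs: x includes a constant as its first element, so only
# the second entry of each derivative vector is a genuine marginal effect.
x = [1.0, 1.2]
beta_probit = [-0.25, 0.5]                   # hypothetical probit coefficients
beta_logit = [1.6 * b for b in beta_probit]  # ASSUMED rule-of-thumb rescaling

print("coefficients:", beta_probit, beta_logit)
print("derivatives: ", probit_derivative(x, beta_probit), logit_derivative(x, beta_logit))
```

The coefficient vectors differ by a factor of 1.6, while the derivative vectors printed on the second line are close to each other, which is why the derivatives rather than the coefficients are the appropriate basis for comparison.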

The best way to estimate model (13.5.1) is by the maximum likelihood method. The likelihood function of the model is given by

(13.5.6) $L = \prod_{i=1}^{n} F(x_i'\beta)^{y_i} [1 - F(x_i'\beta)]^{1 - y_i}.$

When $F$ is either $\Phi$ or $\Lambda$, the logarithm of the likelihood function is globally concave in $\beta$. Therefore, maximizing $L$ with respect to $\beta$ by any standard iterative method such as the Newton-Raphson method (see Section 7.3.3) is straightforward. Although we do not have an i.i.d. sample here because $x_i$ varies with $i$, we can prove the consistency and the asymptotic normality of the maximum likelihood estimator by an argument similar to the one presented in Sections 7.4.2 and 7.4.3. The asymptotic distribution of the maximum likelihood estimator $\hat{\beta}$ is given by

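(13.5.7) $\sqrt{n}(\hat{\beta} - \beta) \to N(0, A^{-1}),$

where

$A = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \frac{f(x_i'\beta)^2}{F(x_i'\beta)[1 - F(x_i'\beta)]} x_i x_i'$

and $f$ is the density function of $F$.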
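The estimation procedure can be sketched for the logit case, using only the Python standard library. This is a minimal illustration, not a general implementation: it assumes $x_i = (1, x_i)'$ with a single regressor, so the $2 \times 2$ linear system in each Newton-Raphson step can be solved in closed form. For the logit model the score is $\sum_i (y_i - \Lambda_i) x_i$ and the negative Hessian of the log-likelihood is $\sum_i \Lambda_i (1 - \Lambda_i) x_i x_i'$, whose inverse at the estimate serves as an estimate of $A^{-1}/n$. The data are simulated, and the true values $(-0.5, 1.0)$ and all function names are hypothetical.

```python
import math
import random

def logistic_cdf(z):
    """Logistic distribution function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score_and_information(b0, b1, xs, ys):
    """Score vector and negative Hessian of the logit log-likelihood."""
    g0 = g1 = 0.0
    i00 = i01 = i11 = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(b0 + b1 * x)
        w = p * (1.0 - p)          # logistic density at the index
        g0 += y - p
        g1 += (y - p) * x
        i00 += w
        i01 += w * x
        i11 += w * x * x
    return (g0, g1), (i00, i01, i11)

def fit_logit_newton(xs, ys, tol=1e-10, max_iter=50):
    """Newton-Raphson for a logit model with a constant and one regressor."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, xs, ys)
        det = i00 * i11 - i01 * i01
        # Newton step: (negative Hessian)^{-1} times the score, 2x2 closed form.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Inverse information at the estimate: an estimate of A^{-1} / n.
    _, (i00, i01, i11) = score_and_information(b0, b1, xs, ys)
    det = i00 * i11 - i01 * i01
    cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
    return (b0, b1), cov

def simulate_logit_data(n, b0, b1, seed=0):
    """Simulated data; the uniform regressor is an arbitrary illustrative choice."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    ys = [1 if rng.random() < logistic_cdf(b0 + b1 * x) else 0 for x in xs]
    return xs, ys

xs, ys = simulate_logit_data(5000, -0.5, 1.0)  # hypothetical true parameters
(b0_hat, b1_hat), cov = fit_logit_newton(xs, ys)
print("estimates:", b0_hat, b1_hat)
print("standard errors:", math.sqrt(cov[0][0]), math.sqrt(cov[1][1]))
```

Because the log-likelihood is globally concave, the iteration started from zero typically converges in a handful of steps, and the square roots of the diagonal of `cov` give approximate standard errors for the two coefficients.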