# Panel Data with Qualitative Variables

Early reviews of models of panel data with qualitative variables are in Heckman (1981) and Chamberlain (1980, 1984). This work in the early 1980s is reviewed in Maddala (1987), which discusses fixed effects logit and probit models, random effects probit models, autoregressive logit models, autoregressive probit models, and the problems of serial correlation and state dependence.

The main issue in panel data models with qualitative response variables is that the presence of individual effects (heterogeneity) complicates estimation. For instance, the fixed effects model for panel data does not give consistent estimates of the slope parameters. Chamberlain (1980) suggested a conditional ML approach in which the likelihood function of the model is conditioned on sufficient statistics for the incidental parameters $\alpha_i$ (i.e. the fixed effects), which in this case are $\sum_t y_{it}$. The method is illustrated in Maddala (1987) for the logit model. The probit model cannot be used with Chamberlain's method due to computational problems.
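As a minimal sketch of the conditional ML idea, consider the simplest case of a static fixed effects logit with two periods and a single regressor (both are assumptions made here for illustration, not details from the text). Conditioning on $y_{i1} + y_{i2} = 1$ leaves a likelihood that involves only the slope; individuals whose outcome never changes drop out. The function name and argument layout are hypothetical.

```python
import numpy as np

def conditional_logit_loglik_T2(beta, y, x):
    """Conditional log-likelihood of a static fixed effects logit with T = 2.

    y: (n, 2) array of 0/1 outcomes; x: (n, 2) array holding one scalar regressor.
    Only "switchers" (y_i1 + y_i2 = 1) contribute, and for them
    P(y_i2 = 1 | y_i1 + y_i2 = 1) = exp(beta * dx) / (1 + exp(beta * dx)),
    with dx = x_i2 - x_i1, which does not involve the individual effect.
    """
    y = np.asarray(y)
    x = np.asarray(x, dtype=float)
    switchers = y.sum(axis=1) == 1           # sufficient statistic equals 1
    dx = x[switchers, 1] - x[switchers, 0]   # within-individual change in x
    y2 = y[switchers, 1]
    z = beta * dx
    # log of exp(z)^y2 / (1 + exp(z)), written in a numerically stable form
    return float(np.sum(y2 * z - np.logaddexp(0.0, z)))
```

Maximizing this function over `beta` (for example with a generic scalar optimizer) would give the conditional ML estimate in this two-period special case.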

In contrast, when using the random effects approach, the probit model is about the only choice, since on the multivariate logistic distribution the correlations between the error term and the cross section units are all constrained to be 1/2 (see Johnson and Kotz, 1972, pp. 293-4), which is implausible to assume in most applications. The random effects probit model implies serial correlation of a specific nature, the equicorrelation model, for which there is an efficient ML algorithm due to Butler and Moffitt (1982). For the case of a general type of serial correlation, the Avery, Hansen, and Hotz (1983) method of GMM was discussed in Maddala (1987).
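The Butler and Moffitt approach is usually described as evaluating the one-dimensional integral over the individual effect by Gauss-Hermite quadrature. The sketch below computes one individual's likelihood contribution that way. It assumes a scalar regressor, a normally distributed individual effect with standard deviation `sigma_u`, and an idiosyncratic error variance normalized to one; the function names are hypothetical and the code is an illustration rather than the authors' implementation.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    z = np.asarray(z, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def re_probit_likelihood_i(beta, sigma_u, y_i, x_i, n_nodes=20):
    """Likelihood contribution of one individual in a random effects probit.

    y_i: (T,) array of 0/1 outcomes; x_i: (T,) array with one scalar regressor.
    The integral over the individual effect u ~ N(0, sigma_u^2) is approximated
    with n_nodes Gauss-Hermite nodes after the substitution u = sqrt(2)*sigma_u*z.
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    y_i = np.asarray(y_i)
    x_i = np.asarray(x_i, dtype=float)
    q = 2 * y_i - 1                                   # +1 if y = 1, -1 if y = 0
    # Index at every (period, node) pair: shape (T, n_nodes)
    index = beta * x_i[:, None] + sigma_u * math.sqrt(2.0) * nodes[None, :]
    probs = norm_cdf(q[:, None] * index)
    # Product over periods at each node, then weighted sum over nodes
    return float(np.sum(weights * probs.prod(axis=0)) / math.sqrt(math.pi))
```

The sample log-likelihood would be the sum of the logs of these contributions over individuals. The number of nodes is a tuning choice; more nodes may be needed when the individual effect accounts for a large share of the error variance.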

An additional problem in panel data models with qualitative response variables occurs if there is "state dependence", which can be interpreted as a situation in which an individual’s past state $y_{i,t-1}$ helps predict his or her current state $y_{it}$, after allowing for individual effects.

The problem of discrete choice panel data models with lagged dependent variables was first discussed in Chamberlain (1985). The model considered combines heterogeneity and state dependence. Heterogeneity is captured by the inclusion of individual specific effects $\alpha_i$, and state dependence is captured by the inclusion of the lagged dependent variable $y_{i,t-1}$. The model considered by Chamberlain is:

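$$
P(y_{i0} = 1 \mid \alpha_i) = P_0(\alpha_i)
$$

$$
P(y_{it} = 1 \mid \alpha_i, y_{i0}, y_{i1}, \ldots, y_{i,t-1}) = \frac{\exp(\alpha_i + \gamma y_{i,t-1})}{1 + \exp(\alpha_i + \gamma y_{i,t-1})}, \qquad t = 1, 2, \ldots, T, \quad T \geq 3. \tag{17.13}
$$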

The procedure used by Chamberlain (1985) is to derive a set of probabilities that do not depend on the individual effects. Let T = 3 and consider the events:

$$
A = \{y_{i0} = d_0,\; y_{i1} = 0,\; y_{i2} = 1,\; y_{i3} = d_3\}
$$

$$
B = \{y_{i0} = d_0,\; y_{i1} = 1,\; y_{i2} = 0,\; y_{i3} = d_3\} \tag{17.14}
$$

where $d_0$ and $d_3$ are either 0 or 1. Then the conditional probabilities $P(A \mid \alpha_i, A \cup B)$ and $P(B \mid \alpha_i, A \cup B)$ will not depend on $\alpha_i$. However, this method will not work in the presence of explanatory variables.
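A short worked derivation makes the cancellation explicit. It is a sketch that only multiplies out the transition probabilities of the logit model with state dependence and no explanatory variables; $\gamma$ denotes the coefficient on the lagged dependent variable.

$$
P(A \mid \alpha_i) = P(y_{i0} = d_0 \mid \alpha_i) \cdot \frac{1}{1 + e^{\alpha_i + \gamma d_0}} \cdot \frac{e^{\alpha_i}}{1 + e^{\alpha_i}} \cdot \frac{e^{d_3(\alpha_i + \gamma)}}{1 + e^{\alpha_i + \gamma}}
$$

$$
P(B \mid \alpha_i) = P(y_{i0} = d_0 \mid \alpha_i) \cdot \frac{e^{\alpha_i + \gamma d_0}}{1 + e^{\alpha_i + \gamma d_0}} \cdot \frac{1}{1 + e^{\alpha_i + \gamma}} \cdot \frac{e^{d_3 \alpha_i}}{1 + e^{\alpha_i}}
$$

The two expressions share the same denominators and the same initial-condition factor, so their ratio is

$$
\frac{P(B \mid \alpha_i)}{P(A \mid \alpha_i)} = \frac{e^{\alpha_i + \gamma d_0 + d_3 \alpha_i}}{e^{\alpha_i + d_3 \alpha_i + d_3 \gamma}} = e^{\gamma (d_0 - d_3)},
\qquad\text{hence}\qquad
P(B \mid \alpha_i, A \cup B) = \frac{e^{\gamma (d_0 - d_3)}}{1 + e^{\gamma (d_0 - d_3)}},
$$

which is free of $\alpha_i$.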

Honore and Kyriazidou (1998), to be referred to as H-K, consider a more general model with exogenous explanatory variables. Their model is:

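$$
P(y_{i0} = 1 \mid x_i, \alpha_i) = P_0(x_i, \alpha_i)
$$

$$
P(y_{it} = 1 \mid x_i, \alpha_i, y_{i0}, y_{i1}, \ldots, y_{i,t-1}) = \frac{\exp(x_{it}'\beta + \gamma y_{i,t-1} + \alpha_i)}{1 + \exp(x_{it}'\beta + \gamma y_{i,t-1} + \alpha_i)} \tag{17.15}
$$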

H-K observe that if, say, $x_{i2} = x_{i3}$, then the conditional probabilities of the events $A$ and $B$ given $A \cup B$ do not depend on $\alpha_i$. In practice, requiring $x_{i2} = x_{i3}$ is not a reasonable assumption. In the special case where all explanatory variables are discrete and $P(x_{i2} = x_{i3}) > 0$, one may estimate $\beta$ and $\gamma$ by maximizing with respect to $b$ and $g$ the weighted likelihood function:

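$$
\sum_{i=1}^{n} I(y_{i1} + y_{i2} = 1)\, I(x_{i2} - x_{i3} = 0)\, \ln\!\left( \frac{\exp\{(x_{i1} - x_{i2})b + g(y_{i0} - y_{i3})\}^{y_{i1}}}{1 + \exp\{(x_{i1} - x_{i2})b + g(y_{i0} - y_{i3})\}} \right) \tag{17.16}
$$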

where I() are indicator functions. The resulting estimator will have all the usual properties (consistency and root-n asymptotic normality).
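The role of the restriction that the covariate is equal in periods 2 and 3 can be checked numerically. The sketch below assumes a single scalar covariate and $T = 3$; the function names are hypothetical. It multiplies out the transition probabilities of the dynamic logit for the two paths $A$ and $B$ and returns $P(B \mid A \cup B)$ for a given value of the individual effect. The initial-condition factor is omitted because it is common to both paths and cancels.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def path_prob(alpha, beta, gamma, x, path):
    """P(y_1, y_2, y_3 | y_0, x, alpha) in the dynamic logit.

    x = (x1, x2, x3) is a scalar covariate by period; path = (y0, y1, y2, y3).
    """
    prob = 1.0
    for t in (1, 2, 3):
        p_one = logistic(x[t - 1] * beta + gamma * path[t - 1] + alpha)
        prob *= p_one if path[t] == 1 else 1.0 - p_one
    return prob

def prob_B_given_A_or_B(alpha, beta, gamma, x, d0, d3):
    """P(B | A or B) with A = (d0, 0, 1, d3) and B = (d0, 1, 0, d3)."""
    p_a = path_prob(alpha, beta, gamma, x, (d0, 0, 1, d3))
    p_b = path_prob(alpha, beta, gamma, x, (d0, 1, 0, d3))
    return p_b / (p_a + p_b)
```

With `x = (0.5, 0.0, 0.0)` the returned value is the same for every `alpha` and equals the logistic function of `(x1 - x2) * beta + gamma * (d0 - d3)`; with `x = (0.5, 0.0, 1.0)` it changes with `alpha`, which is why the estimator restricts attention to observations whose covariates agree in the last two periods.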

In the case where the $x$s are not discrete, H-K suggest replacing the indicator function $I(x_{i2} - x_{i3} = 0)$ by a weight function $k[(x_{i2} - x_{i3})/\sigma_n]$, with weights depending inversely on the magnitude of $x_{i2} - x_{i3}$, giving more weight to observations for which $x_{i2}$ is "close" to $x_{i3}$. Here $\sigma_n$ is a bandwidth that shrinks as $n$ increases and $k(\cdot)$ is the kernel, chosen so that $k(v) \to 0$ as $|v| \to \infty$.
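A kernel-weighted version of the objective in (17.16) might be coded as follows. This is a sketch under stated assumptions: a single scalar covariate, $T = 3$, and a Gaussian-shaped kernel chosen purely for illustration. The function names and array layout are hypothetical, and the bandwidth is left to the user.

```python
import numpy as np

def gaussian_kernel(v):
    """Illustrative kernel: equals 1 at v = 0 and tends to 0 as |v| grows."""
    return np.exp(-0.5 * np.asarray(v, dtype=float) ** 2)

def hk_objective(b, g, y, x, bandwidth, kernel=gaussian_kernel):
    """Kernel-weighted conditional log-likelihood for T = 3.

    y: (n, 4) array with columns y_i0, y_i1, y_i2, y_i3 (0/1 values).
    x: (n, 3) array with columns x_i1, x_i2, x_i3 (one scalar covariate).
    """
    y = np.asarray(y)
    x = np.asarray(x, dtype=float)
    switch = (y[:, 1] + y[:, 2] == 1).astype(float)      # I(y_i1 + y_i2 = 1)
    weight = kernel((x[:, 1] - x[:, 2]) / bandwidth)      # replaces I(x_i2 = x_i3)
    z = (x[:, 0] - x[:, 1]) * b + g * (y[:, 0] - y[:, 3])
    # ln( exp(z)^{y_i1} / (1 + exp(z)) ), in a numerically stable form
    log_term = y[:, 1] * z - np.logaddexp(0.0, z)
    return float(np.sum(switch * weight * log_term))
```

The estimates would be obtained by maximizing `hk_objective` over `(b, g)` with a generic numerical optimizer. Passing a kernel that returns 1 when its argument is exactly zero and 0 otherwise recovers the indicator-weighted objective for discrete covariates.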

H-K show that the resulting estimators are consistent and asymptotically normal, although their rate of convergence will be slower than $\sqrt{n}$ and will depend on the number of covariates in $x_{it}$. Furthermore, for identification, their model requires that $x_{i2} - x_{i3}$ be continuously distributed with support in a neighborhood of 0, and that $x_{i1} - x_{i2}$ have sufficient variation conditional on the event $x_{i2} - x_{i3} = 0$. H-K also extend their model to the general case of $M$ alternatives.
