Multistate Models with Exogenous Variables
Theoretically, not much need be said about this model beyond what we have discussed in Section 11.1.1 for the general case and in Section 11.1.3 for the two-state case. The likelihood function can be derived from (11.1.4) by specifying $P_{jk}(t)$ as a function of exogenous variables and parameters. The equivalence of the NLWLS to the method-of-scoring iteration was discussed for the general case in Section 11.1.1, and the minimum chi-square estimator defined for the two-state case in Section 11.1.3 can be straightforwardly generalized to the multistate case. Therefore it should be sufficient to discuss an empirical article by Toikka (1976) as an illustration of the NLWLS (which in his case is a linear WLS estimator because of his linear probability specification).
Toikka’s model is a three-state Markov model of labor market decisions in which the three states (corresponding to $j = 1$, 2, and 3) are the state of being employed, the state of being in the labor force (actively looking for a job) but unemployed, and the state of being out of the labor force.
The exogenous variables used by Toikka consist of average (over individuals) income, average wage rate, and seasonal dummies, all of which depend on time (months) but not on individuals. Thus Toikka’s model is a homogeneous and nonstationary Markov model. Moreover, Toikka assumed that the transition probabilities depend linearly on the exogenous variables.² Thus, in his model, Eq. (11.1.7) can be written as
$$\mathbf{y}^i(t)' = [\mathbf{y}^i(t-1)' \otimes \mathbf{x}_t']\,\Pi + \mathbf{u}^i(t)', \qquad (11.1.67)$$
which is a multivariate heteroscedastic linear regression equation. As we indicated in Section 11.1.1, the generalized least squares estimator of $\Pi$ is asymptotically efficient.
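As a concrete illustration, the linear-probability regression above can be estimated by ordinary least squares after stacking one row per individual-period. This is a minimal simulation sketch, not Toikka's actual data or specification: the dimensions, the trend regressor, and the `base`/`drift` matrices below are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, N, T = 3, 2, 500, 40   # states, exogenous variables, individuals, periods

# x_t = (1, t/(T-1))': an intercept plus a trend standing in for time-varying
# (but individual-invariant) exogenous variables.
x = np.column_stack([np.ones(T), np.linspace(0.0, 1.0, T)])

base = np.array([[0.6, 0.3, 0.1],
                 [0.2, 0.5, 0.3],
                 [0.1, 0.3, 0.6]])
drift = np.array([[-0.2, 0.1, 0.1],
                  [ 0.1, -0.2, 0.1],
                  [ 0.1, 0.1, -0.2]])

def P_of_t(t):
    # Transition matrix linear in x_t; each row still sums to one by construction.
    return base + x[t, 1] * drift

# Simulate, stacking rows of the regression y_i(t)' = [y_i(t-1)' kron x_t'] Pi + u_i(t)'.
state = rng.integers(0, M, size=N)
eye = np.eye(M)
X_rows, Y_rows = [], []
for t in range(1, T):
    P = P_of_t(t)
    new_state = np.array([rng.choice(M, p=P[s]) for s in state])
    for i in range(N):
        X_rows.append(np.kron(eye[state[i]], x[t]))   # y_i(t-1)' kron x_t'
        Y_rows.append(eye[new_state[i]])              # indicator vector y_i(t)'
    state = new_state

Pi_hat, *_ = np.linalg.lstsq(np.array(X_rows), np.array(Y_rows), rcond=None)

def P_hat_of_t(t):
    # Implied estimate of P(t): entry (j, k) is x_t' times the (j, k) block of Pi_hat.
    return np.array([[x[t] @ Pi_hat[j * K:(j + 1) * K, k] for k in range(M)]
                     for j in range(M)])
```

The sketch uses plain LS rather than the asymptotically efficient GLS; with this many observations the fitted transition probabilities are nevertheless close to the true ones.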
Let $Y_t$ be the $N \times M$ matrix whose $i$th row is $\mathbf{y}^i(t)'$, and let $\bar{Y}_t$ be the $N \times (M-1)$ matrix consisting of the first $M-1$ columns of $Y_t$. Define $\bar{U}_t$ similarly. Then we can write (11.1.67) as
$$\bar{Y}_t = (Y_{t-1} \otimes \mathbf{x}_t')\,\Pi + \bar{U}_t. \qquad (11.1.68)$$
To define the FGLS estimator of $\Pi$, stack (11.1.68) over $t = 1, 2, \ldots, T$ as $\bar{Y} = X\Pi + \bar{U}$, where $X$ is the $NT \times MK$ matrix obtained by stacking the matrices $Y_{t-1} \otimes \mathbf{x}_t'$, and write the columns of $\bar{Y}$, $\Pi$, and $\bar{U}$ explicitly as $\bar{Y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_G]$, $\Pi = [\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_G]$, and $\bar{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_G]$, where $G = M - 1$. Also define $\mathbf{y} = (\mathbf{y}_1', \mathbf{y}_2', \ldots, \mathbf{y}_G')'$, $\boldsymbol{\pi} = (\boldsymbol{\pi}_1', \boldsymbol{\pi}_2', \ldots, \boldsymbol{\pi}_G')'$, and $\mathbf{u} = (\mathbf{u}_1', \mathbf{u}_2', \ldots, \mathbf{u}_G')'$. Then (11.1.68) can be written as
$$\mathbf{y} = \tilde{X}\boldsymbol{\pi} + \mathbf{u}, \qquad (11.1.70)$$
where $\tilde{X} = I_G \otimes X$. The FGLS estimator of $\boldsymbol{\pi}$ is $(\tilde{X}'\hat{\Omega}^{-1}\tilde{X})^{-1}\tilde{X}'\hat{\Omega}^{-1}\mathbf{y}$, where $\hat{\Omega}$ is a consistent estimator of $\Omega = E\mathbf{u}\mathbf{u}'$. Here, $\Omega$ has the following form:
$$\Omega = \begin{bmatrix} D_{11} & D_{12} & \cdots & D_{1G} \\ D_{21} & D_{22} & \cdots & D_{2G} \\ \vdots & \vdots & & \vdots \\ D_{G1} & D_{G2} & \cdots & D_{GG} \end{bmatrix}, \qquad (11.1.71)$$
where each $D_{gh}$ is a diagonal matrix of size $NT$. If each $D_{gh}$ were a constant times the identity matrix, (11.1.70) would be Zellner’s seemingly unrelated regression model (see Section 6.4), and therefore the LS estimator would be asymptotically efficient. In fact, however, the diagonal elements of $D_{gh}$ are not constant.
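To see why the diagonal elements of $D_{gh}$ vary across observations, note that conditional on $\mathbf{y}^i(t-1)$ the indicator vector $\mathbf{y}^i(t)$ is a single multinomial draw, so its covariance matrix depends on the state occupied at $t-1$ (and, in the nonstationary case, on $t$). A small numerical sketch with an invented transition matrix:

```python
import numpy as np

# Hypothetical 3-state transition matrix (rows indexed by the previous state).
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])

def cond_cov(prev_state):
    # Conditional mean of y_i(t) is p = P' y_i(t-1), i.e. row `prev_state` of P;
    # a one-draw multinomial then has covariance diag(p) - p p'.
    p = P[prev_state]
    return np.diag(p) - np.outer(p, p)

V0, V1 = cond_cov(0), cond_cov(1)
```

Since `V0 != V1`, the error variances change with the conditioning state, which is exactly why the $D_{gh}$ blocks are diagonal but not scalar multiples of the identity.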
Toikka’s estimator of $\Pi$, denoted $\hat{\Pi}$, is defined by regressing the period-by-period unconstrained MLE of the Markov matrix on $\mathbf{x}_t$. Because $(Y_{t-1}'Y_{t-1})^{-1}Y_{t-1}'\bar{Y}_t$ gives the first $M-1$ columns of the unconstrained MLE of the Markov matrix $P(t)$, Toikka’s estimator can be interpreted as the LS estimator in the regression of $\hat{P}(t)$ on $\mathbf{x}_t$. Although this idea may seem intuitively appealing, Toikka’s estimator is asymptotically neither more nor less efficient than the LS estimator of $\Pi$. Alternatively, Toikka’s estimator can be interpreted as premultiplying (11.1.68) by the block-diagonal matrix
$$\mathrm{diag}\bigl[\,(Y_0'Y_0)^{-1}Y_0',\;\; (Y_1'Y_1)^{-1}Y_1',\;\; \ldots,\;\; (Y_{T-1}'Y_{T-1})^{-1}Y_{T-1}'\,\bigr]$$
and then applying least squares. If, instead, generalized least squares were applied in the last stage, the resulting estimator of П would be identical with the GLS estimator of П derived from (11.1.68).
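The first-stage quantity $(Y_{t-1}'Y_{t-1})^{-1}Y_{t-1}'Y_t$ can be checked numerically to coincide with the familiar cell-frequency MLE, since $Y_{t-1}'Y_{t-1}$ is the diagonal matrix of state counts at $t-1$ and $Y_{t-1}'Y_t$ is the matrix of transition counts. A sketch for a single period with an invented transition matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 2000
P = np.array([[0.5, 0.3, 0.2],
              [0.3, 0.4, 0.3],
              [0.2, 0.2, 0.6]])

prev = rng.integers(0, M, size=N)
nxt = np.array([rng.choice(M, p=P[s]) for s in prev])
Y_prev, Y_next = np.eye(M)[prev], np.eye(M)[nxt]

# Least-squares projection (Y'_{t-1} Y_{t-1})^{-1} Y'_{t-1} Y_t ...
P_hat = np.linalg.solve(Y_prev.T @ Y_prev, Y_prev.T @ Y_next)

# ... equals the unconstrained MLE: transition counts over row totals.
counts = np.zeros((M, M))
for s, s2 in zip(prev, nxt):
    counts[s, s2] += 1.0
P_freq = counts / counts.sum(axis=1, keepdims=True)
```

Because $Y_{t-1}'Y_{t-1}$ is diagonal with the state counts, the projection reduces row by row to relative transition frequencies, which is why the two computations agree exactly.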
11.1.2 Estimation Using Aggregate Data
Up to now we have assumed that a complete history of each individual, $y_j^i(t)$ for every $i$, $t$, and $j$, is observed. In this subsection we shall assume that only the aggregate data $n_j(t) = \sum_{i=1}^{N} y_j^i(t)$ are available. We shall first discuss LS and GLS estimators, and then we shall discuss MLE briefly.
Suppose the Markov matrix $P^i(t)$ is constant across $i$, so that $P^i(t) = P(t)$. Summing both sides of (11.1.7) over $i$ yields
$$\sum_i \mathbf{y}^i(t) = P(t)'\sum_i \mathbf{y}^i(t-1) + \sum_i \mathbf{u}^i(t), \qquad (11.1.73)$$
where
$$\mathbf{u}^i(t) = \mathbf{y}^i(t) - P(t)'\mathbf{y}^i(t-1). \qquad (11.1.74)$$
Depending on whether $P(t)$ depends on unknown parameters linearly or nonlinearly, (11.1.73) defines a multivariate heteroscedastic linear or nonlinear regression model. The parameters can be estimated by either LS or GLS (NLLS or NLGLS in the nonlinear case), and it should be straightforward to prove their consistency and asymptotic normality as $NT$ goes to $\infty$.
The simplest case occurs when $P^i(t)$ is constant across both $i$ and $t$, so that $P^i(t) = P$. If, moreover, $P$ is unconstrained with $M(M-1)$ free parameters, (11.1.73) becomes a multivariate linear regression model. We can apply either an LS or a GLS method, the latter being asymptotically more efficient, as was explained in Section 11.1.4. See the article by Telser (1963) for an application of the LS estimator to an analysis of the market shares of three major cigarette brands in the period 1925-1943.
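In this constant-$P$ case the aggregate shares themselves obey the linear regression $n(t)'/N = [n(t-1)'/N]\,P + \text{error}$, so $P$ can be estimated by LS from share time series alone. A simulation sketch in the spirit of (but not reproducing) Telser's application; the transition matrix, sample sizes, and the device of three trajectories with different starting compositions are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, T = 3, 20000, 15
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.2, 0.1, 0.7]])

def simulate_shares(start_state):
    # One "market" of N individuals, all starting in start_state.
    state = np.full(N, start_state)
    shares = []
    for _ in range(T):
        shares.append(np.bincount(state, minlength=M) / N)
        # Vectorized transition draw by inverse CDF.
        u = rng.random(N)
        cum = np.cumsum(P[state], axis=1)
        cum[:, -1] = 1.0   # guard against floating-point drift in the last column
        state = (u[:, None] > cum).sum(axis=1)
    return np.array(shares)

# Stack lagged and current shares from three trajectories, then run LS.
lagged, current = [], []
for s0 in range(M):
    S = simulate_shares(s0)
    lagged.append(S[:-1])
    current.append(S[1:])
X, Y = np.vstack(lagged), np.vstack(current)
P_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Starting each trajectory from a different degenerate composition keeps the lagged shares well away from collinearity during the transient, which is what identifies $P$ here.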
If $P(t)$ varies with $t$, the ensuing model is generally nonlinear in the parameters, except in Toikka’s model. As we can see from (11.1.68), estimation on the basis of aggregate data is possible by LS or GLS in Toikka’s model, using the equation obtained by summing the rows of each $\bar{Y}_t$. A discussion of models in which the elements of $P(t)$ are nonlinear functions of exogenous variables and parameters can be found in an article by MacRae (1977). MacRae also discussed maximum likelihood estimation and estimation based on incomplete samples.
In the remainder of this subsection, we shall again consider a two-state Markov model (see Section 11.1.3). We shall present some of the results mentioned so far in more precise terms and shall derive the likelihood function.
Using the same notation as that given in Section 11.1.3, we define $r_t = \sum_{i=1}^{N} y_{it}$. The conditional mean and variance of $r_t$ given $r_{t-1}$ are given by
$$E(r_t\,|\,r_{t-1}) = P_t^1 r_{t-1} + P_t^0 (N - r_{t-1}) \qquad (11.1.75)$$
and
$$V(r_t\,|\,r_{t-1}) = \sum_{i=1}^{N} F_{it}(1 - F_{it}) = P_t^1(1 - P_t^1)\,r_{t-1} + P_t^0(1 - P_t^0)(N - r_{t-1}). \qquad (11.1.76)$$
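These two moments are easy to verify by simulation, because given $r_{t-1}$ the count $r_t$ is the sum of two independent binomials: "stayers" among the $r_{t-1}$ individuals in state 1 and entrants from the $N - r_{t-1}$ individuals in state 0. The probabilities and sizes below are invented for the check:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 60        # individuals
r_prev = 25   # number currently in state 1
P1, P0 = 0.8, 0.3   # invented: stay probability (1 -> 1) and entry probability (0 -> 1)

mean = P1 * r_prev + P0 * (N - r_prev)                       # (11.1.75)
var = P1 * (1 - P1) * r_prev + P0 * (1 - P0) * (N - r_prev)  # (11.1.76)

# Monte Carlo: r_t = Binomial(r_{t-1}, P1) + Binomial(N - r_{t-1}, P0).
draws = (rng.binomial(r_prev, P1, size=200_000)
         + rng.binomial(N - r_prev, P0, size=200_000))
```

The sample mean and variance of `draws` agree with the analytic formulas up to Monte Carlo error.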
The NLWLS estimator of $\gamma$ is defined as that which minimizes
$$\sum_{t=1}^{T} \frac{[\,r_t - P_t^1 r_{t-1} - P_t^0 (N - r_{t-1})\,]^2}{\hat{V}r_t}, \qquad (11.1.77)$$
where $\hat{V}r_t$ is obtained by estimating $P_t^1$ and $P_t^0$ in (11.1.76) by $F(\hat{\beta}'\mathbf{x}_t + \hat{\alpha}'\mathbf{x}_t)$ and $F(\hat{\beta}'\mathbf{x}_t)$, respectively, where $\hat{\alpha}$ and $\hat{\beta}$ are consistent estimates obtained, for example, by minimizing $\sum_{t=1}^{T} [\,r_t - P_t^1 r_{t-1} - P_t^0 (N - r_{t-1})\,]^2$. Alternatively we can minimize (11.1.78), which will asymptotically give the same estimator.³ Let $\hat{\gamma}$ be the estimator obtained by minimizing either (11.1.77) or (11.1.78).
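The two-step procedure can be sketched as follows with a logistic $F$. Everything here is invented for illustration (the covariates, parameter values, and sample sizes, as well as the choice of logistic $F$): a first-stage NLLS fit based on the conditional mean (11.1.75), followed by NLWLS with weights from the estimated conditional variance (11.1.76).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
F = lambda z: 1.0 / (1.0 + np.exp(-z))

T, N = 300, 2000
x = np.column_stack([np.ones(T), np.sin(np.arange(T) / 6.0)])
alpha_true = np.array([1.0, 0.5])
beta_true = np.array([-1.0, 0.4])

# P^1_t = F((alpha + beta)'x_t): probability of staying in state 1;
# P^0_t = F(beta'x_t): probability of entering state 1 from state 0.
P1 = F(x @ (alpha_true + beta_true))
P0 = F(x @ beta_true)

r = np.empty(T + 1, dtype=int)
r[0] = N // 2
for t in range(T):
    r[t + 1] = rng.binomial(r[t], P1[t]) + rng.binomial(N - r[t], P0[t])

def residuals(g):
    a, b = g[:2], g[2:]
    p1, p0 = F(x @ (a + b)), F(x @ b)
    return r[1:] - p1 * r[:-1] - p0 * (N - r[:-1]), p1, p0

def nlls_obj(g):
    e, _, _ = residuals(g)
    return np.sum(e ** 2)

g1 = minimize(nlls_obj, np.zeros(4), method="BFGS").x  # first stage: consistent NLLS

# Weights: the estimated conditional variance (11.1.76) at the first-stage estimates.
_, p1, p0 = residuals(g1)
w = p1 * (1 - p1) * r[:-1] + p0 * (1 - p0) * (N - r[:-1])

def nlwls_obj(g):
    e, _, _ = residuals(g)
    return np.sum(e ** 2 / w)

g2 = minimize(nlwls_obj, g1, method="BFGS").x  # NLWLS estimate of (alpha, beta)
```

Keeping the weights fixed at first-stage values, rather than re-evaluating them at each trial $\gamma$, is what makes this a weighted least squares problem with the same asymptotic distribution.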
The asymptotic variance-covariance matrix of $\hat{\gamma}$ can be shown to be larger (in the matrix sense) than that of the estimator given in (11.1.38), as follows: The inverse of the latter can also be written as
(11.1.80)
Put $z_{it} = [F_{it}(1 - F_{it})]^{-1/2}\,\partial F_{it}/\partial\gamma$ and $a_{it} = [F_{it}(1 - F_{it})]^{1/2}$. Then the desired inequality follows from
Finally, we can define the MLE, which maximizes the joint probability of $r_t$, $t = 1, 2, \ldots, T$. The joint probability can be shown to be
The maximization of (11.1.82) is probably too difficult to make this estimator of any practical value. Thus we must use the NLLS or NLWLS estimator if only aggregate observations are available, even though the MLE is asymptotically more efficient, as we can conclude from the study of Barankin and Gurland (1951).
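The computational burden can be seen already in the one-step transition probability of the aggregate count: it is a convolution of two binomials, and the full likelihood multiplies $T$ such sums. A sketch (the parameter values are invented):

```python
from math import comb

def trans_prob(r_next, r_prev, N, P1, P0):
    # P(r_t = r_next | r_{t-1} = r_prev): sum over k = number of "stayers"
    # among the r_prev individuals in state 1; the remaining r_next - k
    # entrants come from the N - r_prev individuals in state 0.
    total = 0.0
    lo = max(0, r_next - (N - r_prev))
    hi = min(r_prev, r_next)
    for k in range(lo, hi + 1):
        stay = comb(r_prev, k) * P1 ** k * (1 - P1) ** (r_prev - k)
        enter = (comb(N - r_prev, r_next - k)
                 * P0 ** (r_next - k) * (1 - P0) ** (N - r_prev - (r_next - k)))
        total += stay * enter
    return total

N, P1, P0 = 30, 0.8, 0.3
dist = [trans_prob(r, 12, N, P1, P0) for r in range(N + 1)]
```

Even for this small $N$ each transition probability is a sum of many terms, and in the model with covariates $P^1_t$ and $P^0_t$ change every period, which makes direct maximization of the exact likelihood unattractive relative to NLLS or NLWLS.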