# Maximum Likelihood Estimation

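Under normality of the disturbances, one can write the log-likelihood function as

$$L(\alpha, \beta, \phi^2, \sigma_\nu^2) = \text{constant} - \frac{NT}{2}\log\sigma_\nu^2 + \frac{N}{2}\log\phi^2 - \frac{1}{2\sigma_\nu^2}\,u'\Sigma^{-1}u \qquad (12.32)$$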

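where $\Omega = \sigma_\nu^2\Sigma$, $\phi^2 = \sigma_\nu^2/\sigma_1^2$ and $\Sigma = Q + \phi^{-2}P$ from (12.18). This uses the fact that $|\Omega|$ = product of its characteristic roots $= (\sigma_\nu^2)^{N(T-1)}(\sigma_1^2)^{N} = (\sigma_\nu^2)^{NT}(\phi^2)^{-N}$. Note that there is a one-to-one correspondence between $\phi^2$ and $\sigma_\mu^2$. In fact, $0 \le \sigma_\mu^2 < \infty$ translates into $0 < \phi^2 \le 1$. Brute force maximization of (12.32) leads to nonlinear first-order conditions, see Amemiya (1971). Instead, Breusch (1987) concentrates the likelihood with respect to $\alpha$ and $\sigma_\nu^2$. In this case, $\hat\alpha_{MLE} = \bar y_{..} - \bar X_{..}'\hat\beta_{MLE}$ and $\hat\sigma^2_{\nu,MLE} = \hat u'\hat\Sigma^{-1}\hat u/NT$, where $\hat u$ and $\hat\Sigma$ are based on the MLEs of $\beta$, $\phi^2$ and $\alpha$. Let $d = y - X\hat\beta_{MLE}$; then $\hat\alpha_{MLE} = \iota_{NT}'d/NT$ and $\hat u = d - \iota_{NT}\hat\alpha_{MLE} = d - \bar J_{NT}d$. This implies that $\hat\sigma^2_{\nu,MLE}$ can be rewritten as

$$\hat\sigma^2_{\nu,MLE} = d'[Q + \phi^2(P - \bar J_{NT})]d/NT \qquad (12.33)$$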

and the concentrated log-likelihood becomes

$$L_C(\beta, \phi^2) = \text{constant} - \frac{NT}{2}\log\{d'[Q + \phi^2(P - \bar J_{NT})]d\} + \frac{N}{2}\log\phi^2 \qquad (12.34)$$
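As a rough illustration, (12.33) and (12.34) can be evaluated numerically for a balanced panel. The sketch below assumes the data are stacked individual by individual (all $T$ observations of individual 1 first, and so on) and that `X` excludes the constant, since $\alpha$ has been concentrated out. The function names `projection_matrices` and `concentrated_loglik` are hypothetical and chosen only for this example.

```python
import numpy as np

def projection_matrices(N, T):
    """Return P, Q and J-bar for a balanced panel stacked individual by individual."""
    P = np.kron(np.eye(N), np.full((T, T), 1.0 / T))  # replaces each obs by its individual mean
    Q = np.eye(N * T) - P                              # deviations from individual means
    J_bar = np.full((N * T, N * T), 1.0 / (N * T))     # replaces each obs by the overall mean
    return P, Q, J_bar

def concentrated_loglik(beta, phi2, y, X, N, T):
    """Concentrated log-likelihood (12.34), dropping the additive constant.

    Also returns the implied estimate of sigma_nu^2 from (12.33).
    """
    P, Q, J_bar = projection_matrices(N, T)
    d = y - X @ beta                                   # d = y - X beta
    quad = d @ (Q + phi2 * (P - J_bar)) @ d            # d'[Q + phi^2 (P - J_bar)] d
    sigma2_nu = quad / (N * T)                         # (12.33)
    loglik = -0.5 * N * T * np.log(quad) + 0.5 * N * np.log(phi2)
    return loglik, sigma2_nu
```

Building the full $NT \times NT$ matrices keeps the code close to the notation; for large panels one would more likely work with individual means directly.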

Maximizing (12.34), over ф2 given в, yields

$$\hat\phi^2 = \frac{d'Qd}{(T-1)\,d'(P - \bar J_{NT})d} = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}(d_{it} - \bar d_{i.})^2}{T(T-1)\sum_{i=1}^{N}(\bar d_{i.} - \bar d_{..})^2} \qquad (12.35)$$
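The first-order condition behind (12.35) is not spelled out in the text; one way to sketch it is given below. The shorthand $a$ and $b$ is introduced here purely for illustration and is not part of the original notation.

```latex
% Shorthand (illustrative only): a = d'Qd,  b = d'(P - \bar{J}_{NT})d
\begin{align*}
L_C(\beta,\phi^2) &= \text{constant} - \frac{NT}{2}\log\left(a + \phi^2 b\right) + \frac{N}{2}\log\phi^2 \\
\frac{\partial L_C}{\partial \phi^2} &= -\frac{NT}{2}\,\frac{b}{a + \phi^2 b} + \frac{N}{2\phi^2} = 0 \\
T\phi^2 b &= a + \phi^2 b \\
\hat\phi^2 &= \frac{a}{(T-1)\,b}
\end{align*}
```

The second equality in (12.35) then follows from $a = \sum_{i}\sum_{t}(d_{it} - \bar d_{i.})^2$ and $b = T\sum_{i}(\bar d_{i.} - \bar d_{..})^2$.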

Maximizing (12.34) over в, given ф2, yields

$$\hat\beta_{MLE} = \{X'[Q + \phi^2(P - \bar J_{NT})]X\}^{-1}X'[Q + \phi^2(P - \bar J_{NT})]y \qquad (12.36)$$

One can iterate between $\beta$ and $\phi^2$ until convergence. Breusch (1987) shows that provided $T > 1$, any $i$-th iteration $\beta$, call it $\beta_i$, gives $0 < \phi^2_{i+1} < \infty$ in the $(i+1)$-th iteration. More importantly, Breusch (1987) shows that these $\phi^2_i$'s have a remarkable property of forming a monotonic sequence. In fact, starting from the Within estimator of $\beta$, for $\phi^2 = 0$, the next $\phi^2$ is finite and positive and starts a monotonically increasing sequence of $\phi^2$'s. Similarly, starting from the Between estimator of $\beta$, for ($\phi^2 \to \infty$), the next $\phi^2$ is finite and positive and starts a monotonically decreasing sequence of $\phi^2$'s. Hence, to guard against the possibility of a local maximum, Breusch (1987) suggests starting with $\hat\beta_{Within}$ and $\hat\beta_{Between}$ and iterating. If these two sequences converge to the same maximum, then this is the global maximum. If one starts with $\hat\beta_{OLS}$ for $\phi^2 = 1$, and the next iteration obtains a larger $\phi^2$, then we have a local maximum at the boundary $\phi^2 = 1$. Maddala (1971) finds that there are at most two maxima for the likelihood $L(\phi^2)$ for $0 < \phi^2 \le 1$. Hence, we have to guard against one local maximum.
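The iteration between (12.36) and (12.35) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Breusch's own code: the panel is balanced and stacked individual by individual, `X` excludes the constant (it is concentrated out), and the function name `breusch_iterate`, the tolerance and the iteration cap are choices made for this example. Passing `phi2_start=0.0` starts from the Within estimator, `np.inf` from the Between estimator, and `1.0` from OLS.

```python
import numpy as np

def breusch_iterate(y, X, N, T, phi2_start, tol=1e-10, max_iter=500):
    """Alternate between (12.36) and (12.35) from a given starting phi^2.

    Returns the final beta, the final phi^2 and the whole path of phi^2 values.
    """
    P = np.kron(np.eye(N), np.full((T, T), 1.0 / T))   # individual-mean operator
    Q = np.eye(N * T) - P                               # within operator
    J_bar = np.full((N * T, N * T), 1.0 / (N * T))      # overall-mean operator
    B = P - J_bar                                       # between variation around the overall mean
    phi2, path = phi2_start, []
    for _ in range(max_iter):
        # As phi^2 -> infinity only the Between part of the weights matters.
        W = Q + phi2 * B if np.isfinite(phi2) else B
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)    # (12.36)
        d = y - X @ beta
        new_phi2 = (d @ Q @ d) / ((T - 1) * (d @ B @ d))    # (12.35)
        path.append(new_phi2)
        if abs(new_phi2 - phi2) < tol:
            break
        phi2 = new_phi2
    return beta, new_phi2, path
```

Running the function once from `0.0` and once from `np.inf` and comparing the two limits follows the suggested guard against a local maximum; the stored `path` makes it easy to inspect whether each sequence is monotonic.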

# 12.2 Prediction

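Suppose we want to predict $S$ periods ahead for the $i$-th individual. For the random effects model, the BLU estimator is GLS. Using the results in Chapter 9 on GLS, Goldberger's (1962) Best Linear Unbiased Predictor (BLUP) of $y_{i,T+S}$ is

$$\hat y_{i,T+S} = Z_{i,T+S}'\hat\delta_{GLS} + w'\Omega^{-1}\hat u_{GLS} \quad \text{for } S \ge 1 \qquad (12.37)$$

where $\hat u_{GLS} = y - Z\hat\delta_{GLS}$ and $w = E(u_{i,T+S}\,u)$. Note that

$$u_{i,T+S} = \mu_i + \nu_{i,T+S}$$

and $w = \sigma_\mu^2(l_i \otimes \iota_T)$, where $l_i$ is the $i$-th column of $I_N$, i.e., $l_i$ is a vector that has 1 in the $i$-th position and zero elsewhere. In this case

$$w'\Omega^{-1} = \sigma_\mu^2(l_i' \otimes \iota_T')\left[\frac{1}{\sigma_1^2}P + \frac{1}{\sigma_\nu^2}Q\right] = \frac{\sigma_\mu^2}{\sigma_1^2}(l_i' \otimes \iota_T')$$

since $(l_i' \otimes \iota_T')P = (l_i' \otimes \iota_T')$ and $(l_i' \otimes \iota_T')Q = 0$. The typical element of $w'\Omega^{-1}\hat u_{GLS}$ is $(T\sigma_\mu^2/\sigma_1^2)\,\bar{\hat u}_{i.,GLS}$, where $\bar{\hat u}_{i.,GLS} = \sum_{t=1}^{T}\hat u_{it,GLS}/T$. Therefore, in (12.37), the BLUP for $y_{i,T+S}$ corrects the GLS prediction by a fraction of the mean of the GLS residuals corresponding to that $i$-th individual. This predictor was considered by Wansbeek and Kapteyn (1978) and Taub (1979).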
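The shortcut for the BLUP correction can be checked against the direct matrix expression. The sketch below assumes a balanced panel stacked individual by individual, takes the GLS coefficients, GLS residuals and variance components as given inputs, and uses $\sigma_1^2 = T\sigma_\mu^2 + \sigma_\nu^2$, the usual random effects definition, which this section relies on but does not restate. The function names are hypothetical.

```python
import numpy as np

def blup_correction_direct(u_gls, i, N, T, sigma2_mu, sigma2_nu):
    """w' Omega^{-1} u_GLS computed from the full NT x NT covariance matrix."""
    Omega = sigma2_mu * np.kron(np.eye(N), np.ones((T, T))) + sigma2_nu * np.eye(N * T)
    w = sigma2_mu * np.kron(np.eye(N)[:, i], np.ones(T))   # sigma_mu^2 (l_i kron iota_T)
    return w @ np.linalg.solve(Omega, u_gls)

def blup_forecast(z_future, delta_gls, u_gls, i, N, T, sigma2_mu, sigma2_nu):
    """BLUP (12.37) for individual i using the closed-form correction.

    The correction does not depend on the horizon S; only z_future does.
    """
    sigma2_1 = T * sigma2_mu + sigma2_nu              # assumed definition of sigma_1^2
    u_bar_i = u_gls.reshape(N, T)[i].mean()           # mean GLS residual of individual i
    correction = (T * sigma2_mu / sigma2_1) * u_bar_i
    return z_future @ delta_gls + correction
```

The closed form avoids inverting $\Omega$ altogether, which matters once $NT$ is large; the direct version is included only to make the equivalence easy to verify.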