# Dynamic Panel Data Models

The dynamic error components regression is characterized by the presence of a lagged dependent variable among the regressors, i.e.,

$$y_{it} = \delta y_{i,t-1} + x'_{it}\beta + \mu_i + \nu_{it}, \qquad i = 1,\ldots,N;\ t = 1,\ldots,T \tag{12.59}$$

where $\delta$ is a scalar, $x'_{it}$ is $1 \times K$ and $\beta$ is $K \times 1$. This model has been extensively studied by Anderson and Hsiao (1982). Since $y_{it}$ is a function of $\mu_i$, $y_{i,t-1}$ is also a function of $\mu_i$. Therefore, $y_{i,t-1}$, a right-hand regressor in (12.59), is correlated with the error term. This renders the OLS estimator biased and inconsistent even if the $\nu_{it}$'s are not serially correlated. For the FE estimator, the within transformation wipes out the $\mu_i$'s, but the within-transformed regressor $\tilde{y}_{i,t-1} = y_{i,t-1} - \bar{y}_{i,-1}$ will still be correlated with the within-transformed disturbance $\tilde{\nu}_{it} = \nu_{it} - \bar{\nu}_{i}$ even if the $\nu_{it}$'s are not serially correlated, because the individual mean $\bar{\nu}_{i}$ contains $\nu_{i,t-1}$. In fact, the Within estimator has a bias of order $O(1/T)$ and its consistency depends upon $T$ being large, see Nickell (1981).

An alternative transformation that wipes out the individual effects, yet does not create the above problem, is the first difference (FD) transformation. In fact, Anderson and Hsiao (1982) suggested first differencing the model to get rid of the $\mu_i$'s and then using $\Delta y_{i,t-2} = (y_{i,t-2} - y_{i,t-3})$ or simply $y_{i,t-2}$ as an instrument for $\Delta y_{i,t-1} = (y_{i,t-1} - y_{i,t-2})$. These instruments will not be correlated with $\Delta\nu_{it} = \nu_{it} - \nu_{i,t-1}$, as long as the $\nu_{it}$'s themselves are not serially correlated. This instrumental variable (IV) estimation method leads to consistent but not necessarily efficient estimates of the parameters in the model. This is because it does not make use of all the available moment conditions, see Ahn and Schmidt (1995), and it does not take into account the differenced structure of the residual disturbances ($\Delta\nu_{it}$). Arellano (1989) finds that for simple dynamic error components models the estimator that uses differences $\Delta y_{i,t-2}$ rather than levels $y_{i,t-2}$ for instruments has a singularity point and very large variances over a significant range of parameter values. In contrast, the estimator that uses instruments in levels, i.e., $y_{i,t-2}$, has no singularities and much smaller variances and is therefore recommended. Additional instruments can be obtained in a dynamic panel data model if one utilizes the orthogonality conditions that exist between lagged values of $y_{it}$ and the disturbances $\nu_{it}$.
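The contrast between OLS, the Within estimator and the Anderson and Hsiao IV estimator in levels can be illustrated with a small simulation. The sketch below is only an illustration under assumed values ($\delta = 0.5$, unit variances, $N = 5000$, $T = 6$, normal disturbances); the function names are hypothetical and are not taken from any package.

```python
import numpy as np

def simulate_panel(N, T, delta, rng, burn=50):
    """Draw y_it = delta * y_{i,t-1} + mu_i + nu_it and return an N x T array."""
    mu = rng.normal(size=N)          # individual effects, variance 1 (assumed)
    y = mu / (1.0 - delta)           # start at each individual's long-run mean
    cols = []
    for t in range(burn + T):
        y = delta * y + mu + rng.normal(size=N)
        if t >= burn:                # discard the burn-in periods
            cols.append(y)
    return np.column_stack(cols)

def pooled_ols(y):
    """OLS of y_it on y_{i,t-1} with an intercept, ignoring mu_i."""
    x, z = y[:, :-1].ravel(), y[:, 1:].ravel()
    xc = x - x.mean()
    return xc @ (z - z.mean()) / (xc @ xc)

def within(y):
    """Within (FE) estimator: demean by individual, then OLS."""
    xl = y[:, :-1] - y[:, :-1].mean(axis=1, keepdims=True)
    yc = y[:, 1:] - y[:, 1:].mean(axis=1, keepdims=True)
    return (xl * yc).sum() / (xl * xl).sum()

def anderson_hsiao_levels(y):
    """First-difference IV using y_{i,t-2} as the instrument for dy_{i,t-1}."""
    dy = y[:, 2:] - y[:, 1:-1]       # dy_it
    dy_lag = y[:, 1:-1] - y[:, :-2]  # dy_{i,t-1}
    z = y[:, :-2]                    # y_{i,t-2}
    return (z * dy).sum() / (z * dy_lag).sum()

rng = np.random.default_rng(0)
y = simulate_panel(N=5000, T=6, delta=0.5, rng=rng)
estimates = {
    "OLS": pooled_ols(y),
    "Within": within(y),
    "AH levels IV": anderson_hsiao_levels(y),
}
for name, value in estimates.items():
    print(f"{name:>12}: {value:.3f}")
```

With these settings one would expect OLS to overstate $\delta$, the Within estimator to understate it because $T$ is small, and the IV estimator to land near the true value, though the exact numbers depend on the random draw.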

Let us illustrate this with the simple autoregressive model with no regressors:

$$y_{it} = \delta y_{i,t-1} + u_{it}, \qquad i = 1,\ldots,N;\ t = 1,\ldots,T \tag{12.60}$$

where $u_{it} = \mu_i + \nu_{it}$ with $\mu_i \sim \text{IID}(0, \sigma_\mu^2)$ and $\nu_{it} \sim \text{IID}(0, \sigma_\nu^2)$, independent of each other and among themselves. In order to get a consistent estimate of $\delta$ as $N \to \infty$ with $T$ fixed, we first difference (12.60) to eliminate the individual effects

$$y_{it} - y_{i,t-1} = \delta(y_{i,t-1} - y_{i,t-2}) + (\nu_{it} - \nu_{i,t-1}) \tag{12.61}$$

and note that $(\nu_{it} - \nu_{i,t-1})$ is MA(1) with unit root. For the first period we observe this relationship, i.e., $t = 3$, we have

$$y_{i3} - y_{i2} = \delta(y_{i2} - y_{i1}) + (\nu_{i3} - \nu_{i2})$$

In this case, $y_{i1}$ is a valid instrument, since it is highly correlated with $(y_{i2} - y_{i1})$ and not correlated with $(\nu_{i3} - \nu_{i2})$ as long as the $\nu_{it}$ are not serially correlated. But note what happens for $t = 4$, the second period we observe (12.61):

$$y_{i4} - y_{i3} = \delta(y_{i3} - y_{i2}) + (\nu_{i4} - \nu_{i3})$$

In this case, $y_{i2}$ as well as $y_{i1}$ are valid instruments for $(y_{i3} - y_{i2})$, since both $y_{i2}$ and $y_{i1}$ are not correlated with $(\nu_{i4} - \nu_{i3})$. One can continue in this fashion, adding an extra valid instrument with each forward period, so that for period $T$, the set of valid instruments becomes $(y_{i1}, y_{i2}, \ldots, y_{i,T-2})$.
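The way the instrument set grows by one lag per period can be tabulated directly. The short sketch below uses a hypothetical helper, `instrument_sets`, that lists for each differenced equation $t = 3, \ldots, T$ the time indices $s$ of the levels $y_{is}$ that qualify as instruments.

```python
def instrument_sets(T):
    """Map each period t = 3..T to the indices s for which y_is is a valid instrument."""
    return {t: list(range(1, t - 1)) for t in range(3, T + 1)}

sets = instrument_sets(5)
print(sets)   # {3: [1], 4: [1, 2], 5: [1, 2, 3]}

# Total number of moment conditions: 1 + 2 + ... + (T - 2) = (T - 2)(T - 1) / 2
total = sum(len(v) for v in sets.values())
print(total)  # 6
```

The total count grows quadratically in $T$, which is why the number of moment conditions can become large even for moderately long panels.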

This instrumental variable procedure still does not account for the differenced error term in (12.61). In fact,

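$$E(\Delta\nu_i \Delta\nu_i') = \sigma_\nu^2 G \tag{12.62}$$

where $\Delta\nu_i' = (\nu_{i3} - \nu_{i2}, \ldots, \nu_{iT} - \nu_{i,T-1})$ and

$$G = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0 & 0 & 0 \\
-1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -1 & 2 & -1 \\
0 & 0 & 0 & \cdots & 0 & -1 & 2
\end{pmatrix}$$

is $(T-2) \times (T-2)$, since $\Delta\nu_i$ is MA(1) with unit root. Define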

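$$W_i = \begin{pmatrix}
[y_{i1}] & & & 0 \\
 & [y_{i1}, y_{i2}] & & \\
 & & \ddots & \\
0 & & & [y_{i1}, \ldots, y_{i,T-2}]
\end{pmatrix} \tag{12.63}$$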

Then, the matrix of instruments is $W = [W_1', \ldots, W_N']'$ and the moment equations described above are given by $E(W_i'\Delta\nu_i) = 0$. Premultiplying the differenced equation (12.61) in vector form by $W'$, one gets

$$W'\Delta y = W'(\Delta y_{-1})\delta + W'\Delta\nu \tag{12.64}$$
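To make the objects above concrete, the following sketch builds $G$ and the block-diagonal $W_i$ for a simulated panel and checks numerically that the sample covariance of $\Delta\nu_i$ is close to $\sigma_\nu^2 G$ and that the sample moments $N^{-1}\sum_i W_i'\Delta\nu_i$ are close to zero. The helper names, the parameter values and the initial condition for $y_{i1}$ are assumptions made for illustration only.

```python
import numpy as np

def build_G(m):
    """m x m matrix with 2 on the diagonal and -1 on the adjacent off-diagonals."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def build_Wi(y_i):
    """Block-diagonal instruments for one individual.

    Row r is the differenced equation for period t = r + 3 and holds
    y_i1, ..., y_i,t-2 in its own block of columns.
    """
    T = y_i.shape[0]
    W = np.zeros((T - 2, (T - 2) * (T - 1) // 2))
    col = 0
    for r in range(T - 2):
        k = r + 1                      # number of instruments for period r + 3
        W[r, col:col + k] = y_i[:k]
        col += k
    return W

rng = np.random.default_rng(1)
N, T, delta = 10000, 5, 0.5            # illustrative values only
mu = rng.normal(size=N)
nu = rng.normal(size=(N, T))           # sigma_nu^2 = 1, no serial correlation
y = np.empty((N, T))
y[:, 0] = mu / (1 - delta) + rng.normal(size=N)   # assumed initial condition
for t in range(1, T):
    y[:, t] = delta * y[:, t - 1] + mu + nu[:, t]

d_nu = nu[:, 2:] - nu[:, 1:-1]         # rows are (nu_i3 - nu_i2, ..., nu_iT - nu_i,T-1)
cov_hat = d_nu.T @ d_nu / N            # sample analogue of E(d_nu d_nu')
moments = sum(build_Wi(y[i]).T @ d_nu[i] for i in range(N)) / N

print(np.round(cov_hat, 2))            # should be close to build_G(T - 2)
print(np.round(moments, 3))            # should be close to a vector of zeros
```

The true disturbances are available here only because the data are simulated; in practice $\Delta\nu_i$ is unobserved and the moment conditions are imposed rather than checked this way.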

Performing GLS on (12.64) one gets the Arellano and Bond (1991) preliminary one-step consistent estimator

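$$\hat{\delta}_1 = \left[(\Delta y_{-1})'W\left(W'(I_N \otimes G)W\right)^{-1}W'(\Delta y_{-1})\right]^{-1} \left[(\Delta y_{-1})'W\left(W'(I_N \otimes G)W\right)^{-1}W'(\Delta y)\right] \tag{12.65}$$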

The optimal generalized method of moments (GMM) estimator of $\delta_1$ à la Hansen (1982) for $N \to \infty$ and $T$ fixed using only the above moment restrictions yields the same expression as in (12.65) except that

$$W'(I_N \otimes G)W = \sum_{i=1}^{N} W_i' G W_i$$

is replaced by

$$V_N = \sum_{i=1}^{N} W_i'(\Delta\nu_i)(\Delta\nu_i)'W_i$$

This GMM estimator requires no knowledge concerning the initial conditions or the distributions of $\nu_i$ and $\mu_i$. To operationalize this estimator, $\Delta\nu$ is replaced by differenced residuals obtained from the preliminary consistent estimator $\hat{\delta}_1$. The resulting estimator is the two-step Arellano and Bond (1991) GMM estimator:

$$\hat{\delta}_2 = \left[(\Delta y_{-1})'W\hat{V}_N^{-1}W'(\Delta y_{-1})\right]^{-1}\left[(\Delta y_{-1})'W\hat{V}_N^{-1}W'(\Delta y)\right] \tag{12.66}$$

A consistent estimate of the asymptotic $\text{var}(\hat{\delta}_2)$ is given by the first term in (12.66),

$$\widehat{\text{var}}(\hat{\delta}_2) = \left[(\Delta y_{-1})'W\hat{V}_N^{-1}W'(\Delta y_{-1})\right]^{-1} \tag{12.67}$$

Note that $\hat{\delta}_1$ and $\hat{\delta}_2$ are asymptotically equivalent if the $\nu_{it}$ are $\text{IID}(0, \sigma_\nu^2)$.
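A minimal NumPy sketch of the one-step estimator (12.65), the two-step estimator (12.66) and the variance estimate (12.67) for the pure autoregressive model (12.60) is given below. It is not the implementation used by any particular software package; the simulated data, the parameter values and all function names are assumptions for illustration.

```python
import numpy as np

def simulate_panel(N, T, delta, rng, burn=50):
    """Draw y_it = delta * y_{i,t-1} + mu_i + nu_it and return an N x T array."""
    mu = rng.normal(size=N)
    y = mu / (1.0 - delta)
    cols = []
    for t in range(burn + T):
        y = delta * y + mu + rng.normal(size=N)
        if t >= burn:
            cols.append(y)
    return np.column_stack(cols)

def build_G(m):
    """MA(1)-with-unit-root pattern: 2 on the diagonal, -1 next to it."""
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def build_Wi(y_i):
    """Block-diagonal instrument matrix W_i of (12.63) for one individual."""
    T = y_i.shape[0]
    W = np.zeros((T - 2, (T - 2) * (T - 1) // 2))
    col = 0
    for r in range(T - 2):
        W[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return W

def arellano_bond(y):
    """Return (one-step delta, two-step delta, two-step standard error)."""
    N, T = y.shape
    G = build_G(T - 2)
    W = [build_Wi(y[i]) for i in range(N)]
    dy = y[:, 2:] - y[:, 1:-1]                      # dy_it, t = 3..T
    dy_lag = y[:, 1:-1] - y[:, :-2]                 # dy_{i,t-1}
    Wx = sum(Wi.T @ x for Wi, x in zip(W, dy_lag))  # W'(dy_{-1})
    Wy = sum(Wi.T @ z for Wi, z in zip(W, dy))      # W'(dy)

    # One-step: weight by W'(I_N kron G)W = sum_i W_i' G W_i
    A1 = sum(Wi.T @ G @ Wi for Wi in W)
    delta1 = (Wx @ np.linalg.solve(A1, Wy)) / (Wx @ np.linalg.solve(A1, Wx))

    # Two-step: weight by V_N built from one-step differenced residuals
    resid = dy - delta1 * dy_lag
    g = [Wi.T @ e for Wi, e in zip(W, resid)]
    V = sum(np.outer(gi, gi) for gi in g)
    denom = Wx @ np.linalg.solve(V, Wx)
    delta2 = (Wx @ np.linalg.solve(V, Wy)) / denom
    return delta1, delta2, np.sqrt(1.0 / denom)

rng = np.random.default_rng(2)
y = simulate_panel(N=2000, T=6, delta=0.5, rng=rng)
delta1, delta2, se2 = arellano_bond(y)
print(f"one-step: {delta1:.3f}  two-step: {delta2:.3f}  (s.e. {se2:.3f})")
```

Because the simulated $\nu_{it}$ are IID, the two estimates should be close to each other here; differences would be expected mainly under heteroskedasticity.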

If there are additional strictly exogenous regressors $x_{it}$ as in (12.59) with $E(x_{it}\nu_{is}) = 0$ for all $t, s = 1, 2, \ldots, T$, but where all the $x_{it}$ are correlated with $\mu_i$, then all the $x_{it}$ are valid instruments for the first differenced equation of (12.59). Therefore, $[x'_{i1}, x'_{i2}, \ldots, x'_{iT}]$ should be added to each diagonal element of $W_i$ in (12.63). In this case, (12.64) becomes

$$W'\Delta y = W'(\Delta y_{-1})\delta + W'(\Delta X)\beta + W'\Delta\nu$$

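where $\Delta X$ is the stacked $N(T-2) \times K$ matrix of observations on $\Delta x_{it}$. One- and two-step estimators of $(\delta, \beta')$ can be obtained from

$$\begin{pmatrix} \hat{\delta} \\ \hat{\beta} \end{pmatrix} = \left([\Delta y_{-1}, \Delta X]'W\hat{V}_N^{-1}W'[\Delta y_{-1}, \Delta X]\right)^{-1}\left([\Delta y_{-1}, \Delta X]'W\hat{V}_N^{-1}W'\Delta y\right) \tag{12.68}$$

as in (12.65) and (12.66).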

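Arellano and Bond (1991) suggest Sargan's (1958) test of over-identifying restrictions given by

$$m = (\Delta\hat{\nu})'W\left[\sum_{i=1}^{N} W_i'(\Delta\hat{\nu}_i)(\Delta\hat{\nu}_i)'W_i\right]^{-1}W'(\Delta\hat{\nu}) \sim \chi^2_{p-K-1}$$

where $p$ refers to the number of columns of $W$ and $\Delta\hat{\nu}$ denotes the residuals from a two-step estimation given in (12.68).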
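As a rough illustration, the sketch below computes a Sargan-type statistic for the pure autoregressive model (12.60), where there are no $x$ regressors so that $K = 0$ and the degrees of freedom are $p - 1$. It forms the quadratic form $(\Delta\hat{\nu})'W[\sum_i W_i'(\Delta\hat{\nu}_i)(\Delta\hat{\nu}_i)'W_i]^{-1}W'(\Delta\hat{\nu})$ from two-step residuals. The simulated data and all function names are assumptions, and software packages may differ in which residuals enter the weighting matrix.

```python
import numpy as np

def simulate_panel(N, T, delta, rng, burn=50):
    """Draw y_it = delta * y_{i,t-1} + mu_i + nu_it and return an N x T array."""
    mu = rng.normal(size=N)
    y = mu / (1.0 - delta)
    cols = []
    for t in range(burn + T):
        y = delta * y + mu + rng.normal(size=N)
        if t >= burn:
            cols.append(y)
    return np.column_stack(cols)

def build_G(m):
    return 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

def build_Wi(y_i):
    """Block-diagonal instrument matrix W_i for one individual."""
    T = y_i.shape[0]
    W = np.zeros((T - 2, (T - 2) * (T - 1) // 2))
    col = 0
    for r in range(T - 2):
        W[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return W

def two_step_with_sargan(y, n_params=1):
    """Return (two-step delta, Sargan statistic, degrees of freedom)."""
    N, T = y.shape
    W = [build_Wi(y[i]) for i in range(N)]
    dy, dy_lag = y[:, 2:] - y[:, 1:-1], y[:, 1:-1] - y[:, :-2]
    Wx = sum(Wi.T @ x for Wi, x in zip(W, dy_lag))
    Wy = sum(Wi.T @ z for Wi, z in zip(W, dy))

    def gmm(A):
        """GMM estimate of delta for a given weighting matrix A."""
        return (Wx @ np.linalg.solve(A, Wy)) / (Wx @ np.linalg.solve(A, Wx))

    def moments(d):
        """Return (sum_i g_i g_i', sum_i g_i) with g_i = W_i' residual_i at delta = d."""
        g = [Wi.T @ e for Wi, e in zip(W, dy - d * dy_lag)]
        return sum(np.outer(gi, gi) for gi in g), sum(g)

    G = build_G(T - 2)
    delta1 = gmm(sum(Wi.T @ G @ Wi for Wi in W))    # one-step
    delta2 = gmm(moments(delta1)[0])                # two-step
    V2, g2 = moments(delta2)                        # built from two-step residuals
    m = g2 @ np.linalg.solve(V2, g2)                # Sargan-type quadratic form
    dof = W[0].shape[1] - n_params                  # p - K - 1 with K = 0
    return delta2, m, dof

rng = np.random.default_rng(3)
y = simulate_panel(N=2000, T=6, delta=0.5, rng=rng)
delta2, m, dof = two_step_with_sargan(y)
print(f"two-step delta: {delta2:.3f}, Sargan statistic: {m:.2f}, df: {dof}")
```

The statistic would be compared with a $\chi^2$ critical value for the reported degrees of freedom; since the simulated disturbances satisfy the moment conditions, a large value is not expected here.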

To summarize, dynamic panel data estimation of equation (12.59) with individual fixed effects suffers from the Nickell (1981) bias, which disappears only if $T$ tends to infinity. Alternatively, Arellano and Bond (1991) suggested a GMM estimator that differences the model to get rid of the individual specific effects and, along with them, any time-invariant regressors. This also gets rid of any endogeneity that may be due to the correlation of these individual effects with the right-hand side regressors. The moment conditions utilize the orthogonality conditions between the differenced errors and lagged values of the dependent variable. This assumes that the original disturbances are serially uncorrelated. In fact, two diagnostics are computed using the Arellano and Bond GMM procedure to test for first order and second order serial correlation in the differenced disturbances. One should reject the null of the absence of first order serial correlation and not reject the absence of second order serial correlation. A special feature of dynamic panel data GMM estimation is that the number of moment conditions increases with $T$. Therefore, a Sargan test is performed to test the over-identification restrictions. There is convincing evidence that too many moment conditions introduce bias while increasing efficiency. It is even suggested that a subset of these moment conditions be used to take advantage of the trade-off between the reduction in bias and the loss in efficiency, see Baltagi (2008) for details.

Arellano and Bond (1991) apply their GMM estimation and testing methods to a model of employment using a panel of 140 quoted UK companies for the period 1979-84. This is the benchmark data set used in Stata to obtain the one-step and two-step estimators described in (12.65) and (12.66) as well as the Sargan test for over-identification using the command (xtabond, twostep). The reader is asked to replicate their results in problem 22.
