Estimation
Hill et al. (2011) provide a subset of the National Longitudinal Survey, which is conducted by the U.S. Department of Labor. The database includes observations on women who, in 1968, were between the ages of 14 and 24. It then follows them through time, recording various aspects of their lives annually until 1973 and biannually afterwards. Our sample consists of 716 women observed in 5 years (1982, 1983, 1985, 1987 and 1988). The panel is balanced and there are 3580 total observations.
The model considered is found in equation (15.4) below.
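$$\ln(\mathit{wage})_{it} = \beta_{1i} + \beta_2 \mathit{educ}_{it} + \beta_3 \mathit{exper}_{it} + \beta_4 \mathit{exper}^2_{it} + \beta_5 \mathit{tenure}_{it} + \beta_6 \mathit{tenure}^2_{it} + \beta_7 \mathit{south}_{it} + \beta_8 \mathit{union}_{it} + \beta_9 \mathit{black}_{it} + e_{it} \qquad (15.4)$$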
The main command used to estimate models with panel data in gretl is panel. Its syntax is:
panel
Arguments: depvar indepvars
Options:   --vcv (print covariance matrix)
           --fixed-effects (estimate with group fixed effects)
           --random-effects (random effects or GLS model)
           --between (estimate the between-groups model)
           --robust (robust standard errors; see below)
           --time-dummies (include time dummy variables)
           --unit-weights (weighted least squares)
           --iterate (iterative estimation)
           --matrix-diff (use matrix-difference method for Hausman test)
           --quiet (less verbose output)
           --verbose (more verbose output)
All of the basic panel data estimators are available: fixed effects, two-way fixed effects, random effects, between estimation and (not shown) pooled least squares.
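As a rough illustration of how these options combine, the sketch below runs several of the estimators on the wage equation. It assumes the data file location used elsewhere in this section (the path may differ on your installation). The list name xvars_fe is hypothetical; it drops educ and black on the assumption that they do not vary over time for the women in this sample, in which case their coefficients could not be estimated once individual effects are included.

```gretl
# Sketch only: the file path is an assumption about where the POE data are installed
open "@gretldir\data\poe\nls_panel.gdt"

list xvars = const educ exper exper2 tenure tenure2 south black union
# Hypothetical list without the (assumed) time-invariant regressors
list xvars_fe = const exper exper2 tenure tenure2 south union

# One-way (group) fixed effects
panel lwage xvars_fe --fixed-effects
# Two-way fixed effects: group effects plus time dummies
panel lwage xvars_fe --fixed-effects --time-dummies
# Random effects (GLS)
panel lwage xvars --random-effects
# Between-groups estimator
panel lwage xvars --between
```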
The first model to be estimated is referred to as pooled least squares. Basically, it imposes the restriction that $\beta_{1i} = \beta_1$ for all individuals: every individual has the same intercept. Applying pooled least squares in a panel is restrictive in a number of ways. First, estimating the model by least squares violates at least one assumption used in the proof of the Gauss-Markov theorem. It is almost certain that the errors for an individual will be correlated over time. If Johnny isn’t the sharpest marble in the bag, it is likely that his earnings, given equivalent education, experience, tenure and so on, will be on the low side of average in every year. He has low ability, and that affects each year’s wage similarly.
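To see why the errors are correlated, suppose (as an illustrative assumption, not a result from this section) that the pooled error is made up of a time-invariant individual component $u_i$, Johnny's ability, plus an idiosyncratic component $\varepsilon_{it}$ that is uncorrelated over time and with $u_i$. Writing their variances as $\sigma_u^2$ and $\sigma_\varepsilon^2$:

```latex
e_{it} = u_i + \varepsilon_{it}, \qquad
\operatorname{cov}(e_{it}, e_{is}) = \sigma_u^2 \quad (t \neq s), \qquad
\operatorname{corr}(e_{it}, e_{is}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
```

Under this structure, any variation in ability across individuals ($\sigma_u^2 > 0$) produces positive correlation between the errors of the same person in different years.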
It is also possible that an individual may have a smaller or larger earnings variance than others in the sample. The solution to these specification issues is to use robust estimates of the variance-covariance matrix. Recall that least squares is consistent for the slopes and intercept (but not efficient) when errors are correlated or heteroskedastic, but that this changes the nature of the variance-covariance matrix.
Robust covariances in panel data take into account the special nature of these data. Specifically, they account for autocorrelation within the observations on each individual and they allow the variances for different individuals to vary. Since panel data have both a time-series and a cross-sectional dimension, one might expect that, in general, robust estimation of the covariance matrix would require handling both heteroskedasticity and autocorrelation (the HAC approach).
Gretl currently offers two robust covariance matrix estimators specifically for panel data. These are available for models estimated via fixed effects, pooled OLS, and pooled two-stage least squares. The default robust estimator is that suggested by Arellano (2003), which is HAC provided the panel is of the “large n, small T” variety (that is, many units are observed in relatively few periods).
In cases where autocorrelation is not an issue, however, the estimator proposed by Beck and Katz (1995) and discussed by Greene (2003, chapter 13) may be appropriate. This estimator takes into account contemporaneous correlation across the units and heteroskedasticity by unit.
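For reference, the two estimators are usually written as sandwich formulas. The notation here is assumed rather than taken from the text: $X_i$ is the $T \times k$ block of regressors for unit $i$, $\hat{u}_i$ is the corresponding $T \times 1$ vector of least squares residuals, and $X$ stacks the $X_i$ for all $n$ units.

```latex
\hat{\Sigma}_{A} = (X'X)^{-1} \left( \sum_{i=1}^{n} X_i' \hat{u}_i \hat{u}_i' X_i \right) (X'X)^{-1}

\hat{\Sigma}_{BK} = (X'X)^{-1} \left( \sum_{i=1}^{n} \sum_{j=1}^{n} \hat{\sigma}_{ij} X_i' X_j \right) (X'X)^{-1},
\qquad \hat{\sigma}_{ij} = \frac{\hat{u}_i' \hat{u}_j}{T}
```

The first form (Arellano) leaves the within-unit residual cross products unrestricted, which is what makes it robust to autocorrelation and heteroskedasticity. The second (Beck and Katz) estimates one covariance for each pair of units, capturing contemporaneous correlation across units and unit-specific variances.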
    open "@gretldir\data\poe\nls_panel.gdt"
    list xvars = const educ exper exper2 tenure tenure2 south black union
    panel lwage xvars --pooled --robust
The first thing to notice is that even though the model is being estimated by least squares, the panel command is used with the --pooled option. The --robust option requests the default HCCME for panel data, which is basically a special version of HAC (see section 9.6.1).
Pooled OLS, using 3580 observations
Included 716 cross-sectional units
Time-series length = 5
Dependent variable: lwage
Robust (HAC) standard errors

             Coefficient     Std. Error     t-ratio    p-value
  const       0.476600       0.0844094       5.6463    0.0000
  educ        0.0714488      0.00548952     13.0155    0.0000
  exper       0.0556851      0.0112896       4.9324    0.0000
  exper2     -0.00114754     0.000491577    -2.3344    0.0196
  tenure      0.0149600      0.00711024      2.1040    0.0354
  tenure2    -0.000486042    0.000409482    -1.1870    0.2353
  south      -0.106003       0.0270124      -3.9242    0.0001
  black      -0.116714       0.0280831      -4.1560    0.0000
  union       0.132243       0.0270255       4.8933    0.0000

  rho   0.811231    Durbin-Watson   0.337344
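The output above uses the Arellano default. If autocorrelation is not a concern and the Beck and Katz estimator is wanted instead, gretl versions contemporary with this text switch it on through the pcse setting. The following is a sketch under that assumption; newer releases may expose the choice through a different set variable, so check the command reference for your version. The file path is also assumed.

```gretl
# Sketch only: assumes the 'pcse' set variable is available in your gretl version
open "@gretldir\data\poe\nls_panel.gdt"
list xvars = const educ exper exper2 tenure tenure2 south black union

# Request Beck-Katz panel-corrected standard errors instead of the Arellano default
set pcse on
panel lwage xvars --pooled --robust

# Restore the default so later --robust requests use the Arellano estimator
set pcse off
```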
As long as omitted effects (e.g., individual differences) are uncorrelated with any of the regressors, these estimates are consistent. If the individual differences are correlated with regressors, then you can estimate the model’s parameters consistently using fixed effects.
For comparison purposes, the pooled least squares results without cluster-robust standard errors are shown below.
Pooled OLS, using 3580 observations
Included 716 cross-sectional units
Time-series length = 5
Dependent variable: lwage

             Coefficient     Std. Error     t-ratio    p-value
  const       0.476600       0.0561559       8.4871    0.0000
  educ        0.0714488      0.00268939     26.5669    0.0000
  exper       0.0556851      0.00860716      6.4696    0.0000
  exper2     -0.00114754     0.000361287    -3.1763    0.0015
  tenure      0.0149600      0.00440728      3.3944    0.0007
  tenure2    -0.000486042    0.000257704    -1.8860    0.0594
  south      -0.106003       0.0142008      -7.4645    0.0000
  black      -0.116714       0.0157159      -7.4265    0.0000
  union       0.132243       0.0149616       8.8388    0.0000
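Comparing the two sets of results, the coefficients are identical and only the standard errors change. One way to see the difference at a glance is to save the standard errors from each run and take their ratio. This is a minimal sketch: the matrix names are hypothetical and the file path is assumed.

```gretl
# Sketch only: matrix names are illustrative, file path is an assumption
open "@gretldir\data\poe\nls_panel.gdt"
list xvars = const educ exper exper2 tenure tenure2 south black union

# Conventional (non-robust) pooled least squares
panel lwage xvars --pooled --quiet
matrix se_conv = $stderr

# Same model with the default robust covariance estimator
panel lwage xvars --pooled --robust --quiet
matrix se_rob = $stderr

# Element-by-element ratio of robust to conventional standard errors
matrix ratio = se_rob ./ se_conv
print ratio
```

With the figures reported above, the robust standard errors are roughly 1.3 to 2 times the conventional ones; for educ, 0.00548952 / 0.00268939 is about 2.04.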