11.5 Hausman’s Specification Test

A critical assumption for the linear regression model y = Xβ + u is that the set of regressors X is uncorrelated with the error term u. Otherwise, we have simultaneous equations bias and OLS is inconsistent. Hausman (1978) proposed a general specification test for H0: E(u|X) = 0 versus H1: E(u|X) ≠ 0. Two estimators are needed to implement this test. The first estimator must be a consistent and efficient estimator of β under H0 which becomes inconsistent under H1. Let us denote this efficient estimator under H0 by β̂0. The second estimator, denoted by β̂1, must be consistent for β under both H0 and H1, but inefficient under H0. Hausman’s test is based on the difference between these two estimators, q̂ = β̂1 − β̂0. Under H0, plim q̂ is zero, while under H1, plim q̂ ≠ 0. Hausman (1978) shows that var(q̂) = var(β̂1) − var(β̂0), and Hausman’s test becomes

m = q̂′[var(q̂)]⁻¹q̂  (11.52)

which is asymptotically distributed under H0 as χ²_k, where k is the dimension of β.
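In code, the two estimators and the statistic in (11.52) come together in a few lines. The following Python sketch (simulated data; the design, instrument strength, and seed are illustrative assumptions, not part of the text) contrasts OLS and IV for a single endogenous regressor:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000

# Simulate one endogenous regressor x correlated with the error u,
# plus an instrument w correlated with x but not with u.
w = rng.normal(size=T)
v = rng.normal(size=T)
u = rng.normal(size=T)
x = 0.8 * w + 0.5 * u + v          # x is endogenous: corr(x, u) != 0
y = 1.0 * x + u                     # true beta = 1

X = x[:, None]
W = w[:, None]

# beta0: OLS (efficient under H0, inconsistent under H1)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
# beta1: IV (consistent under H0 and H1, inefficient under H0)
b_iv = np.linalg.solve(W.T @ X, W.T @ y)

# variances under H0, using the same estimate of sigma^2 (from OLS under H0)
s2 = np.sum((y - X @ b_ols) ** 2) / T
var_ols = s2 * np.linalg.inv(X.T @ X)
var_iv = s2 * np.linalg.inv(X.T @ W @ np.linalg.inv(W.T @ W) @ W.T @ X)

q = b_iv - b_ols
m = float(q @ np.linalg.inv(var_iv - var_ols) @ q)   # ~ chi2(1) under H0
print(m)
```

Since x is endogenous by construction, m lands far in the tail of a χ²_1 distribution.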

It remains to show that var(q̂) is the difference between the two variances. This can be illustrated for the single regressor case without matrix algebra, see Maddala (1992, page 507). First, one shows that cov(β̂0, q̂) = 0. To prove this, consider a new estimator of β defined as

β̃ = β̂0 + λq̂, where λ is an arbitrary constant. Under H0, plim β̃ = β for every λ, and var(β̃) = var(β̂0) + λ²var(q̂) + 2λ cov(β̂0, q̂)

Since β̂0 is efficient, var(β̃) ≥ var(β̂0), which means that λ²var(q̂) + 2λ cov(β̂0, q̂) ≥ 0 for every λ. If cov(β̂0, q̂) > 0, then for λ = −cov(β̂0, q̂)/var(q̂) the above inequality is violated. Similarly, if cov(β̂0, q̂) < 0, then for λ = −cov(β̂0, q̂)/var(q̂) the above inequality is violated. Therefore, under H0, for the above inequality to be satisfied for every λ, it must be the case that cov(β̂0, q̂) = 0.

Now, q̂ = β̂1 − β̂0 can be rewritten as β̂1 = q̂ + β̂0, with var(β̂1) = var(q̂) + var(β̂0) + 2cov(q̂, β̂0). Using the fact that the last term is zero, we get the required result: var(q̂) = var(β̂1) − var(β̂0).
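Maddala’s variance result can be checked by Monte Carlo. Under H0 (an exogenous regressor), the OLS estimator β̂0 and the contrast q̂ should be uncorrelated across replications, so var(q̂) ≈ var(β̂1) − var(β̂0). A sketch, with an illustrative (assumed) data-generating process:

```python
import numpy as np

rng = np.random.default_rng(1)
T, R = 200, 4000
beta = 1.0

b0s, b1s = [], []
for _ in range(R):
    w = rng.normal(size=T)
    x = 0.8 * w + rng.normal(size=T)   # regressor, exogenous under H0
    u = rng.normal(size=T)             # independent of x: H0 holds
    y = beta * x + u
    b0s.append(np.sum(x * y) / np.sum(x * x))   # OLS (efficient under H0)
    b1s.append(np.sum(w * y) / np.sum(w * x))   # IV  (consistent, inefficient)

b0s, b1s = np.array(b0s), np.array(b1s)
q = b1s - b0s
print(np.cov(b0s, q)[0, 1])                     # approximately zero
print(np.var(q), np.var(b1s) - np.var(b0s))     # approximately equal
```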

Example 4: Consider the simple regression without a constant

yt = βxt + ut,  t = 1, 2, …, T, with ut ~ IIN(0, σ²)

and where β is a scalar. Under H0: E(ut|xt) = 0 and OLS is efficient and consistent. Under H1, OLS is not consistent. Using wt as an instrumental variable, with wt uncorrelated with ut and preferably highly correlated with xt, yields the following IV estimator of β:


β̂IV = Σ_{t=1}^T yt wt / Σ_{t=1}^T xt wt = β + Σ_{t=1}^T wt ut / Σ_{t=1}^T xt wt

with plim β̂OLS = β under H0 but plim β̂OLS ≠ β under H1, with var(β̂OLS) = σ²/Σ_{t=1}^T x²t. Also, var(β̂IV) = σ² Σ_{t=1}^T w²t /(Σ_{t=1}^T xt wt)² = σ²/(r²xw Σ_{t=1}^T x²t) = var(β̂OLS)/r²xw, where r²xw is the squared sample correlation between xt and wt. One can see that var(β̂OLS) ≤ var(β̂IV) since 0 ≤ r²xw ≤ 1. In fact, for a weak IV, r²xw is small and close to zero and var(β̂IV) blows up. A strong IV is one where r²xw is close to 1, and the closer it is, the closer var(β̂IV) is to var(β̂OLS). In this case, q̂ = β̂IV − β̂OLS and plim q̂ ≠ 0 under H1, while plim q̂ = 0 under H0, with

var(q̂) = var(β̂IV) − var(β̂OLS) = var(β̂OLS)(1 − r²xw)/r²xw

Therefore, Hausman’s test statistic is

m = q̂² r²xw /[var(β̂OLS)(1 − r²xw)]

which is asymptotically distributed as χ²_1 under H0. Note that the same estimator of σ² is used for var(β̂IV) and var(β̂OLS). This is the estimator of σ² obtained under H0.
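The scalar formula for m can be verified against the generic form q̂′[var(q̂)]⁻¹q̂ of (11.52); the two are algebraically identical. A small numerical check (simulated data, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500
w = rng.normal(size=T)
x = 0.6 * w + rng.normal(size=T)
y = 2.0 * x + rng.normal(size=T)

b_ols = np.sum(x * y) / np.sum(x * x)
b_iv = np.sum(w * y) / np.sum(w * x)
q = b_iv - b_ols

s2 = np.sum((y - b_ols * x) ** 2) / T        # sigma^2 estimated under H0
var_ols = s2 / np.sum(x * x)
r2_xw = np.sum(x * w) ** 2 / (np.sum(w * w) * np.sum(x * x))
var_iv = var_ols / r2_xw                      # blows up as r2_xw -> 0

m_general = q ** 2 / (var_iv - var_ols)       # generic form (11.52)
m_formula = q ** 2 * r2_xw / (var_ols * (1.0 - r2_xw))
print(m_general, m_formula)                   # identical
```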

The Hausman test can also be obtained from the following augmented regression:

yt = βxt + γx̂t + εt

where x̂t is the predicted value of xt from regressing it on the instrumental variable wt. Problem 13 asks the reader to show that Hausman’s test statistic can be obtained by testing γ = 0.
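The link between the augmented regression and the contrast q̂ = β̂IV − β̂OLS can be seen numerically: the OLS coefficient on x̂t equals q̂/(1 − r²xw), as Problem 13 shows. A sketch with simulated data (illustrative design):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 500
w = rng.normal(size=T)
x = 0.6 * w + rng.normal(size=T)
y = 2.0 * x + rng.normal(size=T)

# first stage: xhat from regressing x on w
xhat = (np.sum(x * w) / np.sum(w * w)) * w

# augmented regression y = beta*x + gamma*xhat + e
Z = np.column_stack([x, xhat])
coef = np.linalg.solve(Z.T @ Z, Z.T @ y)
gamma = coef[1]

# Hausman ingredients
b_ols = np.sum(x * y) / np.sum(x * x)
b_iv = np.sum(w * y) / np.sum(w * x)
q = b_iv - b_ols
r2 = np.sum(x * w) ** 2 / (np.sum(w * w) * np.sum(x * x))
print(gamma, q / (1.0 - r2))   # gamma_hat = q_hat/(1 - r2_xw)
```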

The IV estimator assumes that wt and ut are uncorrelated. If this is violated in practice, then the IV estimator is inconsistent, and the asymptotic bias can be aggravated if in addition the instrument is weak. To see this, write

plim β̂IV = β + cov(wt, ut)/cov(wt, xt) = β + [corr(wt, ut)/corr(xt, wt)](σu/σx)

where cov(wt, ut) and corr(xt, wt) denote the population covariance and correlation, respectively. Also, σx and σu denote the population standard deviations. If corr(wt, ut) ≠ 0, there is an asymptotic bias in β̂IV. If corr(wt, ut) is small and the instrument is strong (with a large corr(xt, wt)), this bias could be small. However, this bias could be large, even if corr(wt, ut) is small, in the case of weak instruments (small corr(xt, wt)). This warns researchers against using an instrument which they deem much better than xt because it is less correlated with ut. If this instrument is additionally weak, the bias of the resulting IV estimator will be enlarged due to the smaller corr(xt, wt). In sum, weak instruments may have large asymptotic bias in practice even if they are only slightly correlated with the error term.
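This bias formula is easy to simulate. Below, both instruments are slightly correlated with the error (corr(w, u) ≈ 0.05), but one is strong and one weak; the weak one produces a far larger bias, matching plim β̂IV − β = [corr(w, u)/corr(x, w)](σu/σx). (Simulated design, illustrative only.)

```python
import numpy as np

rng = np.random.default_rng(4)
T = 200_000
beta = 1.0
u = rng.normal(size=T)
e = rng.normal(size=T)

biases = {}
for pi, label in [(1.0, "strong"), (0.05, "weak")]:
    w = rng.normal(size=T) + 0.05 * u        # instrument slightly correlated with u
    x = pi * w + e                            # pi governs instrument strength corr(x, w)
    y = beta * x + u
    b_iv = np.sum(w * y) / np.sum(w * x)
    # predicted asymptotic bias: [corr(w,u)/corr(x,w)] * (sigma_u / sigma_x)
    pred = (np.corrcoef(w, u)[0, 1] / np.corrcoef(x, w)[0, 1]) * (u.std() / x.std())
    biases[label] = (b_iv - beta, pred)
    print(label, biases[label])
```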

In matrix form, the Durbin-Wu-Hausman test for the first structural equation is based upon the difference between OLS and IV estimation of (11.34) using the matrix of instruments W. In particular, the vector of contrasts is given by

q̂ = δ̂1,IV − δ̂1,OLS = (Z1′PW Z1)⁻¹[Z1′PW y1 − (Z1′PW Z1)(Z1′Z1)⁻¹Z1′y1]  (11.53)

= (Z1′PW Z1)⁻¹[Z1′PW P̄Z1 y1]

where P̄Z1 = I − PZ1.

Under the null hypothesis, q̂ = (Z1′PW Z1)⁻¹Z1′PW P̄Z1 u1. The test for q̂ = 0 can be based on the test for Z1′PW P̄Z1 u1 having mean zero asymptotically. This last vector is of dimension (g1 + k1). However, not all of its elements are necessarily random variables, since P̄Z1 may annihilate some columns of the second stage regressors Ẑ1 = PW Z1. In fact, all the included X’s which are part of W, i.e., X1, will be annihilated by P̄Z1. Only the g1 linearly independent variables Ŷ1 = PW Y1 are not annihilated by P̄Z1.
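The two expressions in (11.53) can be checked numerically: the contrast δ̂1,IV − δ̂1,OLS equals (Z1′PW Z1)⁻¹Z1′PW P̄Z1 y1. A sketch with simulated dimensions (one included endogenous variable, two excluded instruments; all names and the design are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
T = 200
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])   # included exogenous
X2 = rng.normal(size=(T, 2))                              # excluded exogenous
W = np.hstack([X1, X2])                                   # instrument set
u = rng.normal(size=T)
Y1 = (W @ rng.normal(size=4) + 0.5 * u + rng.normal(size=T))[:, None]
y1 = Y1[:, 0] + X1 @ [0.3, -0.2] + u
Z1 = np.hstack([Y1, X1])

Pw = W @ np.linalg.solve(W.T @ W, W.T)
d_iv = np.linalg.solve(Z1.T @ Pw @ Z1, Z1.T @ Pw @ y1)
d_ols = np.linalg.solve(Z1.T @ Z1, Z1.T @ y1)
q_direct = d_iv - d_ols

Pz1 = Z1 @ np.linalg.solve(Z1.T @ Z1, Z1.T)
Mz1_y = y1 - Pz1 @ y1                          # P-bar_Z1 y1
q_formula = np.linalg.solve(Z1.T @ Pw @ Z1, Z1.T @ Pw @ Mz1_y)
print(q_direct, q_formula)                      # identical
```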

Our test focuses on the vector Ŷ1′P̄Z1 u1 having mean zero asymptotically. Now consider the artificial regression

y1 = Z1δ1 + Ŷ1γ + residuals  (11.54)

Since [Z1, Ŷ1], [Z1, Ẑ1], [Z1, Z1 − Ẑ1] and [Z1, Y1 − Ŷ1] all span the same column space, this regression has the same residual sum of squares as

y1 = Z1δ1 + (Y1 − Ŷ1)η + residuals  (11.55)

The DWH test may be based on either of these regressions. It is equivalent to testing γ = 0 in (11.54) or η = 0 in (11.55) using an F-test. This is asymptotically distributed as F(g1, T − 2g1 − k1). Davidson and MacKinnon (1993, p. 239) warn about interpreting this test as one of exogeneity of Y1 (the variables in Z1 not in the space spanned by W). They argue that what is being tested is the consistency of the OLS estimates of δ1, not that every column of Z1 is independent of u1.
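The claim that (11.54) and (11.55) share the same residual sum of squares follows because [Z1, Ŷ1] and [Z1, Y1 − Ŷ1] span the same column space. A numerical check (simulated system with illustrative dimensions g1 = 1, k1 = 2):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 300
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])      # included exogenous
X2 = rng.normal(size=(T, 2))                                 # excluded exogenous
W = np.hstack([X1, X2])                                      # full instrument set

u = rng.normal(size=T)
Y1 = W @ rng.normal(size=(4, 1)) + 0.5 * u[:, None] + rng.normal(size=(T, 1))
y1 = (Y1 @ [1.0]) + X1 @ [0.5, -0.3] + u

Pw = W @ np.linalg.solve(W.T @ W, W.T)
Y1hat = Pw @ Y1
Z1 = np.hstack([Y1, X1])

def rss(M):
    b = np.linalg.lstsq(M, y1, rcond=None)[0]
    r = y1 - M @ b
    return float(r @ r)

rss_54 = rss(np.hstack([Z1, Y1hat]))          # regression (11.54): [Z1, Y1hat]
rss_55 = rss(np.hstack([Z1, Y1 - Y1hat]))     # regression (11.55): [Z1, Y1 - Y1hat]
print(rss_54, rss_55)                          # equal: same column space
```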

In practice, one may be sure about using W2 as a set of IVs but not sure whether some r additional variables in Z1 are legitimate as instruments. The DWH test in this case will be based upon the difference between two IV estimators of δ1. The first is δ̂2,IV based on W2 and the second is δ̂1,IV based on W1. The latter set includes W2 and the additional r variables in Z1.

δ̂2,IV − δ̂1,IV = (Z1′PW2 Z1)⁻¹[Z1′PW2 y1 − (Z1′PW2 Z1)(Z1′PW1 Z1)⁻¹Z1′PW1 y1]  (11.56)

= (Z1′PW2 Z1)⁻¹Z1′PW2 P̄PW1Z1 y1

since PW2 PW1 = PW2. The DWH test is based on this contrast having mean zero asymptotically. Once again, this last vector has dimension g1 + k1 and not all its elements are necessarily random variables, since P̄PW1Z1 annihilates some columns of PW2 Z1. This test can be based on the following artificial regression:

y1 = Z1δ1 + PW2 Z1*γ + residuals  (11.57)

where PW2 Z1* consists of the r columns of PW2 Z1 that are not annihilated by P̄PW1Z1. Regression (11.57) is performed with W1 as the set of IVs, and γ = 0 is tested using an F-test.

11.6 Empirical Examples

Example 1: Crime in North Carolina. Cornwell and Trumbull (1994) estimated an economic model of crime using data on 90 counties in North Carolina observed over the years 1981-87. This data set is available on the Springer web site as CRIME.DAT. Here, we consider cross-section data for 1987 and reconsider the full panel data set in Chapter 12. Table 11.2 gives the OLS estimates relating the crime rate (which is an FBI index measuring the number of crimes divided by the county population) to a set of explanatory variables. This was done using Stata. All variables are in logs except for the regional dummies. The explanatory variables consist of: the probability of arrest (measured by the ratio of arrests to offenses); the probability of conviction given arrest (measured by the ratio of convictions to arrests); the probability of a prison sentence given a conviction (measured by the proportion of total convictions resulting in prison sentences); the average prison sentence in days as a proxy for sanction severity; the number of police per capita as a measure of the county’s ability to detect crime; the population density, which is the county population divided by county land area; a dummy variable indicating whether the county is in an SMSA with population larger than 50,000; percent minority, which is the

Table 11.2 Least Squares Estimates: Crime in North Carolina





Number of obs = 90; F(20,69) = 19.71; Prob > F = 0.0000; R-squared = 0.8510; Adj R-squared = 0.8078; Root MSE = .24054. (The coefficient columns of Table 11.2 — Coef., Std. Err., t, P>|t|, and [95% Conf. Interval] — were lost in extraction.)

Table 11.3 Instrumental Variables (2SLS) Estimates: Crime in North Carolina

Number of obs = 90; F(20,69) = 17.35; Prob > F = 0.0000; R-squared = 0.8446; Adj R-squared = 0.7996; Root MSE = .24568. (The coefficient columns — Coef., Std. Err., t, P>|t|, and [95% Conf. Interval] — were lost in extraction; only scattered values such as −.4393081 and −.2713278 survived.)

Instrumented: lprbarr lpolpc

Instruments: lprbconv lprbpris lavgsen ldensity lwcon lwtuc lwtrd lwfir lwser lwmfg lwfed

lwsta lwloc lpctymle lpctmin west central ltaxpc lmix

proportion of the county’s population that is minority or non-white; percent young male, which is the proportion of the county’s population that is male and between the ages of 15 and 24; regional dummies for western and central counties; and opportunities in the legal sector, captured by the average weekly wage in the county by industry. These industries are: construction; transportation, utilities and communication; wholesale and retail trade; finance, insurance and real estate; services; manufacturing; and federal, state and local government.

Results show that the probability of arrest as well as the probability of conviction given arrest have negative and significant effects on the crime rate, with estimated elasticities of −0.45 and −0.30, respectively. The probability of imprisonment given conviction as well as the sentence severity have negative but insignificant effects on the crime rate. The greater the number of police per capita, the greater the number of reported crimes per capita. The estimated elasticity is 0.36 and it is significant. This could be explained by the fact that the larger the police force, the larger the reported crime. Alternatively, this could be an endogeneity problem, with more crime resulting








Table 11.4 Hausman’s Test: Crime in North Carolina

(The columns contrasting the 2SLS coefficients (b), the OLS coefficients (B), their difference (b−B), and sqrt(diag(V_b−V_B)) S.E. were lost in extraction.)

b = consistent under Ho and Ha; obtained from ivreg
B = inconsistent under Ha, efficient under Ho; obtained from regress

Test: Ho: difference in coefficients not systematic

chi2(20) = (b−B)′[(V_b−V_B)^(−1)](b−B) = 0.87
Prob > chi2 = 1.0000

in the hiring of more police. The higher the density of the population, the higher the crime rate. The estimated elasticity is 0.31 and it is significant. Returns to legal activity are insignificant except for wages in the service sector. This has a negative and significant effect on crime, with an estimated elasticity of −0.27. Percent young male is insignificant, while percent minority is positive and significant with an estimated elasticity of 0.22. The central dummy variable is negative and significant, while the western dummy variable is not significant. Also, the urban dummy variable is insignificant. Cornwell and Trumbull (1994) worried about the endogeneity of police per capita and the probability of arrest. They used two additional variables as instruments. The first is offense mix, the ratio of crimes involving face-to-face contact (such as robbery, assault and rape) to those that do not. The rationale for using this variable is that arrest is facilitated by positive identification of the offender. The second instrument is per capita tax revenue. This is justified on the basis that counties with preferences for law enforcement will vote for higher taxes to fund a larger police force.


The 2SLS estimates are reported in Table 11.3. The probability of arrest has an estimated elasticity of −0.44, but now with a p-value of 0.057. The probability of conviction given arrest has an estimated elasticity of −0.27, still significant. The probability of imprisonment given conviction is still insignificant, while the sentence severity is now negative and significant with an estimated elasticity of −0.28. Police per capita has a higher elasticity of 0.51, still significant. The remaining estimates are slightly affected. In fact, the Hausman test based on the difference between the OLS and 2SLS estimates is shown in Table 11.4. This is computed using Stata and it contrasts 20 slope coefficient estimates. The Hausman test statistic is 0.87 and is asymptotically distributed as χ²_20. This is insignificant, and shows that the 2SLS and OLS estimates are not significantly different given this model specification and the specific choice of instruments. Note that this is a just-identified equation and one cannot test for over-identification.

Note that these 2SLS estimates and the Hausman test are not robust to heteroskedasticity. Table 11.5 gives the 2SLS estimates obtained by running ivregress in Stata with the robust variance-covariance matrix option.

Note that the estimates remain the same as in Table 11.3, but the standard errors for the right hand side endogenous variables Y1 are now larger. The option (estat endogenous) generates the Hausman test which tests whether 2SLS is different from OLS, based now on the robust variance-covariance estimate. The observed F(2,67) statistic is 0.455, with a p-value of 0.636, which is not significant. This F-statistic could have been generated by an artificial regression as follows: obtain the residuals from the first stage regressions, see Tables 11.6 and 11.7 below. Call these residuals v1hat and v2hat. Include them as additional variables in the original equation and run robust least squares. The robust Hausman test is equivalent to testing that the coefficients of these two residuals are jointly zero. The Stata commands (without showing the output of this artificial regression) and the resulting test statistic are shown below:

. quietly regress lcrmrte lprbarr lprbconv lprbpris lavgsen lpolpc ldensity lwcon lwtuc lwtrd lwfir lwser lwmfg lwfed lwsta lwloc lpctymle lpctmin west central urban v1hat v2hat if year==87, vce(robust)

. test v1hat v2hat

(1) v1hat = 0

(2) v2hat = 0

F(2,67) = 0.46

Prob > F = 0.6361

This is the same statistic obtained above with estat endogenous.
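The control-function logic behind this artificial regression can be sketched in Python (a generic illustration with simulated data, not the crime data; with one endogenous regressor, the joint F-test on the first-stage residuals reduces to a t-test):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 400
w = rng.normal(size=T)
u = rng.normal(size=T)
x = 0.7 * w + 0.6 * u + rng.normal(size=T)    # endogenous regressor
y = 1.5 * x + u

# first stage: residuals of x on the instrument w
vhat = x - (np.sum(x * w) / np.sum(w * w)) * w

# augmented regression: y on x and vhat; t-test on vhat's coefficient
Z = np.column_stack([x, vhat])
b = np.linalg.solve(Z.T @ Z, Z.T @ y)
resid = y - Z @ b
s2 = resid @ resid / (T - 2)
cov_b = s2 * np.linalg.inv(Z.T @ Z)
t_vhat = b[1] / np.sqrt(cov_b[1, 1])
print(t_vhat)   # large in absolute value: endogeneity is detected
```

A heteroskedasticity-robust version would simply replace cov_b by a sandwich variance estimate, which is what the vce(robust) option does in Stata.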

Tables 11.6 and 11.7 give the first-stage regressions for the probability of arrest and police per capita. The R² of these regressions are 0.47 and 0.56, respectively. The F-statistics for the significance of all slope coefficients are 3.11 and 4.42, respectively. The additional instruments (offense mix and per capita tax revenue) are jointly significant in both regressions, yielding F-statistics of 5.78 and 10.56 with p-values of 0.0048 and 0.0001, respectively. Although there are two right hand side endogenous regressors in the crime equation rather than one, the Stock and Watson ‘rule of thumb’ suggests that these instruments may be weak.

Table 11.5 Instrumental Variables (2SLS) Regression with Robust Standard Errors: Crime in North Carolina

Instrumental variables (2SLS) regression. (The summary statistics — Number of obs, Wald chi2(20), Prob > chi2, R-squared, Root MSE — and the coefficient table, with Robust Std. Err., P>|z| and [95% Conf. Interval] columns, were lost in extraction.)

Instrumented: lprbarr lpolpc

Instruments: lprbconv lprbpris lavgsen ldensity lwcon lwtuc lwtrd lwfir lwser lwmfg lwfed lwsta

lwloc lpctymle lpctmin west central urban ltaxpc lmix

One can obtain a battery of diagnostics on weak instruments after (ivregress 2sls) in Stata by issuing the command (estat firststage). This is done in Table 11.8. The option forcenonrobust forces these diagnostics to be computed for a robust regression, where the econometric theory behind their derivation need not apply.

This is a just-identified equation, so we cannot test over-identification. We have already seen the R-squared of the first stage regressions (0.47 and 0.56). These are not low enough to flag possible weak instruments. But these R-squared measures may be due mostly to the inclusion of the right hand side exogenous variables X1. We want to know the additional contribution of the instruments X2 = (ltaxpc, lmix) over and above X1. The partial R-squared provides such a measure and yields lower numbers. For lprbarr, this is 0.32. This is the squared correlation between lprbarr and the instruments X2 = (ltaxpc, lmix) after partialling out the right hand side exogenous variables X1. The F(2,69) statistic tests the joint significance of the excluded instruments X2 in the first stage regressions. These F-statistics of 6.58 and 6.68 are certainly less than 10.
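The partial R-squared and the partial F-statistic described here can be computed by partialling X1 out of both the endogenous regressor and the excluded instruments, as sketched below (simulated data; the dimensions and design are illustrative, not the crime data):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 90
X1 = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])   # included exogenous
X2 = rng.normal(size=(T, 2))                                   # excluded instruments
x_endog = X1 @ rng.normal(size=4) + X2 @ [0.4, 0.3] + rng.normal(size=T)

def residuals(y, X):
    return y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

e_x = residuals(x_endog, X1)                 # endogenous regressor purged of X1
e_Z = np.column_stack([residuals(X2[:, j], X1) for j in range(X2.shape[1])])

fit = e_Z @ np.linalg.lstsq(e_Z, e_x, rcond=None)[0]
partial_r2 = 1.0 - np.sum((e_x - fit) ** 2) / np.sum(e_x ** 2)

# partial F of the excluded instruments (compare to the rule of thumb of 10)
k2 = X2.shape[1]
df = T - X1.shape[1] - k2
partial_F = (partial_r2 / k2) / ((1.0 - partial_r2) / df)
print(partial_r2, partial_F)
```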





Table 11.6 First Stage Regression: lprbarr

Number of obs = 90; F(20,69) = 3.11; Prob > F = 0.0002; R-squared = 0.4742; Adj R-squared = 0.3218; Root MSE = 0.33174. (The coefficient table, with Std. Err., P>|t| and [95% Conf. Interval] columns, was lost in extraction.)

This is the rule of thumb suggested by Stock and Watson for the case of one right hand side endogenous variable. Note that we computed these statistics above, but without the robust variance-covariance matrix option. Shea’s partial R-squared of 0.135 for lprbarr is the R-squared from a regression of residuals on residuals. The first set of residuals comes from regressing lprbarr on the right hand side included exogenous variables X1. The second set of residuals comes from regressing the right hand side included exogenous variables X1 on the set of instruments X2 = (ltaxpc, lmix).

Because we have more than one right hand side endogenous variable, Stock and Yogo (2005) suggest using the minimum eigenvalue of a matrix analog of the F-statistic originally proposed by Cragg and Donald (1993) to test for identification. A low minimum eigenvalue statistic indicates weak instruments. If there is only one right hand side endogenous variable, this reverts back to the F-statistic, which is compared to 10 by the ad hoc rule of Stock and Watson. The critical values for this minimum eigenvalue statistic depend on how much bias we are willing to tolerate relative to OLS in the case of weak instruments. This is available only when the degree of over-identification is two or more. This is why Stata does not report it in this just-identified example. However, Stata does report a second test which applies even for the





Table 11.7 First Stage Regression: lpolpc

Number of obs = 90; F(20,69) = 4.42; Prob > F = 0.0000; R-squared = 0.5614; Adj R-squared = 0.4343; Root MSE = 0.28148. (The coefficient table, with Std. Err., P>|t| and [95% Conf. Interval] columns, was lost in extraction.)

just-identified case. This is based on size distortions of the Wald test for the joint significance of the right hand side endogenous variables Y1 at the 5% level. The observed minimum eigenvalue statistic of 5.31 is between the critical values of 7.03 and 4.58, and indicates a 2SLS relative bias of more than 10% and less than 15% when it should be 5%.
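The Cragg-Donald minimum eigenvalue statistic can be sketched in a few lines: partial the included exogenous variables out of both the endogenous regressors and the excluded instruments, and take the smallest eigenvalue of the matrix analog of the first-stage F. (Simulated data with two endogenous regressors and two excluded instruments, mirroring the dimensions of this example; the design is an illustrative assumption.)

```python
import numpy as np

rng = np.random.default_rng(8)
T = 90
X1 = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])   # included exogenous
X2 = rng.normal(size=(T, 2))                                   # excluded instruments
Y = X1 @ rng.normal(size=(4, 2)) + X2 @ rng.normal(size=(2, 2)) * 0.4 + rng.normal(size=(T, 2))

def resid(A, X):
    return A - X @ np.linalg.lstsq(X, A, rcond=None)[0]

Yt = resid(Y, X1)            # endogenous regressors, X1 partialled out
X2t = resid(X2, X1)          # excluded instruments, X1 partialled out

P = X2t @ np.linalg.solve(X2t.T @ X2t, X2t.T)
k2 = X2.shape[1]
Sigma = Yt.T @ (Yt - P @ Yt) / (T - X1.shape[1] - k2)   # residual covariance
# minimum eigenvalue of Sigma^{-1/2} Yt' P Yt Sigma^{-1/2} / k2
vals, vecs = np.linalg.eigh(Sigma)
S_inv_half = vecs @ np.diag(vals ** -0.5) @ vecs.T
G = S_inv_half @ (Yt.T @ P @ Yt) @ S_inv_half / k2
cd_min_eig = float(np.min(np.linalg.eigvalsh(G)))
print(cd_min_eig)
```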

Example 2: Growth and Inequality Reconsidered. Lundberg and Squire (2003) estimate a two-equation model of growth and inequality using 3SLS; see section 10.5 for the SUR specification where all the explanatory variables were assumed to be exogenous. The first equation relates growth (dly) to education (adult years of schooling: yrt), the share of government consumption in GDP (gov), M2/GDP (m2y), inflation (inf), the Sachs-Warner measure of openness (swo), changes in the terms of trade (dtot), initial income (f pcy), a dummy for the 1980s (d80) and a dummy for the 1990s (d90). The second equation relates the Gini coefficient (gih) to education, M2/GDP, the civil liberties index (civ), mean land Gini (mlg), and mean land Gini interacted with a dummy for developing countries (mlgldc). The data contain 119 observations for 38 countries over the period 1965-1990, and can be obtained from

http://www.res.org.uk/economic/datasets/datasetlist.asp.

Table 11.8 First Stage Diagnostics

. estat firststage, forcenonrobust all

First-stage regression summary statistics. (The table reporting, for each instrumented variable, the partial R-squared, adjusted partial R-squared, Shea’s partial R-squared, the robust F(2,69) statistic and its Prob > F was lost in extraction.)

Minimum eigenvalue statistic = 5.31166

Critical Values (Ho: Instruments are weak; # of endogenous regressors: 2; # of excluded instruments: 2): 2SLS relative bias (not available); 2SLS size of nominal 5% Wald test; LIML size of nominal 5% Wald test. (The critical values themselves were lost in extraction.)

Education, government consumption, M2/GDP, inflation, the Sachs-Warner measure of openness, the civil liberties index, mean land Gini, and mean land Gini interacted with a dummy for developing countries (ldc) are assumed to be endogenous. Instruments include initial values of all variables (except land Gini and income), population, urban share, life expectancy, fertility, initial female literacy and democracy, arable area, dummies for oil and non-oil commodity exporters, and legal origin. Table 11.9 reports the 3SLS estimates of these two equations using the reg3 command in Stata. The results replicate those reported in Table 1 of Lundberg and Squire (2003, p. 334). Allowing for endogeneity, these results still show that openness enhances growth and education reduces inequality.


1. A heteroskedasticity-robust statistic is recommended especially if y2 has discrete characteristics.

2. Why 10? See the proof in Appendix 10.4 of Stock and Watson (2003).

3. This test is also known as the Durbin-Wu-Hausman test, following the work of Durbin (1954), Wu (1973) and Hausman (1978).

reg3 (Growth: dly = yrt gov m2y inf swo dtot f pcy d80 d90) (Inequality: gih = yrt m2y civ mlg mlgldc), exog(commod f civ f dem f dtot f flit f gov f inf f m2y f swo f yrt pop urb lex lfr marea oil legor fr legor ge legor mx legor sc legor uk) endog(yrt gov m2y inf swo civ mlg mlgldc)

Table 11.9 3SLS Estimates: Growth and Inequality

Three-stage least-squares regression. (The equation-level summary statistics and the coefficient table, with Std. Err., P>|z| and [95% Conf. Interval] columns, were lost in extraction.)

Endogenous variables: dly gih yrt gov m2y inf swo civ mlg mlgldc

Exogenous variables: dtot f pcy d80 d90 commod f civ f dem f dtot f flit f gov f inf f m2y f swo

f yrt pop urb lex lfr marea oil legor fr legor ge legor mx legor sc legor uk


Problems

1. The Inconsistency of OLS. Show that the OLS estimator of δ in (11.14), which can be written as

δ̂OLS = Σ_{t=1}^T pt qt / Σ_{t=1}^T p²t

is not consistent for δ. Hint: Write δ̂OLS = δ + Σ_{t=1}^T pt(u2t − ū2)/Σ_{t=1}^T p²t, and use (11.18) to show that

plim δ̂OLS = δ + (σ12 − σ22)(δ − β)/[σ11 + σ22 − 2σ12].

2. When Is the IV Estimator Consistent? Consider equation (11.30) and let X3 and X4 be the only two other exogenous variables in this system.

(a) Show that a two-stage estimator which regresses y2 on X1, X2 and X3 to get ŷ2, where y2 = ŷ2 + v̂2, and y3 on X1, X2 and X4 to get ŷ3, where y3 = ŷ3 + v̂3, and then regresses y1 on ŷ2, ŷ3, X1 and X2, does not necessarily yield consistent estimators. Hint: Show that the composite error is ε1 = (u1 + α12v̂2 + α13v̂3), and that consistency requires Σ_{t=1}^T ŷ2t ε1t = 0, which fails because Σ_{t=1}^T ŷ2t v̂3t ≠ 0. The latter is nonzero because Σ_{t=1}^T X3t v̂3t ≠ 0. (This shows that if both y’s are not regressed on the same set of X’s, the resulting two stage regression estimates are not consistent.)

(b) Show that the two-stage estimator which regresses y2 and y3 on X2, X3 and X4 to get

ŷ2, where y2 = ŷ2 + v̂2, and ŷ3, where y3 = ŷ3 + v̂3, and then regresses y1 on ŷ2, ŷ3, X1 and X2, is not

necessarily consistent. Hint: Show that the composite error term ε1 = u1 + α12v̂2 + α13v̂3 does not satisfy Σ_{t=1}^T ε1t X1t = 0, since Σ_{t=1}^T v̂2t X1t ≠ 0 and Σ_{t=1}^T v̂3t X1t ≠ 0. (This shows that if one of the included X’s is not included in the first stage regression, then the resulting two-stage regression estimates are not consistent.)

3. Just-Identification and Simple IV. If equation (11.34) is just-identified, then X2 is of the same dimension as Y1, i.e., both are T × g1. Hence, Z1 is of the same dimension as X, both of dimension T × (g1 + k1). Therefore, X′Z1 is a square nonsingular matrix of dimension (g1 + k1). Hence, (Z1′X)⁻¹ exists. Using this fact, show that δ̂1,2SLS given by (11.36) reduces to (X′Z1)⁻¹X′y1. This is exactly the IV estimator with W = X, given in (11.41). Note that this is only feasible if X′Z1 is square and nonsingular.

4. 2SLS Can Be Obtained as GLS. Premultiply equation (11.34) by X′ and show that the transformed disturbances X′u1 have mean zero and variance σ11(X′X). Perform GLS on the resulting transformed equation and show that δ̂1,GLS is δ̂1,2SLS, given by (11.36).

5. The Equivalence of 3SLS and 2SLS.

(a) Show that δ̂3SLS given in (11.46) reduces to δ̂2SLS when (i) Σ is diagonal, or (ii) every equation in the system is just-identified. Hint: For (i), show that Σ⁻¹ ⊗ PX is block-diagonal with the i-th block consisting of PX/σii. Also, Z is block-diagonal; therefore, {Z′[Σ⁻¹ ⊗ PX]Z}⁻¹ is block-diagonal with the i-th block consisting of σii(Zi′PX Zi)⁻¹. Similarly, computing Z′[Σ⁻¹ ⊗ PX]y, one can show that the i-th element of δ̂3SLS is (Zi′PX Zi)⁻¹Zi′PX yi = δ̂i,2SLS. For (ii), show that Zi′X is square and nonsingular under just-identification. Therefore, δ̂i,2SLS = (X′Zi)⁻¹X′yi from problem 3. Also, from (11.44), we get

δ̂3SLS = {diag[Zi′X](Σ⁻¹ ⊗ (X′X)⁻¹)diag[X′Zi]}⁻¹{diag[Zi′X](Σ⁻¹ ⊗ (X′X)⁻¹)(IG ⊗ X′)y}.

Using the fact that Zi′X is square, one can show that δ̂i,3SLS = (X′Zi)⁻¹X′yi.

(b) Premultiply the system of equations in (11.43) by (IG ⊗ PX) and let y* = (IG ⊗ PX)y, Z* = (IG ⊗ PX)Z and u* = (IG ⊗ PX)u, so that y* = Z*δ + u*. Show that OLS on this transformed model yields 2SLS on each equation in (11.43). Show that GLS on this model yields 3SLS (knowing the true Σ) given in (11.45). Note that var(u*) = Σ ⊗ PX and its generalized inverse is Σ⁻¹ ⊗ PX. Use the Milliken and Albohali condition for the equivalence of OLS and GLS given in equation (9.7) of Chapter 9 to deduce that 3SLS is equivalent to 2SLS if Z*′(Σ⁻¹ ⊗ PX)P̄Z* = 0. Show that this reduces to the following necessary and sufficient condition: σ^{ij}Ẑi′P̄Ẑj = 0 for i ≠ j, see Baltagi (1989). Hint: Use the fact that

Z* = diag[PX Zi] = diag[Ẑi] and PZ* = diag[PẐi].

Verify that the two sufficient conditions given in part (a) satisfy this necessary and sufficient condition.

6. Consider the following demand and supply equations:

Q = a − bP + u1
Q = c + dP + eW + fL + u2

where W denotes weather conditions affecting supply, and L denotes the supply of immigrant workers available at harvest time.

(a) Write this system in the matrix form given by equation (A.1) in the Appendix.

(b) What does the order-condition for identification say about these two equations?

(c) Premultiply this system by a nonsingular matrix F = [fij], for i, j = 1, 2. What restrictions must the matrix F satisfy if the transformed model is to satisfy the same restrictions as the original model? Show that the first row of F is in fact the first row of an identity matrix, but the second row of F is not the second row of an identity matrix. What do you conclude?

7. Answer the same questions in problem 6 for the following model:

Q = a − bP + cY + dA + u1
Q = e + fP + gW + hL + u2

where Y is real income and A is real assets.

8. Consider example (A.1) in the Appendix. Recall that the system of equations (A.3) and (A.4) is just-identified.

(a) Construct φ for the demand equation (A.3) and show that Aφ = (0, −f)′, which is of rank 1 as long as f ≠ 0. Similarly, construct φ for the supply equation (A.4) and show that Aφ = (−c, 0)′, which is of rank 1 as long as c ≠ 0.

(b) Using equation (A.17), show how the structural parameters can be retrieved from the reduced form parameters. Derive the reduced form equations for this system and verify the above relationships relating the reduced form and structural form parameters.

9. Derive the reduced form equations for the model given in problem 6, and show that the structural parameters of the second equation cannot be derived from the reduced form parameters. Also, show that there is more than one way of expressing the structural parameters of the first equation in terms of the reduced form parameters.

10. Just-Identified Model. Consider the just-identified equation

y1 = Z1δ1 + u1

with W, the matrix of instruments for this equation, of dimension T × ℓ, where ℓ = g1 + k1, the dimension of Z1. In this case, W′Z1 is square and nonsingular.

(a) Show that the generalized instrumental variable estimator given below (11.41) reduces to the simple instrumental variable estimator given in (11.38).

(b) Show that the minimized value of the criterion function for this just-identified model is zero, i.e., show that (y1 − Z1δ̂1,IV)′PW(y1 − Z1δ̂1,IV) = 0.

(c) Conclude that the residual sum of squares of the second stage regression of this just-identified model is the same as that obtained by regressing y1 on the matrix of instruments W, i.e., show that

PẐ1 = PPWZ1 = PW, under just-identification.

11. The More Valid Instruments the Better. Let W1 and W2 be two sets of instrumental variables for the first structural equation given in (11.34). Suppose that W1 is spanned by the space of W2. Verify that the resulting IV estimator of δ1 based on W2 is at least as efficient as that based on W1. Hint: Show that PW2W1 = W1 and that PW2 − PW1 is idempotent. Conclude that the difference in the corresponding asymptotic covariances of these IV estimators is positive semi-definite. (This shows that increasing the number of legitimate instruments should improve the asymptotic efficiency of an IV estimator.)

12. Testing for Over-Identification. In testing H0: γ = 0 versus H1: γ ≠ 0 in section 11.4, equation (11.47):

(a) Show that the second stage regression of 2SLS on the unrestricted model y1 = Z1δ1 + W*γ + u1 with the matrix of instruments W yields the following residual sum of squares:

URSS* = y1′P̄W y1 = y1′y1 − y1′PW y1

Hint: Use the results of problem 10 for the just-identified case.

(b) Show that the second stage regression of 2SLS on the restricted model y1 = Z1δ1 + u1 with the matrix of instruments W yields the following residual sum of squares:

RRSS* = y1′P̄Ẑ1 y1 = y1′y1 − y1′PẐ1 y1

where Ẑ1 = PW Z1 and PẐ1 = PW Z1(Z1′PW Z1)⁻¹Z1′PW. Conclude that RRSS* − URSS* yields (11.49).

(c) Consider the test statistic (RRSS* − URSS*)/σ̂11, where σ̂11 is given by (11.50) as the usual 2SLS residual sum of squares under H0 divided by T. Show that it can be written as Hausman’s (1983) test statistic, i.e., TR²u, where R²u is the uncentered R² of the regression of the 2SLS residuals (y1 − Z1δ̂1,2SLS) on the matrix of all pre-determined variables W. Hint: Show that the regression sum of squares (y1 − Z1δ̂1,2SLS)′PW(y1 − Z1δ̂1,2SLS) = (RRSS* − URSS*) given in (11.49).

(d) Verify that the test for H0 based on the GNR for the model given in part (a) yields the same TR²u test statistic described in part (c).

13. Hausman’s Specification Test: OLS versus 2SLS. This is based on Maddala (1992, page 511). For the simple regression

yt = fat + ut t = 1, 2 . ..,T

where в is scalar and ut ~ IIN(0,<r2). Let wt be an instrumental variable for xt. Run xt on wt and get xt = Trwt + fit or xt = Xt + fit where Xt = nwt.

(a) Show that in the augmented regression yt = βxt + γx̂t + εt, a test for γ = 0 based on OLS from this regression yields Hausman's test statistic. Hint: Show that γ̂OLS = q̂/(1 − r²xw), where q̂ = β̂IV − β̂OLS and

r²xw = (ΣTt=1 xtwt)²/(ΣTt=1 w²t · ΣTt=1 x²t).

Next, show that var(γ̂OLS) = var(β̂OLS)/[r²xw(1 − r²xw)]. Conclude that

γ̂²OLS/var(γ̂OLS) = q̂²r²xw/[var(β̂OLS)(1 − r²xw)]

is the Hausman (1978) test statistic m given in section 11.5.

(b) Show that the same result in part (a) could have been obtained from the augmented regression

yt = βxt + γv̂t + ηt

where v̂t is the residual from the regression of xt on wt.
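The algebra in parts (a) and (b) can be verified on simulated data. The sketch below (with an arbitrary data generating process) checks that the OLS coefficient on x̂t in the augmented regression equals q̂/(1 − r²xw), with q̂ = β̂IV − β̂OLS, and that the two augmented regressions have the same residual sum of squares:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100
w = rng.standard_normal(T)                 # instrument
v = rng.standard_normal(T)
x = 1.5 * w + v                            # endogenous regressor
y = 0.7 * x + (0.6 * v + rng.standard_normal(T))   # ut correlated with v

xhat = (x @ w) / (w @ w) * w               # fitted values from x on w
vhat = x - xhat                            # first-stage residuals

b_ols = (x @ y) / (x @ x)
b_iv = (w @ y) / (w @ x)
q = b_iv - b_ols
r2 = (x @ w) ** 2 / ((w @ w) * (x @ x))

# augmented regression (a): y on x and xhat
Xa = np.column_stack([x, xhat])
coef_a, _, _, _ = np.linalg.lstsq(Xa, y, rcond=None)
gamma_a = coef_a[1]
print(np.isclose(gamma_a, q / (1 - r2)))   # True: gamma_hat = q/(1 - r2_xw)

# augmented regression (b): y on x and vhat has the same RSS
Xb = np.column_stack([x, vhat])
coef_b, _, _, _ = np.linalg.lstsq(Xb, y, rcond=None)
rss_a = np.sum((y - Xa @ coef_a) ** 2)
rss_b = np.sum((y - Xb @ coef_b) ** 2)
print(np.isclose(rss_a, rss_b))            # True
```

The two regressions span the same column space (since v̂t = xt − x̂t), so they give identical fits and identical tests of the added-regressor restriction.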

14. Consider the following structural equation: y1 = α12y2 + α13y3 + β11X1 + β12X2 + u1, where y2 and y3 are endogenous and X1 and X2 are exogenous. Also, suppose that the excluded exogenous variables include X3 and X4.

(a) Show that Hausman’s test statistic can be obtained from the augmented regression:

y1 = α12y2 + α13y3 + β11X1 + β12X2 + γ2ŷ2 + γ3ŷ3 + ε1

where ŷ2 and ŷ3 are the predicted values from regressing y2 and y3 on X = [X1, X2, X3, X4]. Hausman's test is equivalent to testing H0: γ2 = γ3 = 0. See equation (11.54).

(b) Show that the same results in part (a) hold if we had used the following augmented regression:

y1 = α12y2 + α13y3 + β11X1 + β12X2 + γ2v̂2 + γ3v̂3 + η1

where v̂2 and v̂3 are the residuals from running y2 and y3 on X = [X1, X2, X3, X4]. See equation (11.55). Hint: Show that the regressions in (a) and (b) have the same residual sum of squares.

15. For the artificial regression given in (11.55):

(a) Show that OLS on this model yields δ̂1,OLS = δ̂1,IV = (Z1'PW Z1)⁻¹Z1'PW y1. Hint: Y1 − Ŷ1 = (I − PW)Y1. Use the FWL Theorem to residual out these variables in (11.55) and use the fact that

(I − PW)Z1 = [(I − PW)Y1, 0].

(b) Show that var(δ̂1,OLS) = s11(Z1'PW Z1)⁻¹ where s11 is the mean squared error of the OLS regression in (11.55). Note that when η ≠ 0 in (11.55), IV estimation is necessary and s11 underestimates σ11 and will have to be replaced by (y1 − Z1δ̂1,IV)'(y1 − Z1δ̂1,IV)/T.

16. Recursive Systems. A recursive system has two crucial features: B is a triangular matrix and Σ is a diagonal matrix. For this special case of the simultaneous equations model, OLS is still consistent, and under normality of the disturbances, still maximum likelihood. Let us consider a specific example:

y1t + γ11x1t + γ12x2t = u1t
β21y1t + y2t + γ23x3t = u2t

In this case, B = [1  0; β21  1] is triangular and Σ = [σ11  0; 0  σ22] is assumed diagonal.

(a) Check the identifiability conditions of this recursive system.

(b) Solve for the reduced form and show that y1t is only a function of the xt’s and u1t, while y2t is a function of the xt’s and a linear combination of u1t and u2t.

(c) Show that OLS on the first structural equation yields consistent estimates. Hint: There are no right hand side y’s for the first equation. Show that despite the presence of y1 in the second equation, OLS of y2 on y1 and x3 yields consistent estimates. Note that y1 is a function of u1 only and u1 and u2 are not correlated.

(d) Under the normality assumption on the disturbances, the likelihood function conditional on the x's is given by

L(B, Γ, Σ) = (2π)⁻ᵀ|B|ᵀ|Σ|⁻ᵀ/² exp(−(1/2) ΣTt=1 u't Σ⁻¹ ut)

where in this two-equation case u't = (u1t, u2t). Since B is triangular, |B| = 1. Show that maximizing L with respect to B and Γ is equivalent to minimizing Q = ΣTt=1 u't Σ⁻¹ ut. Conclude that when Σ is diagonal, Σ⁻¹ is diagonal and Q = ΣTt=1 u²1t/σ11 + ΣTt=1 u²2t/σ22. Hence, maximizing the likelihood with respect to B and Γ is equivalent to running OLS on each equation separately.
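The consistency claim in part (c) can be illustrated by simulation. In the sketch below the structural coefficients are arbitrary, Σ is diagonal, and OLS on the second equation recovers the structural coefficients despite the presence of y1 on the right hand side:

```python
import numpy as np

rng = np.random.default_rng(3)
T = 100_000
x1, x2, x3 = (rng.standard_normal(T) for _ in range(3))
u1 = rng.standard_normal(T)                # Sigma diagonal: u1, u2 independent
u2 = rng.standard_normal(T)

g11, g12, b21, g23 = 0.5, -1.0, 0.8, 0.3   # illustrative structural values
y1 = -g11 * x1 - g12 * x2 + u1             # y1t + g11 x1t + g12 x2t = u1t
y2 = -b21 * y1 - g23 * x3 + u2             # b21 y1t + y2t + g23 x3t = u2t

# OLS on the second structural equation: regress y2 on y1 and x3
Z = np.column_stack([y1, x3])
coef, _, _, _ = np.linalg.lstsq(Z, y2, rcond=None)
print(coef)   # close to (-b21, -g23) = (-0.8, -0.3)
```

OLS works here because y1 depends only on u1, and u1 is uncorrelated with u2, so the right hand side regressor in the second equation is uncorrelated with that equation's disturbance.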

17. Hausman's Specification Test: 2SLS Versus 3SLS. This is based on Holly (1988). Consider the two-equation model,

y1 = αy2 + β1x1 + β2x2 + u1

y2 = γy1 + β3x3 + u2

where y1 and y2 are endogenous; x1, x2 and x3 are exogenous (the y's and the x's are n × 1 vectors). The standard assumptions are made on the disturbance vectors u1 and u2. With the usual notation, the model can also be written as

y1 = Z1δ1 + u1

y2 = Z2δ2 + u2

The following notation will be used: δ̂ denotes 2SLS and δ̃ denotes 3SLS, and the corresponding residuals will be denoted by û and ũ, respectively.

(a) Assume that αγ ≠ 1. Show that the 3SLS estimating equations reduce to

σ̂¹¹X'ũ1 + σ̂¹²X'ũ2 = 0
σ̂¹²Z2'PX ũ1 + σ̂²²Z2'PX ũ2 = 0

where X = (x1, x2, x3), Σ = [σij] is the structural form covariance matrix, and Σ⁻¹ = [σⁱʲ] for i, j = 1, 2.

(b) Deduce that δ̃2 = δ̂2 and δ̃1 = δ̂1 − (σ̂12/σ̂22)(Z1'PX Z1)⁻¹Z1'PX û2. This proves that the 3SLS estimator of the over-identified second equation is equal to its 2SLS counterpart. Also, the 3SLS estimator of the just-identified first equation differs from its 2SLS (or indirect least squares) counterpart by a linear combination of the 2SLS (or 3SLS) residuals of the over-identified equation, see Theil (1971).

(c) How would you interpret a Hausman-type test where you compare δ̂1 and δ̃1? Show that it is nR² where R² is the R-squared of the regression of û2 on the set of second stage regressors of both equations, Ẑ1 and Ẑ2. Hint: See the solution by Baltagi (1989).
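The results in part (b) can be verified numerically. The sketch below (arbitrary parameter values with αγ ≠ 1) computes 2SLS equation by equation, builds the 3SLS estimator from the stacked system, and checks that δ̃2 = δ̂2 exactly while δ̃1 differs from δ̂1 by the stated linear combination of the residuals û2:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200
X = rng.standard_normal((N, 3))                       # x1, x2, x3
U = rng.multivariate_normal([0, 0], [[1.0, 0.5], [0.5, 1.0]], size=N)
alpha, gamma, b1, b2, b3 = 0.5, 0.3, 1.0, -1.0, 0.8   # illustrative values

B = np.array([[1.0, -alpha], [-gamma, 1.0]])          # alpha*gamma != 1
RHS = np.column_stack([X[:, 0] * b1 + X[:, 1] * b2 + U[:, 0],
                       X[:, 2] * b3 + U[:, 1]])
Y = np.linalg.solve(B, RHS.T).T                       # reduced form draws
y1, y2 = Y[:, 0], Y[:, 1]

PX = X @ np.linalg.solve(X.T @ X, X.T)
Z1 = np.column_stack([y2, X[:, 0], X[:, 1]])          # just identified
Z2 = np.column_stack([y1, X[:, 2]])                   # over identified

def tsls(Z, y):
    return np.linalg.solve(Z.T @ PX @ Z, Z.T @ PX @ y)

d1, d2 = tsls(Z1, y1), tsls(Z2, y2)
uh = np.column_stack([y1 - Z1 @ d1, y2 - Z2 @ d2])    # 2SLS residuals
S = uh.T @ uh / N                                     # Sigma hat

# 3SLS on the stacked system
Zbar = np.block([[Z1, np.zeros((N, 2))], [np.zeros((N, 3)), Z2]])
ybar = np.concatenate([y1, y2])
A = np.kron(np.linalg.inv(S), PX)
d3 = np.linalg.solve(Zbar.T @ A @ Zbar, Zbar.T @ A @ ybar)
d1_3sls, d2_3sls = d3[:3], d3[3:]

print(np.allclose(d2_3sls, d2))                       # True: 3SLS = 2SLS
corr = (S[0, 1] / S[1, 1]) * np.linalg.solve(Z1.T @ PX @ Z1, Z1.T @ PX @ uh[:, 1])
print(np.allclose(d1_3sls, d1 - corr))                # True
```

Both equalities are exact because the first equation is just-identified, so its 3SLS block of normal equations collapses to X'ũ1 = −(σ̂¹²/σ̂¹¹)X'ũ2, which in turn forces Z2'PXũ2 = 0.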

18. For the two-equation simultaneous model

y1t = β12y2t + γ11x1t + u1t

y2t = β21y1t + γ22x2t + γ23x3t + u2t

the sample moment matrices X'X, X'Y and Y'Y are given.
(a) Determine the identifiability of each equation with the aid of the order and rank conditions for identification.

(b) Obtain the OLS normal equations for both equations. Solve for the OLS estimates.

(c) Obtain the 2SLS normal equations for both equations. Solve for the 2SLS estimates.

(d) Can you estimate these equations using Indirect Least Squares? Explain.

19. Laffer (1970) considered the following supply and demand equations for Traded Money:

log(TM/P) = α0 + α1 log(RM/P) + α2 log i + u1

log(TM/P) = β0 + β1 log(Y/P) + β2 log i + β3 log(S1) + β4 log(S2) + u2


TM = Nominal total trade money
RM = Nominal effective reserve money
Y = GNP in current dollars
i = short-term rate of interest
S1 = Mean real size of the representative economic unit (1939 = 100)
S2 = Degree of market utilization
P = GNP price deflator (1958 = 100)

The basic idea is that trade credit is a line of credit and the unused portion represents purchasing power which can be used as a medium of exchange for goods and services. Hence, Laffer (1970) suggests that trade credit should be counted as part of the money supply. Besides real income and the short-term interest rate, the demand for real traded money includes log(S1) and log(S2). S1 is included to capture economies of scale. As S1 increases, holding everything else constant, the presence of economies of scale would mean that the demand for traded money would decrease. Also, the larger S2, the larger the degree of market utilization and the more money is needed for transaction purposes.

The data are provided on the Springer web site as LAFFER.ASC. These data cover 21 annual observations over the period 1946-1966 and were obtained from Lott and Ray (1992). Assume that (TM/P) and i are endogenous and the rest of the variables in this model are exogenous.

(a) Using the order condition for identification, determine whether the demand and supply equations are identified. What happens if you use the rank condition for identification?

(b) Estimate this model using OLS.

(c) Estimate this model using 2SLS.

(d) Estimate this model using 3SLS. Compare the estimates and their standard errors for parts (b), (c) and (d).

(e) Test the over-identification restriction of each equation.

(f) Run Hausman’s specification test on each equation basing it on OLS and 2SLS.

(g) Run Hausman’s specification test on each equation basing it on 2SLS and 3SLS.

20. The market for a certain good is expressed by the following equations:

Dt = α0 − α1Pt + α2Xt + u1t    (α1, α2 > 0)

St = β0 + β1Pt + u2t    (β1 > 0)

Dt = St = Qt

where Dt is the quantity demanded, St is the quantity supplied, and Xt is an exogenous demand shift variable. (u1t, u2t) is an IID random vector with zero mean and covariance matrix Σ = [σij] for i, j = 1, 2.

(a) Examine the identifiability of the model under the assumptions given above using the order and rank conditions of identification.

(b) Assuming that the moment matrix of the exogenous variables converges to a finite non-zero matrix, derive the simultaneous equation bias in the OLS estimator of β1.

(c) If σ12 = 0, would you expect this bias to be positive or negative? Explain.
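For part (b), the reduced form gives cov(Pt, u2t) = −σ22/(α1 + β1), so with σ12 = 0 the OLS estimator of the supply slope is biased downward. The simulation below (unit variances and arbitrary parameter values) compares the OLS estimate with its probability limit:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 200_000
a0, a1, a2 = 1.0, 1.0, 1.0     # illustrative demand parameters
b0, b1 = 0.5, 1.0              # illustrative supply parameters
Xt = rng.standard_normal(T)
u1 = rng.standard_normal(T)
u2 = rng.standard_normal(T)    # sigma_12 = 0

# equilibrium: a0 - a1 P + a2 X + u1 = b0 + b1 P + u2
P = (a0 - b0 + a2 * Xt + u1 - u2) / (a1 + b1)
Q = b0 + b1 * P + u2

# OLS of Q on a constant and P estimates the supply slope b1
Z = np.column_stack([np.ones(T), P])
b1_ols = np.linalg.lstsq(Z, Q, rcond=None)[0][1]

# asymptotic bias: cov(P, u2)/var(P) = -sig22 (a1 + b1)/(a2^2 + sig11 + sig22)
plim = b1 - 1.0 * (a1 + b1) / (a2**2 + 1.0 + 1.0)
print(b1_ols, plim)            # both close to 1/3
```

With these unit variances the plim works out to 1/3, well below the true slope of 1, which illustrates the negative simultaneity bias asked about in part (c).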

21. Consider the following three-equation simultaneous model

y1 = α1 + β2y2 + γ1X1 + u1
y2 = α2 + β1y1 + β3y3 + γ2X2 + u2
y3 = α3 + γ3X3 + γ4X4 + γ5X5 + u3


where the X’s are exogenous and the y’s are endogenous.

(a) Examine the identifiability of this system using the order and rank conditions.

(b) How would you estimate equation (2) by 2SLS? Describe your procedure step by step.

(c) Suppose that equation (1) was estimated by running y2 on a constant, X2 and X3, and the resulting predicted ŷ2 was substituted in (1), and OLS performed on the resulting model. Would this estimating procedure yield consistent estimates of α1, β2 and γ1? Explain your answer.

(d) How would you test for the over-identification restrictions in equation (1)?

22. Equivariance of Instrumental Variables Estimators. This is based on Sapra (1997). For the structural equation given in (11.34), let the matrix of instruments W be of dimension T × ℓ where ℓ ≥ g1 + k1 as described below (11.41). Then the corresponding instrumental variable estimator of δ1 given below (11.41) is δ̂1,IV(y1) = (Z1'PW Z1)⁻¹Z1'PW y1.

(a) Show that this IV estimator is an equivariant estimator of δ1, i.e., show that for any linear transformation y1* = ay1 + Z1b, where a is a positive scalar and b is a (g1 + k1) × 1 real vector, the following relationship holds:

δ̂1,IV(y1*) = aδ̂1,IV(y1) + b.

(b) Show that the variance estimator

σ̂²(y1) = (y1 − Z1δ̂1,IV(y1))'(y1 − Z1δ̂1,IV(y1))/T

is equivariant for σ², i.e., show that σ̂²(y1*) = a²σ̂²(y1).
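Both equivariance properties are exact algebraic identities and can be confirmed numerically, as in this sketch (the dimensions, instruments, and transformation constants a and b are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 60
W = rng.standard_normal((T, 4))            # instruments, l = 4 > g1 + k1 = 2
Z1 = np.column_stack([W[:, 0] + rng.standard_normal(T), W[:, 1]])
y1 = Z1 @ np.array([1.0, -2.0]) + rng.standard_normal(T)
PW = W @ np.linalg.solve(W.T @ W, W.T)

def iv(y):
    # delta_hat_IV(y) = (Z1' PW Z1)^{-1} Z1' PW y
    return np.linalg.solve(Z1.T @ PW @ Z1, Z1.T @ PW @ y)

def s2(y):
    e = y - Z1 @ iv(y)
    return e @ e / T

a, b = 2.5, np.array([0.3, -0.7])
ystar = a * y1 + Z1 @ b                    # linear transformation of y1
print(np.allclose(iv(ystar), a * iv(y1) + b))   # True: equivariance of delta
print(np.isclose(s2(ystar), a**2 * s2(y1)))     # True: equivariance of sigma^2
```

The second property follows from the first, since the transformed residual is exactly a times the original residual.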

23. Identification and Estimation of a Simple Two-Equation Model. This is based on Holly (1987). Consider the following two equation model

yt1 = α + βyt2 + ut1

yt2 = γ + yt1 + ut2

where the y's are endogenous variables and the u's are serially independent disturbances that are identically distributed with zero means and nonsingular covariance matrix Σ = [σij], where E(uti utj) = σij for i, j = 1, 2 and all t = 1, 2, …, T. The reduced form equations are given by

yt1 = π11 + vt1    and    yt2 = π21 + vt2

with Ω = [ωij], where E(vti vtj) = ωij for i, j = 1, 2 and all t = 1, 2, …, T.

(a) Examine the identification of this system of two equations when no further information is available.

(b) Repeat part (a) when σ12 = 0.

(c) Assuming σ12 = 0, show that β̂OLS, the OLS estimator of β in the first equation, is not consistent.

(d) Assuming σ12 = 0, show that an alternative consistent estimator of β is an IV estimator using zt = (yt2 − ȳ2) − (yt1 − ȳ1) as an instrument for yt2.

(e) Show that the IV estimator of в obtained from part (d) is also an indirect least squares estimator of в. Hint: See the solution by Singh and Bhat (1988).
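A simulation illustrates part (d): with σ12 = 0, the instrument zt = (yt2 − ȳ2) − (yt1 − ȳ1) reduces to the deviation of ut2 from its mean, so the IV estimator is consistent for β while OLS is not. The parameter values below are arbitrary (β = 0.5, σ11 = 1, σ22 = 4, so that β̂OLS tends to (σ11 + βσ22)/(σ11 + σ22) = 0.6):

```python
import numpy as np

rng = np.random.default_rng(7)
T = 200_000
alpha, beta, gamma = 1.0, 0.5, 2.0
u1 = rng.standard_normal(T)
u2 = 2.0 * rng.standard_normal(T)          # sigma_12 = 0, sigma_22 = 4

# reduced form of yt1 = alpha + beta yt2 + ut1 ; yt2 = gamma + yt1 + ut2
y1 = (alpha + beta * gamma + u1 + beta * u2) / (1 - beta)
y2 = (alpha + gamma + u1 + u2) / (1 - beta)

z = (y2 - y2.mean()) - (y1 - y1.mean())    # proposed instrument
beta_iv = (z @ (y1 - y1.mean())) / (z @ (y2 - y2.mean()))
beta_ols = np.cov(y1, y2)[0, 1] / np.var(y2)
print(beta_iv, beta_ols)                   # IV close to 0.5, OLS close to 0.6
```

The instrument works because the reduced form errors satisfy vt2 − vt1 = ut2, which is uncorrelated with ut1 when σ12 = 0 but correlated with yt2.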

24. Errors in Measurement and the Wald (1940) Estimator. This is based on Farebrother (1985). Let y* be permanent consumption and x* be permanent income, both measured with error:

yi* = βxi*    where yi = yi* + εi and xi = xi* + ui    for i = 1, 2, …, n.

Let xi*, εi and ui be independent normal random variables with zero means and variances σ²*, σ²ε and σ²u, respectively. Wald (1940) suggested the following estimator of β: order the sample by the xi's and split the sample into two halves. Let (ȳ1, x̄1) be the sample means of the first half of the sample and (ȳ2, x̄2) be the sample means of the second half. Wald's estimator of β is β̂W = (ȳ2 − ȳ1)/(x̄2 − x̄1). It is the slope of the line joining these two sample mean observations.

(a) Show that β̂W can be interpreted as a simple IV estimator with instrument

zi = 1 for xi > median(x)
zi = −1 for xi < median(x)

where median(x) is the sample median of x1, x2, …, xn.

(b) Define wi = ρ²xi* − τ²ui where ρ² = σ²u/(σ²u + σ²*) and τ² = σ²*/(σ²u + σ²*). Show that E(xiwi) = 0 and that wi ~ N(0, σ²*σ²u/(σ²* + σ²u)).

(c) Show that xi* = τ²xi + wi and use it to show that

E(β̂W | x1, …, xn) = E(β̂OLS | x1, …, xn) = βτ².

Conclude that the exact small sample bias of β̂OLS and β̂W are the same.
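Part (a) is easy to confirm numerically: with an even sample size and continuous data, the grouping instrument zi = ±1 reproduces Wald's estimator exactly, since z'y = (n/2)(ȳ2 − ȳ1) and z'x = (n/2)(x̄2 − x̄1). A sketch, with error variances chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 400                                     # even, so the median splits evenly
beta = 0.8
xstar = rng.standard_normal(n)              # permanent income
x = xstar + 0.5 * rng.standard_normal(n)    # measured with error
y = beta * xstar + 0.5 * rng.standard_normal(n)

order = np.argsort(x)
lo, hi = order[: n // 2], order[n // 2 :]   # split the ordered sample in two
beta_wald = (y[hi].mean() - y[lo].mean()) / (x[hi].mean() - x[lo].mean())

z = np.where(x > np.median(x), 1.0, -1.0)   # grouping instrument
beta_iv = (z @ y) / (z @ x)
print(np.isclose(beta_wald, beta_iv))       # True
```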

25. Comparison of t-ratios. This is based on Holly (1990). Consider the two equations model

y1 = αy2 + Xβ1 + u1    and    y2 = γy1 + Xβ2 + u2

where α and γ are scalars, y1 and y2 are T × 1 and X is a T × (K − 1) matrix of exogenous variables. Assume that ui ~ N(0, σ²iIT) for i = 1, 2. Show that the t-ratios for H0a: α = 0 and H0b: γ = 0 using α̂OLS and γ̂OLS are the same. Comment on this result. Hint: See the solution by Farebrother (1991).
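The equality of the two t-ratios follows because each equals the partial correlation between y1 and y2 given X, scaled by the same degrees of freedom, and the partial correlation is symmetric in the two variables. The sketch below verifies this on arbitrary generated data:

```python
import numpy as np

rng = np.random.default_rng(9)
T, K = 80, 4
X = np.column_stack([np.ones(T), rng.standard_normal((T, K - 2))])  # T x (K-1)
y1 = rng.standard_normal(T)
y2 = rng.standard_normal(T)

def tstat(dep, extra):
    # OLS t-ratio on `extra` in a regression of `dep` on [extra, X]
    R = np.column_stack([extra, X])
    coef, _, _, _ = np.linalg.lstsq(R, dep, rcond=None)
    e = dep - R @ coef
    s2 = e @ e / (T - R.shape[1])
    V = s2 * np.linalg.inv(R.T @ R)
    return coef[0] / np.sqrt(V[0, 0])

print(np.isclose(tstat(y1, y2), tstat(y2, y1)))   # True
```

The comment the problem asks for is that the OLS t-ratio cannot tell the two directions of causality apart: it tests partial association, not structure.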

26. Degeneration of Feasible GLS to 2SLS in a Limited Information Simultaneous Equations Model. This is based on Gao and Lahiri (2000). Consider a simple limited information simultaneous equations model,

y1 = γy2 + u,    (1)

y2 = Xβ + v,    (2)

where y1, y2 are N × 1 vectors of observations on two endogenous variables, X is an N × K matrix of predetermined variables of the system, and K > 1 such that (1) is identified. Each row of (u, v) is assumed to be i.i.d. (0, Σ), and Σ is p.d. In this case, γ̂2SLS = (y2'PXy2)⁻¹y2'PXy1, where PX = X(X'X)⁻¹X'. The residuals û = y1 − γ̂2SLSy2 and v̂ = My2, where M = IN − PX, are used to generate a consistent estimate of Σ:

Σ̂ = (1/N)[û'û  û'v̂; v̂'û  v̂'v̂]

Show that the feasible GLS estimate of γ using Σ̂ degenerates to γ̂2SLS.

27. Equality of Two IV Estimators. This is based on Qian (1999). Consider the following linear regression model:

yi = xi'β + εi = x1i'β1 + x2i'β2 + εi,    i = 1, 2, …, N,    (1)

where the dimensions of xi', x1i' and x2i' are 1 × K, 1 × K1 and 1 × K2, respectively, with K = K1 + K2. xi may be correlated with εi, but we have instruments zi such that E(εizi) = 0 and E(ε²i) = σ². Partition the instruments into two subsets: zi' = (z1i', z2i'), where the dimensions of zi, z1i and z2i are L, L1 and L2, with L = L1 + L2. Assume that E(z1ixi') has full column rank (so L1 ≥ K); this ensures that β can be estimated consistently using the subset of instruments z1i only or using the entire set zi' = (z1i', z2i'). We also assume that (yi, xi', zi') is covariance stationary.

Define PA = A(A'A)⁻¹A' and MA = I − PA for any matrix A with full column rank. Let X = (x1, …, xN)', and similarly for X1, X2, y, Z1, Z2 and Z. Define X̃ = PZ1X and β̃ = (X̃'X̃)⁻¹X̃'y, so that β̃ is the instrumental variables (IV) estimator of (1) using Z1 as instruments. Similarly, define X̂ = PZX and β̂ = (X̂'X̂)⁻¹X̂'y, so that β̂ is the IV estimator of (1) using Z as instruments.

Show that β̃1 = β̂1 if Z2'MZ1[X1 − X2(X2'PZ1X2)⁻¹X2'PZ1X1] = 0.

28. For the crime in North Carolina example given in section 11.6, replicate the results in Tables 11.2-11.6 using the data for 1987. Do the same using the data for 1981. Are there any notable differences in the results as we compare 1981 to 1987?

29. Spatial Lag Test with Equal Weights. This is based on Baltagi and Liu (2009). Consider the spatial lag dependence model described in section 11.2.1, with the N × N spatial weight matrix W having zero elements on the diagonal and equal elements 1/(N − 1) off the diagonal. The LM test for zero spatial lag, i.e., H0: ρ = 0 versus H1: ρ ≠ 0, is given in Anselin (1988). This takes the form

LM = [û'Wy/(û'û/N)]² / {(WXβ̂)'PX(WXβ̂)/σ̂² + tr(W² + W'W)}

where PX = I − X(X'X)⁻¹X', β̂ is the restricted MLE, which in this case is the least squares estimator of β. Similarly, σ̂² is the corresponding restricted MLE of σ², which in this case is the least squares residual sum of squares divided by N, i.e., û'û/N, where û denotes the least squares residuals. Show that for the equal-weight spatial matrix, this LM test statistic is always equal to N/[2(N − 1)] no matter what y is.

30. Growth and Inequality Reconsidered. For the Lundberg and Squire (2003) Growth and Inequality example considered in section 11.6.

(a) Estimate these equations using 3SLS and verify the results reported in Table 1 of Lundberg and Squire (2003, p. 334). These results show that openness enhances growth and education reduces inequality, see Table 11.9. How do these 3SLS results compare with 2SLS? Are the over-identification restrictions rejected by the data?

(b) Lundberg and Squire (2003) allow growth to enter the inequality equation, and inequality to enter the growth equation. Estimate these respecified equations using 3SLS and verify the results reported in Table 2 of Lundberg and Squire (2003, p. 336). How do these 3SLS results compare with 2SLS? Are the over-identification restrictions rejected by the data?

31. Married Women Labor Supply. Mroz (1987) questions the exogeneity assumption of the wife’s wage rate in a simple specification of married women labor supply. Using the PSID for 1975, his sample consists of 753 married white women between the ages of 30 and 60 in 1975, with 428 working at some time during the year. The wife’s annual hours of work (hours) is regressed on the logarithm of her wage rate (lwage); the nonwife income (nwifeinc); the wife’s age (age), her years of schooling (educ), the number of children less than six years old in the household (kidslt6), and the number of children between the ages of five and nineteen (kidsge6). The data set was obtained from the data web site of Wooldridge (2009).

(a) Replicate Table III of Mroz (1987, p. 769) which gives the descriptive statistics of the data.

(b) Replicate Table IV of Mroz (1987, p. 770) which runs OLS and 2SLS using a variety of instrumental variables for lwage. These instrumental variables are described in Table V of Mroz (1987, p. 771).

(c) Run the over-identification test for each 2SLS regression in (b), as well as the diagnostics for the first stage regression.


This chapter is influenced by Johnston (1984), Kelejian and Oates (1989), Maddala (1992),

Davidson and MacKinnon (1993), Mariano (2001) and Stock and Watson (2003). Additional

references include the econometric texts cited in Chapter 3 and the following:

Anderson, T. W. and H. Rubin (1950), “The Asymptotic Properties of Estimates of the Parameters of a Single Equation in a Complete System of Stochastic Equations,” Annals of Mathematical Statistics, 21: 570-582.

Baltagi, B. H. (1989), “A Hausman Specification Test in a Simultaneous Equations Model,” Econometric Theory, Solution 88.3.5, 5: 453-467.

Baltagi, B. H. and L. Liu (2009), “Spatial Lag Test with Equal Weights,” Economics Letters, 104: 81-82.

Basmann, R. L. (1957), "A Generalized Classical Method of Linear Estimation of Coefficients in a Structural Equation," Econometrica, 25: 77-83.

Basmann, R. L. (1960), “On Finite Sample Distributions of Generalized Classical Linear Identifiability Tests Statistics,” Journal of the American Statistical Association, 55: 650-659.

Bekker, P. A. (1994), "Alternative Approximations to the Distribution of Instrumental Variable Estimators," Econometrica, 62: 657-681.

Bekker, P. A. and T. J. Wansbeek (2001), “Identification in Parametric Models,” Chapter 7 in Baltagi, B. H. (ed.) A Companion to Theoretical Econometrics (Blackwell: Massachusetts).

Bound, J., D. A. Jaeger and R. M. Baker (1995), "Problems with Instrumental Variables Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak," Journal of the American Statistical Association, 90: 443-450.

Cornwell, C. and W. N. Trumbull (1994), “Estimating the Economic Model of Crime Using Panel Data,” Review of Economics and Statistics, 76: 360-366.

Cragg, J. G. and S. G. Donald (1996), "Inferring the Rank of a Matrix," Journal of Econometrics, 76: 223-250.

Deaton, A. (1997), The Analysis of Household Surveys: A Microeconometric Approach to Development Policy (Johns Hopkins University Press: Baltimore).

Durbin, J. (1954), “Errors in Variables,” Review of the International Statistical Institute, 22: 23-32.

Farebrother, R. W. (1985), “The Exact Bias of Wald’s Estimator,” Econometric Theory, Problem 85.3.1, 1: 419.

Farebrother, R. W. (1991), “Comparison of t-Ratios,” Econometric Theory, Solution 90.1.4, 7: 145-146.

Fisher, F. M. (1966), The Identification Problem in Econometrics (McGraw-Hill: New York).

Gao, C. and K. Lahiri (2000), "Degeneration of Feasible GLS to 2SLS in a Limited Information Simultaneous Equations Model," Econometric Theory, Problem 00.2.1, 16: 287.

Gronau, R. (1973), “The Effects of Children on the Housewife’s Value of Time,” Journal of Political Economy, 81: S168-199.

Haavelmo, T. (1944), “The Probability Approach in Econometrics,” Supplement to Econometrica, 12.

Hall, A. (1993), “Some Aspects of Generalized Method of Moments Estimation,” Chapter 15 in Handbook of Statistics, Vol. 11 (North Holland: Amsterdam).

Hansen, L. (1982), "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50: 1029-1054.

Hausman, J. A. (1978), “Specification Tests in Econometrics,” Econometrica, 46: 1251-1272.

Hausman, J. A. (1983), “Specification and Estimation of Simultaneous Equation Models,” Chapter 7 in Griliches, Z. and Intriligator, M. D. (eds.) Handbook of Econometrics, Vol. I (North Holland: Amsterdam).

Holly, A. (1987), “Identification and Estimation of a Simple Two-Equation Model,” Econometric Theory, Problem 87.3.3, 3: 463-466.

Holly, A. (1988), “A Hausman Specification Test in a Simultaneous Equations Model,” Econometric Theory, Problem 88.3.5, 4: 537-538.

Holly, A. (1990), "Comparison of t-Ratios," Econometric Theory, Problem 90.1.4, 6: 114.

Kapteyn, A. and D. G. Fiebig (1981), “When are Two-Stage and Three-Stage Least Squares Estimators Identical?,” Economics Letters, 8: 53-57.

Koopmans, T. C. and J. Marschak (1950), Statistical Inference in Dynamic Economic Models (John Wiley and Sons: New York).

Koopmans, T. C. and W. C. Hood (1953), Studies in Econometric Method (John Wiley and Sons: New York).

Laffer, A. B., (1970), “Trade Credit and the Money Market,” Journal of Political Economy, 78: 239-267.

Lott, W. F. and S. C. Ray (1992), Applied Econometrics: Problems with Data Sets (The Dryden Press: New York).

Lundberg, M. and L. Squire (2003), “The Simultaneous Evolution of Growth and Inequality,” The Economic Journal, 113: 326-344.

Manski, C. F. (1995), Identification Problems in the Social Sciences (Harvard University Press: Cambridge).

Mariano, R. S. (2001), "Simultaneous Equation Model Estimators: Statistical Properties and Practical Implications," Chapter 6 in Baltagi, B. H. (ed.) A Companion to Theoretical Econometrics (Blackwell: Massachusetts).

Mroz, T. A. (1987), “The Sensitivity of an Empirical Model of Married Women’s Hours of Work to Economic and Statistical Assumptions,” Econometrica, 55: 765-799.

Nelson, C. R. and R. Startz (1990), “The Distribution of the Instrumental Variables Estimator and its t-Ratio when the Instrument is a Poor One,” Journal of Business, 63: S125-140.

Sapra, S. K. (1997), “Equivariance of an Instrumental Variable (IV) Estimator in the Linear Regression Model,” Econometric Theory, Problem 97.2.5, 13: 464.

Singh, N. and A. N. Bhat (1988), "Identification and Estimation of a Simple Two-Equation Model," Econometric Theory, Solution 87.3.3, 4: 542-545.

Staiger, D. and J. Stock (1997), "Instrumental Variables Regression With Weak Instruments," Econometrica, 65: 557-586.

Stock, J. H. and M. W. Watson (2003), Introduction to Econometrics (Addison Wesley: Boston).

Theil, H. (1953), “Repeated Least Squares Applied to Complete Equation Systems,” The Hague, Central Planning Bureau (Mimeo).

Theil, H. (1971), Principles of Econometrics (Wiley: New York).

Wald, A. (1940), "Fitting of Straight Lines if Both Variables are Subject to Error," Annals of Mathematical Statistics, 11: 284-300.

Wooldridge, J. M. (1990), “A Note on the Lagrange Multiplier and F Statistics for Two Stage Least Squares Regression,” Economics Letters, 34: 151-155.

Wooldridge, J. M. (2009), Introductory Econometrics: A Modern Approach (South-Western: Ohio).

Wu, D. M. (1973), “Alternative Tests of Independence Between Stochastic Regressors and Disturbances,” Econometrica, 41: 733-740.

Zellner, A. and Theil, H. (1962), “Three-Stage Least Squares: Simultaneous Estimation of Simultaneous Equations,” Econometrica, 30: 54-78.

Appendix: The Identification Problem Revisited: The Rank Condition of Identification

In section 11.1.2, we developed a necessary but not sufficient condition for identification. In this section we emphasize that model identification is crucial because only then can we get meaningful estimates of the parameters. For an under-identified model, different sets of param­eter values agree well with the statistical evidence. As Bekker and Wansbeek (2001, p. 144) put it, preference for one set of parameter values over other ones becomes arbitrary. Therefore, “Scientific conclusions drawn on the basis of such arbitrariness are in the best case void and in the worst case dangerous.” Manski (1995, p. 6) also warns that “negative identification findings imply that statistical inference is fruitless. It makes no sense to try to use a sample of finite size to infer something that could not be learned even if a sample of infinite size were available.” Consider the simultaneous equation model
Byt + Γxt = ut    t = 1, 2, …, T    (A.1)

which displays the whole set of equations at time t. B is G × G, Γ is G × K and ut is G × 1. B is square and nonsingular, indicating that the system is complete, i.e., there are as many equations as there are endogenous variables. Premultiplying (A.1) by B⁻¹ and solving for yt in terms of the exogenous variables and the vector of errors, we get

yt = Πxt + vt    t = 1, 2, …, T    (A.2)

where Π = −B⁻¹Γ is G × K, and vt = B⁻¹ut. Note that if we premultiply the structural model in (A.1) by an arbitrary nonsingular G × G matrix F, then the new structural model has the same reduced form given in (A.2). In this case, each new structural equation is a linear combination of the original structural equations, but the reduced form equations are the same. One idea for identification, explored by Fisher (1966), is to note that (A.1) is completely defined by B, Γ and the probability density function of the disturbances p(ut). The specification of the structural model, which comes from economic theory, imposes a lot of zero restrictions on the B and Γ coefficients. In addition, there may be cross-equation or within-equation restrictions, for example, constant returns to scale of the production function, homogeneity of a demand equation, or symmetry conditions. Also, the probability density function of the disturbances may itself contain some zero covariance restrictions. The structural model given in (A.1) is identified if these restrictions are enough to distinguish it from any other structural model. This is operationalized by proving that the only nonsingular matrix F which results in a new structural model satisfying the same restrictions as the original model is the identity matrix. If, after imposing the restrictions, only certain rows of F resemble the corresponding rows of an identity matrix, up to a scalar of normalization, then the corresponding equations of the system are identified and the remaining equations are not identified. This is the same concept as taking a linear combination of the demand and supply equations and seeing if the linear combination is different from demand or supply. If it is, then both equations are identified. If it looks like demand but not supply, then supply is identified and demand is not. Let us look at an example.

Example (A.1): Consider the demand and supply equations

Qt = a − bPt + cYt + u1t    (A.3)

Qt = d + ePt + fWt + u2t    (A.4)

where Y is income and W is weather. Writing (A.3) and (A.4) in matrix form (A.1), we get

B = [1  b; 1  −e],  Γ = [−a  −c  0; −d  0  −f],  yt = (Qt, Pt)' and xt = (1, Yt, Wt)'    (A.5)

There are two zero restrictions on Γ. The first is that income does not appear in the supply equation and the second is that weather does not appear in the demand equation. Therefore, the order condition of identification is satisfied for both equations. In fact, for each equation, there is one excluded exogenous variable and only one right hand side included endogenous variable. Therefore, both equations are just-identified. Let F = [fij] for i, j = 1, 2, be a nonsingular matrix. Premultiply this system by F. The new matrix B is now FB and the new matrix Γ is now FΓ. In order for the transformed system to satisfy the same restrictions as the original model, FB must satisfy the following normalization restrictions:

f11 + f12 = 1 and f21 + f22 = 1    (A.6)

Also, FΓ should satisfy the following zero restrictions

−f21c + f22 · 0 = 0    and    f11 · 0 − f12f = 0    (A.7)

Since c ≠ 0 and f ≠ 0, (A.7) implies that f21 = f12 = 0. Using (A.6), we get f11 = f22 = 1. Hence, the only nonsingular F that satisfies the same restrictions as the original model is the identity matrix, provided c ≠ 0 and f ≠ 0. Therefore, both equations are identified.
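Since both equations are just-identified, the structural slopes can also be recovered from the reduced form Π = −B⁻¹Γ (indirect least squares). The sketch below builds Π from arbitrary structural values and checks the recovery ratios:

```python
import numpy as np

# structural parameters of (A.3)-(A.4): demand Q = a - b P + c Y + u1,
# supply Q = d + e P + f W + u2 (values chosen purely for illustration)
a, b, c, d, e, f = 1.0, 0.8, 0.5, 0.2, 1.2, 0.4

B = np.array([[1.0, b], [1.0, -e]])            # rows: demand, supply
G = np.array([[-a, -c, 0.0], [-d, 0.0, -f]])   # columns: 1, Y, W
Pi = -np.linalg.solve(B, G)                    # reduced form Pi = -B^{-1} Gamma

# rows of Pi correspond to Q and P; columns to 1, Y, W.
# The exclusion restrictions let us recover the slopes by ratios:
print(np.isclose(Pi[0, 1] / Pi[1, 1], e))      # pi_QY / pi_PY = e
print(np.isclose(Pi[0, 2] / Pi[1, 2], -b))     # pi_QW / pi_PW = -b
```

Income shifts trace out the supply curve (recovering e) and weather shifts trace out the demand curve (recovering −b), which is the classic identification-by-exclusion logic.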

Example (A.2): If income does not appear in the demand equation (A.3), i. e., c = 0, the model looks like

Qt = a − bPt + u1t    (A.8)

Qt = d + ePt + fWt + u2t    (A.9)

In this case, only the second restriction given in (A.7) holds. Therefore, f ≠ 0 implies f12 = 0; however, f21 is not necessarily zero without additional restrictions. Using (A.6), we get f11 = 1 and f21 + f22 = 1. This means that only the first row of F looks like the first row of an identity matrix, and only the demand equation is identified. In fact, the order condition for identification is not met for the supply equation since there are no excluded exogenous variables from that equation but there is one right hand side included endogenous variable. See problems 6 and 7 for more examples of this method of identification.

Example (A.3): Suppose that u ~ (0, Ω) where Ω = Σ ⊗ IT, and Σ = [σij] for i, j = 1, 2. This example shows how a variance-covariance restriction can help identify an equation. Let us take the model defined in (A.8) and (A.9) and add the restriction that σ12 = σ21 = 0. In this case, the transformed model disturbances will be Fu ~ (0, Ω*), where Ω* = Σ* ⊗ IT, and Σ* = FΣF'. In fact, since Σ is diagonal, FΣF' should also be diagonal. This imposes the following restriction on the elements of F:

f11σ11f21 + f12σ22f22 = 0    (A.10)

But f11 = 1 and f12 = 0 from the zero restrictions imposed on the demand equation, see Example (A.2). Hence, (A.10) reduces to σ11f21 = 0. Since σ11 ≠ 0, this implies that f21 = 0, and the normalization restriction given in (A.6) implies that f22 = 1. Therefore, the second equation is also identified.

Example (A.4): In this example, we demonstrate how cross-equation restrictions can help identify equations. Consider the following simultaneous model

y1 = a + by2 + cx1 + u1    (A.11)

y2 = d + ey1 + fx1 + gx2 + u2    (A.12)

and add the restriction c = f. It can easily be shown that the first equation is identified with f11 = 1 and f12 = 0. The second equation has no zero restrictions, but the cross-equation restriction c = f implies:

−cf11 − ff12 = −cf21 − ff22

Using c = f, we get

f11 + f12 = f21 + f22    (A.13)

But the first equation is identified with f11 = 1 and f12 = 0. Hence, (A.13) reduces to f21 + f22 = 1, which together with the normalization condition −f21b + f22 = 1 gives f21(b + 1) = 0. If b ≠ −1, then f21 = 0 and f22 = 1. The second equation is also identified provided b ≠ −1.

Alternatively, one can look at the problem of identification by asking whether the structural parameters in B and Γ can be obtained from the reduced form parameters. It will be clear from the following discussion that this task is impossible if there are no restrictions on this simultaneous model. In this case, the model is hopelessly unidentified. However, in the usual case where there are a lot of zeros in B and Γ, we may be able to retrieve the remaining non-zero coefficients from Π. More rigorously, Π = −B⁻¹Γ, which implies that

BΠ + Γ = 0    (A.14)


or AW = 0, where A = [B, Γ] and W' = [Π', IK]    (A.15)

For the first equation, this implies that

α1'W = 0 where α1' is the first row of A.    (A.16)

W is known (or can be estimated) and is of rank K. If the first equation has no restrictions on its structural parameters, then α1' contains (G + K) unknown coefficients. These coefficients satisfy the K homogeneous equations given in (A.16). Without further restrictions, we cannot solve for the (G + K) coefficients in α1 with only K equations. Let φ denote the matrix of R restrictions on the first equation, i.e., α1'φ = 0. This together with (A.16) implies that

a′1[W, φ] = 0 (A.17)

and we can solve uniquely for a′1 (up to a scalar of normalization) provided the

rank[W, φ] = G + K − 1 (A.18)

Economists specify each structural equation with the left hand side endogenous variable having the coefficient one. This normalization identifies one coefficient of a′1; therefore, we require only (G + K − 1) more restrictions to uniquely identify the remaining coefficients of a1. [W, φ] is a (G + K) × (K + R) matrix. Its rank cannot exceed the smaller of its two dimensions, so (A.18) requires (K + R) ≥ (G + K − 1), which results in R ≥ G − 1, the order condition of identification. Note that this is a necessary but not sufficient condition for (A.18) to hold. It states that the number of restrictions on the first equation must be at least the number of endogenous variables minus one. If all R restrictions are zero restrictions, then the number of excluded exogenous plus the number of excluded endogenous variables should be at least (G − 1). But the G endogenous variables are made up of the left hand side endogenous variable y1, the g1 right hand side included endogenous variables Y1, and the (G − g1 − 1) excluded endogenous variables. Therefore, R ≥ (G − 1) can be written as k2 + (G − g1 − 1) ≥ (G − 1), which reduces to k2 ≥ g1, as discussed earlier in this chapter.
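To make (A.16)–(A.18) concrete, here is a small sympy sketch (with illustrative parameter values α = 10 and β = 4/5, not taken from the text) for the simple Keynesian model treated in example (A.5) below. It builds [W, φ] and recovers a′1 as the left null space of that matrix:

```python
import sympy as sp

alpha, beta = sp.Rational(10), sp.Rational(4, 5)   # illustrative values
# Reduced form of the simple Keynesian model: Pi = -B^{-1} Gamma
Pi = sp.Matrix([[alpha / (1 - beta), beta / (1 - beta)],
                [alpha / (1 - beta), 1 / (1 - beta)]])
W = Pi.col_join(sp.eye(2))          # W' = [Pi', I_K]
phi = sp.Matrix([0, 0, 0, 1])       # one zero restriction: gamma_12 = 0
M = W.row_join(phi)                 # [W, phi], a (G+K) x (K+R) matrix

null = M.T.nullspace()              # solutions of a1'[W, phi] = 0
assert len(null) == 1               # unique up to scale: identified
a1 = null[0] / null[0][0]           # normalize the first coefficient to 1
print(a1.T)                         # (1, -beta, -alpha, 0) = (1, -4/5, -10, 0)
```

Here rank[W, φ] = 3 = G + K − 1, so (A.18) holds and the left null space is one-dimensional: a′1 is recovered up to the normalization scalar, reproducing the structural consumption equation.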

The necessary and sufficient condition for identification can now be obtained as follows: Using (A.1), one can write

Azt = ut where z′t = (y′t, x′t) (A.19)
and from the first definition of identification we make the transformation FAzt = Fut, where F is a G × G nonsingular matrix. The first equation satisfies the restrictions a′1φ = 0, which can be rewritten as ι′1Aφ = 0, where ι′1 is the first row of the identity matrix IG. F must satisfy the restriction that (first row of FA)φ = 0. But the first row of FA is the first row of F, say f′1, times A. This means that f′1(Aφ) = 0. For the first equation to be identified, this condition on the transformed first equation must be equivalent to ι′1Aφ = 0, up to a scalar constant. This holds if and only if f′1 is a scalar multiple of ι′1, and the latter condition holds if and only if rank(Aφ) = G − 1. The latter is known as the rank condition for identification.

Example (A.5): Consider the simple Keynesian model given in example 1. The second equation is an identity and the first equation satisfies the order condition of identification, since It is the excluded exogenous variable from that equation, and there is only one right hand side included endogenous variable Yt. In fact, the first equation is just-identified. Note that

A = [B, Γ] = [ 1   −β   −α    0 ]
             [ −1   1    0   −1 ]     (A.20)

and φ for the first equation consists of only one restriction, namely that It is not in that equation, or γ12 = 0. This makes φ1 = (0, 0, 0, 1)′, since a′1φ1 = 0 gives γ12 = 0. From (A.20),

Aφ1 = (γ12, γ22)′ = (0, −1)′ and rank(Aφ1) = 1 = G − 1. Hence, the rank condition holds for the first equation and it is identified. Problem 8 reconsiders example (A.1), where both equations are just-identified by the order condition of identification, and asks the reader to show that both satisfy the rank condition of identification as long as c ≠ 0 and f ≠ 0.
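The rank computation can also be done numerically. Writing out A = [B, Γ] for the Keynesian model with illustrative values for α and β (assumed here, not given in the text), a minimal numpy sketch confirms rank(Aφ1) = 1:

```python
import numpy as np

alpha, beta = 10.0, 0.8                        # illustrative values
A = np.array([[ 1.0, -beta, -alpha,  0.0],     # consumption equation
              [-1.0,  1.0,    0.0,  -1.0]])    # income identity
phi1 = np.array([[0.0], [0.0], [0.0], [1.0]])  # restriction gamma_12 = 0

Aphi = A @ phi1                          # (gamma_12, gamma_22)' = (0, -1)'
rank = np.linalg.matrix_rank(Aphi)
print(Aphi.ravel(), rank)                # rank = 1 = G - 1: identified
```

The result does not depend on the particular values of α and β: the selected column of A is (0, −1)′ for any parameter values, so the rank condition holds throughout the parameter space.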


The reduced form of the simple Keynesian model is given in equations (11.4) and (11.5). In fact,

π11 = π21 = α/(1 − β), π12 = β/(1 − β), π22 = 1/(1 − β) (A.21)

so that

β = π12/π22 and α = π11/π22 (A.22)

Therefore, the structural parameters of the consumption equation can be retrieved from the reduced form coefficients. However, what happens if we replace Π by its OLS estimate Π̂OLS? Would the solution in (A.22) lead to two estimates of (α, β), or would this solution lead to a unique estimate? In this case, the consumption equation is just-identified and the solution in (A.22) is unique. To show this, recall that

π̂12 = mci/mii (A.23)

π̂22 = myi/mii (A.24)

Solving for β, using (A.22), one gets

β̂ = π̂12/π̂22 = mci/myi (A.25)


(A.24) and (A.25) lead to a unique solution because equation (11.2) gives

myi = mci + mii (A.26)

In general, we would not be able to solve for the structural parameters of an unidentified equation in terms of the reduced form parameters. However, when this equation is identified, replacing the reduced form parameters by their OLS estimates leads to a unique estimate of the structural parameters only if this equation is just-identified, and to more than one estimate depending upon the degree of over-identification. Problem 8 gives another example of the just-identified case, while problem 9 considers a model with one unidentified and another over-identified equation.
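As an illustrative simulation (all parameter values assumed, not from the text), indirect least squares on the just-identified consumption equation recovers a unique estimate of (α, β) via (A.25):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, n = 10.0, 0.8, 100_000     # assumed structural values

I = rng.uniform(5.0, 15.0, n)           # exogenous investment
u = rng.normal(0.0, 1.0, n)
Y = (alpha + I + u) / (1.0 - beta)      # reduced form for income
C = alpha + beta * Y + u                # structural consumption equation

# Indirect least squares: beta_hat = m_CI / m_YI, as in (A.25)
m_CI = np.cov(C, I)[0, 1]
m_YI = np.cov(Y, I)[0, 1]
beta_hat = m_CI / m_YI
alpha_hat = C.mean() - beta_hat * Y.mean()
print(beta_hat, alpha_hat)              # close to (0.8, 10.0)
```

Because It is exogenous, the moment ratio converges to β; OLS of C on Y, by contrast, would be inconsistent here since Y is correlated with u.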

Example (A.6): Equations (11.13) and (11.14) give an unidentified demand and supply model with

B = [ 1  −β ]   and   Γ = [ −α ]
    [ 1  −δ ]             [ −γ ]     (A.27)

The reduced form equations given by (11.16) and (11.17) yield

Π = −B⁻¹Γ = [ π11 ] = [ (αδ − γβ)/(δ − β) ]
            [ π21 ]   [ (α − γ)/(δ − β)  ]     (A.28)

Note that one cannot solve for (α, β) nor (γ, δ) in terms of (π11, π21) without further restrictions.
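A concrete way to see this (with parameter values assumed purely for illustration): two different structural parameter vectors generate the identical reduced form of (A.28), so the data cannot distinguish between them:

```python
from fractions import Fraction as F

def reduced_form(alpha, beta, gamma, delta):
    """Pi = -B^{-1} Gamma for the model in (A.27): returns (pi11, pi21)."""
    pi11 = F(alpha * delta - gamma * beta, delta - beta)
    pi21 = F(alpha - gamma, delta - beta)
    return pi11, pi21

# Two different (alpha, beta, gamma, delta) vectors, same reduced form:
print(reduced_form(10, -1, 2, 1))     # demand slope -1, supply slope 1
print(reduced_form(14, -2, -2, 2))    # different structure, same (6, 4)
```

Both parameter vectors map to (π11, π21) = (6, 4): they are observationally equivalent, which is exactly what it means for the model to be unidentified.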

