# The Estimator and a Fundamental Decomposition

The introduction provides the essence of GMM. In this section, we discuss the form of the estimator in more detail, and also describe how estimation effects a fundamental decomposition of the population moment condition.

Recall that the GMM estimator of $\theta_0$ based on the population moment condition in Assumption 3 is:

$$\hat{\theta}_T = \arg\min_{\theta \in \Theta} Q_T(\theta), \tag{11.8}$$

where $Q_T(\theta) = g_T(\theta)' W_T g_T(\theta)$. For this approach to make sense, $Q_T(\theta)$ must be a meaningful measure of distance, and hence the weighting matrix must possess certain properties. Specifically, $W_T$ must be positive semi-definite for finite $T$ so that $Q_T(\theta)$ is nonnegative and equals zero if $g_T(\theta) = 0$. However, semi-definiteness leaves open the possibility that $Q_T(\theta) = 0$ without $g_T(\theta) = 0$. Since all our statistical analysis is based on asymptotic theory, it is only necessary to rule out this possibility in the limit. Accordingly, $W_T$ is assumed to satisfy:

Assumption 6. Properties of the Weighting Matrix. $W_T$ is a positive semi-definite matrix which converges in probability to the positive definite matrix of constants $W$.

If Assumption 5 holds, and in most cases of interest it will, then the first order conditions for this minimization imply $\partial Q_T(\hat{\theta}_T)/\partial\theta = 0$. This condition yields$^6$

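$$\left\{ T^{-1} \sum_{t=1}^{T} \frac{\partial f(v_t, \hat{\theta}_T)}{\partial \theta'} \right\}' W_T \left\{ T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_T) \right\} = 0. \tag{11.9}$$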

However, there is typically no closed form solution for $\hat{\theta}_T$, and so the estimator must be obtained via numerical optimization techniques.$^7$
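As an illustration of what such a numerical search can look like, the sketch below applies Gauss-Newton iterations to the first order condition (11.9). Everything specific in it is an assumption made for the example and is not taken from the text: the exponential regression moment function $f(v_t, \theta) = z_t\,(y_t - e^{\theta x_t})$, the simulated data, the identity weighting matrix, and the helper names `moments`, `jacobian`, and `gmm_gauss_newton`. Gauss-Newton is only one of many optimizers that could be used here.

```python
import numpy as np

def moments(theta, y, x, Z):
    """Sample moment g_T(theta) = T^{-1} sum_t z_t (y_t - exp(theta * x_t))."""
    resid = y - np.exp(theta * x)
    return Z.T @ resid / len(y)

def jacobian(theta, x, Z):
    """T^{-1} sum_t d f(v_t, theta) / d theta', a q x 1 matrix because p = 1."""
    return -(Z.T @ (x * np.exp(theta * x)) / len(x)).reshape(-1, 1)

def gmm_gauss_newton(y, x, Z, W, theta_init=0.0, tol=1e-10, max_iter=100):
    """Iterate theta <- theta - (G'WG)^{-1} G'W g until the step is negligible."""
    theta = np.array([theta_init], dtype=float)
    for _ in range(max_iter):
        g = moments(theta[0], y, x, Z)
        G = jacobian(theta[0], x, Z)
        step = np.linalg.solve(G.T @ W @ G, G.T @ W @ g)
        theta = theta - step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Simulated data (illustrative assumption): theta_0 = 0.5, q = 3 moments, p = 1.
rng = np.random.default_rng(0)
T = 2000
x = rng.normal(size=T)
y = np.exp(0.5 * x) + 0.1 * rng.normal(size=T)
Z = np.column_stack([np.ones(T), x, x ** 2])   # instruments z_t
W = np.eye(3)                                  # one admissible weighting matrix

theta_hat = gmm_gauss_newton(y, x, Z, W)

# Left-hand side of the first order condition (11.9) at the estimate.
foc = jacobian(theta_hat[0], x, Z).T @ W @ moments(theta_hat[0], y, x, Z)
print(theta_hat, foc)
```

The stopping rule monitors the step size, so at convergence the left-hand side of (11.9), stored in `foc`, is numerically close to zero.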

This characterization of the estimator yields an interesting interpretation of GMM. Inspection of (11.9) reveals that the GMM estimator based on $E[f(v_t, \theta_0)] = 0$ can be interpreted as a method of moments estimator based on

$$\left\{ E\left[ \partial f(v_t, \theta_0)/\partial \theta' \right] \right\}' W E[f(v_t, \theta_0)] = 0. \tag{11.10}$$

Therefore, GMM is an MM estimator based on the information that the $p$ linear combinations of $E[f(v_t, \theta_0)]$ in (11.10) are zero. Notice that if the assumptions behind Lemma 1 hold, then these $p$ linear combinations are linearly independent, and so the MM interpretation emphasizes the fundamental connection between identification and estimation: the $p$ parameters are only locally identified if the estimation is based on $p$ linearly independent equations. If $p = q$ then (11.10) is equivalent to $E[f(v_t, \theta_0)] = 0$, and we note parenthetically that this means the weighting matrix plays no role in the analysis. However, if $q > p$ then there is a difference between the information used in estimation and the original population moment condition.
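The just-identified case can be spelled out in one line. The following is a sketch under two assumptions: the matrix $G_0 = E[\partial f(v_t, \theta_0)/\partial \theta']$ (a label introduced here for brevity) has full column rank $p$, as the linear independence noted above requires, and $W$ is positive definite as in Assumption 6. When $p = q$, $G_0$ is a nonsingular $q \times q$ matrix, so $G_0' W$ is invertible and

$$
G_0' W E[f(v_t, \theta_0)] = 0
\quad \Longrightarrow \quad
E[f(v_t, \theta_0)] = (G_0' W)^{-1}\, 0 = 0 ,
$$

whatever the choice of $W$. When $q > p$, $G_0' W$ is $p \times q$ and cannot be inverted, which is why the two sets of conditions then differ.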

The advantage of this MM interpretation is that it makes explicit the information used in GMM estimation. However, it is not particularly amenable to the characterization of what information is left out because (11.10) is a system of $p < q$ equations. Sowell (1996) shows that this problem can be circumvented if (11.10) is interpreted in terms of the transformed moment condition $W^{1/2} E[f(v_t, \theta_0)]$. Equation (11.10) can then be rewritten as

$$F(\theta_0)' W^{1/2} E[f(v_t, \theta_0)] = 0, \tag{11.11}$$

where $F(\theta_0) = W^{1/2} E[\partial f(v_t, \theta_0)/\partial \theta']$. Equation (11.11) states that $W^{1/2} E[f(v_t, \theta_0)]$ lies in the null space of $F(\theta_0)'$. Sowell (1996) shows that the information about $\theta_0$ in (11.10) is equivalent to the information in

$$F(\theta_0) [F(\theta_0)' F(\theta_0)]^{-1} F(\theta_0)' W^{1/2} E[f(v_t, \theta_0)] = 0. \tag{11.12}$$

Formally, (11.12) states that the least squares projection of $W^{1/2} E[f(v_t, \theta_0)]$ on to the column space of $F(\theta_0)$ is zero, and thereby places

$$\operatorname{rank}\left\{ F(\theta_0) [F(\theta_0)' F(\theta_0)]^{-1} F(\theta_0)' \right\} = p$$

restrictions on the transformed population moment condition. Sowell (1996) refers to (11.12) as the identifying restrictions.

The advantage of this alternative characterization is that (11.12) is a system of $q$ equations, and so it is now possible to characterize the part of $E[f(v_t, \theta_0)] = 0$ which is ignored in estimation. By definition, this remainder is

$$\left\{ I_q - F(\theta_0) [F(\theta_0)' F(\theta_0)]^{-1} F(\theta_0)' \right\} W^{1/2} E[f(v_t, \theta_0)] = 0, \tag{11.13}$$

which represents what Hansen referred to as the overidentifying restrictions in his original article. Equation (11.13) states that the projection of $W^{1/2} E[f(v_t, \theta_0)]$ on to the orthogonal complement of the column space of $F(\theta_0)$ is zero, and thereby places $q - p$ restrictions on the transformed population moment condition.

Equations (11.12) and (11.13) indicate that GMM estimation effects a fundamental decomposition of the $q \times 1$ population moment condition into $p$ identifying restrictions upon which estimation is based, and $q - p$ overidentifying restrictions which are ignored in estimation. Notice also that these two sets of restrictions are orthogonal due to the projection matrix structure.
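The projection structure behind this decomposition can be checked numerically. The sketch below is only an illustration: the matrix `F` is a randomly drawn stand-in for $F(\theta_0)$ with $q = 4$ and $p = 2$ (assumed values), not a quantity from any particular model.

```python
import numpy as np

rng = np.random.default_rng(0)
q, p = 4, 2
F = rng.normal(size=(q, p))              # stand-in for F(theta_0); full column rank here

# Projection onto the column space of F, and onto its orthogonal complement.
P = F @ np.linalg.solve(F.T @ F, F.T)    # F (F'F)^{-1} F'
M = np.eye(q) - P                        # I_q - F (F'F)^{-1} F'

rank_P = np.linalg.matrix_rank(P, tol=1e-8)   # number of identifying restrictions
rank_M = np.linalg.matrix_rank(M, tol=1e-8)   # number of overidentifying restrictions
cross = P @ M                                 # zero matrix: the two parts are orthogonal

print(rank_P, rank_M, np.max(np.abs(cross)))
```

The two ranks sum to $q$, and the product `P @ M` vanishes up to rounding error, which is the orthogonality referred to above.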

The roles of the two sets of restrictions are reflected in their sample counterparts. Since the identifying restrictions represent the information used in estimation, their sample analogs are satisfied at $\hat{\theta}_T$ by construction. In contrast, the overidentifying restrictions are ignored in estimation and so their sample analog is not satisfied. However, they can be used to give a useful interpretation to the GMM minimand. From (11.9), it follows that

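$$W_T^{1/2} T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_T) = \left\{ I_q - F_T(\hat{\theta}_T) [F_T(\hat{\theta}_T)' F_T(\hat{\theta}_T)]^{-1} F_T(\hat{\theta}_T)' \right\} W_T^{1/2} T^{-1} \sum_{t=1}^{T} f(v_t, \hat{\theta}_T), \tag{11.14}$$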

where $F_T(\theta) = W_T^{1/2} T^{-1} \sum_{t=1}^{T} \partial f(v_t, \theta)/\partial \theta'$. Therefore, $Q_T(\hat{\theta}_T)$ can be interpreted as a measure of how far the sample is from satisfying the overidentifying restrictions.
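A small numerical check may make this interpretation concrete. The sketch below uses a linear model with instruments, $f(v_t, \theta) = z_t (y_t - x_t \theta)$, chosen purely because it is one of the exceptional cases where the GMM estimator has a closed form; the simulated data, the choice of $W_T$, and all variable names are assumptions made for the illustration. It computes the sample analogs of the identifying and overidentifying restrictions at the estimate and compares the minimand with the squared length of the latter.

```python
import numpy as np

rng = np.random.default_rng(1)
T, q = 500, 3
Z = rng.normal(size=(T, q))                        # instruments z_t (q = 3)
x = Z @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=T)
y = 2.0 * x + rng.normal(size=T)                   # theta_0 = 2, so p = 1
X = x.reshape(-1, 1)

W = np.linalg.inv(Z.T @ Z / T)                     # one positive definite choice of W_T

# Closed-form GMM estimator for the linear model.
A = X.T @ Z / T
theta_hat = np.linalg.solve(A @ W @ A.T, A @ W @ (Z.T @ y / T))

g = Z.T @ (y - X @ theta_hat) / T                  # T^{-1} sum_t f(v_t, theta_hat)
G = -(Z.T @ X) / T                                 # T^{-1} sum_t d f / d theta'

# Symmetric square root of W_T via its eigendecomposition.
eigval, eigvec = np.linalg.eigh(W)
W_half = eigvec @ np.diag(np.sqrt(eigval)) @ eigvec.T

F = W_half @ G                                     # F_T(theta_hat)
P = F @ np.linalg.solve(F.T @ F, F.T)              # projection onto col space of F_T
M = np.eye(q) - P                                  # projection onto its complement
u = W_half @ g                                     # transformed sample moment

identifying = P @ u                                # sample identifying restrictions
overidentifying = M @ u                            # sample overidentifying restrictions
Q = g @ W @ g                                      # the GMM minimand at theta_hat

print(identifying, overidentifying, Q)
```

In this sketch `identifying` is zero up to rounding error, `overidentifying` coincides with the transformed sample moment as in (11.14), and `Q` equals the squared Euclidean length of `overidentifying`.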