# Cramér-Rao Lower Bound

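The Cramér-Rao lower bound gives a useful lower bound (in the matrix sense) for the variance-covariance matrix of an unbiased vector estimator. In this section we shall prove a general theorem that will be applied to Model 1 with normality in the next two subsections.

Theorem 1.3.1 (Cramér-Rao). Let $z$ be an $n$-component vector of random variables (not necessarily independent) the joint density of which is given by $L(z, \theta)$, where $\theta$ is a $K$-component vector of parameters in some parameter space $\Theta$. Let $\hat\theta(z)$ be an unbiased estimator of $\theta$ with a finite variance-covariance matrix. Furthermore, assume that $L(z, \theta)$ and $\hat\theta(z)$ satisfy

$$
\begin{aligned}
&\text{(A)} \quad E\,\frac{\partial \log L}{\partial \theta} = 0, \\
&\text{(B)} \quad E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'} = -E\,\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'}, \\
&\text{(C)} \quad E\,\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'} > 0, \\
&\text{(D)} \quad \int \frac{\partial L}{\partial \theta}\,\hat\theta'\,dz = I.
\end{aligned}
$$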

Note: $\partial \log L/\partial \theta$ is a $K$-vector, so that 0 in assumption A is a vector of $K$ zeroes. Assumption C means that the left-hand side is a positive definite matrix. The integral in assumption D is an $n$-tuple integral over the whole domain of the Euclidean $n$-space because $z$ is an $n$-vector. Finally, $I$ in assumption D is the identity matrix of size $K$.

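Then we have for $\theta \in \Theta$

$$
V(\hat\theta) \geq \left[-E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'}\right]^{-1}. \tag{1.3.8}
$$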

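Proof. Define $P = E(\hat\theta - \theta)(\hat\theta - \theta)'$, $Q = E(\hat\theta - \theta)(\partial \log L/\partial \theta')$, and $R = E(\partial \log L/\partial \theta)(\partial \log L/\partial \theta')$. Then

$$
\begin{bmatrix} P & Q \\ Q' & R \end{bmatrix} \geq 0 \tag{1.3.9}
$$

because the left-hand side is a variance-covariance matrix. Premultiply both sides of (1.3.9) by $[I, -QR^{-1}]$ and postmultiply by $[I, -QR^{-1}]'$. Then we get

$$
P - QR^{-1}Q' \geq 0, \tag{1.3.10}
$$

where $R^{-1}$ can be defined because of assumption C. But we have

$$
\begin{aligned}
Q &= E\left[(\hat\theta - \theta)\,\frac{\partial \log L}{\partial \theta'}\right] \\
&= E\left[\hat\theta\,\frac{\partial \log L}{\partial \theta'}\right] \quad \text{by assumption A} \\
&= E\left[\hat\theta\,\frac{1}{L}\,\frac{\partial L}{\partial \theta'}\right] \\
&= \int \left[\hat\theta\,\frac{1}{L}\,\frac{\partial L}{\partial \theta'}\right] L\,dz \\
&= \int \hat\theta\,\frac{\partial L}{\partial \theta'}\,dz \\
&= I \quad \text{by assumption D.}
\end{aligned} \tag{1.3.11}
$$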

Therefore (1.3.8) follows from (1.3.10), (1.3.11), and assumption B.
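
The theorem can be illustrated by simulation in a simple scalar case. The sketch below is not part of the original text: it assumes an i.i.d. sample from $N(\mu, \sigma^2)$ with $\sigma$ known, so that $\theta = \mu$, and takes the sample mean as the unbiased estimator. The function names and parameter values are hypothetical. It estimates the mean of the score (assumption A), the expected squared score (the right-hand side of assumption B, up to sign), and the variance of the estimator, which in this model should be close to the reciprocal of $n/\sigma^2$.

```python
import random
import statistics


def score(z, mu, sigma):
    """d log L / d mu for an i.i.d. N(mu, sigma^2) sample with sigma known."""
    return sum(zi - mu for zi in z) / sigma ** 2


def simulate(n=20, mu=1.5, sigma=2.0, reps=20000, seed=0):
    """Monte Carlo estimates of E[score], E[score^2], and V(sample mean)."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    scores, means = [], []
    for _ in range(reps):
        z = [rng.gauss(mu, sigma) for _ in range(n)]
        scores.append(score(z, mu, sigma))
        means.append(sum(z) / n)  # the unbiased estimator of mu
    mean_score = statistics.fmean(scores)                      # assumption A: about 0
    outer_product = statistics.fmean([s * s for s in scores])  # E[(d log L/d mu)^2]
    hessian_info = n / sigma ** 2   # -E[d^2 log L/d mu^2], exact in this model
    var_mean = statistics.pvariance(means)                     # V(estimator)
    return mean_score, outer_product, hessian_info, var_mean


if __name__ == "__main__":
    mean_score, outer_product, hessian_info, var_mean = simulate()
    print("E[score]          ~", mean_score)
    print("E[score^2]        ~", outer_product)
    print("-E[second deriv.]  =", hessian_info)
    print("V(sample mean)    ~", var_mean, " bound =", 1.0 / hessian_info)
```

In this model the simulated variance of the sample mean should agree with the bound up to Monte Carlo error, since the sample mean is known to attain it here.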

The inverse of the lower bound, namely, $-E\,\partial^2 \log L/\partial \theta\,\partial \theta'$, is called the information matrix. R. A. Fisher first introduced this term as a certain measure of the information contained in the sample (see Rao, 1973, p. 329).
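
To make the matrix sense of the bound concrete, the following sketch (an illustration added here, not taken from the original text) uses the standard closed-form results for an i.i.d. $N(\mu, \sigma^2)$ sample with $\theta = (\mu, \sigma^2)'$: the information matrix is diagonal with elements $n/\sigma^2$ and $n/(2\sigma^4)$, and the sample mean and the unbiased sample variance are independent with variances $\sigma^2/n$ and $2\sigma^4/(n-1)$. The function names are hypothetical. The code checks that the covariance matrix of the estimator minus the inverse information matrix is positive semidefinite.

```python
def information_matrix(n, sigma2):
    """-E d^2 log L / d theta d theta' for theta = (mu, sigma^2), normal sample."""
    return [[n / sigma2, 0.0], [0.0, n / (2.0 * sigma2 ** 2)]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def estimator_covariance(n, sigma2):
    """Covariance of (sample mean, unbiased sample variance) under normality."""
    return [[sigma2 / n, 0.0], [0.0, 2.0 * sigma2 ** 2 / (n - 1)]]


def bound_gap(n, sigma2):
    """V(estimator) minus the inverse information matrix."""
    v = estimator_covariance(n, sigma2)
    lower = inverse_2x2(information_matrix(n, sigma2))
    return [[v[i][j] - lower[i][j] for j in range(2)] for i in range(2)]


def is_psd_2x2(m, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    (a, b), (c, d) = m
    return a >= -tol and d >= -tol and a * d - b * c >= -tol


if __name__ == "__main__":
    gap = bound_gap(n=10, sigma2=4.0)
    print("gap matrix:", gap)
    print("positive semidefinite:", is_psd_2x2(gap))
```

Under these assumptions the gap is zero in the element for $\mu$ and equal to $2\sigma^4/[n(n-1)]$ in the element for $\sigma^2$, so this particular estimator satisfies the bound without attaining it in every direction.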

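The Cramér-Rao lower bound may not be attained by any unbiased estimator. If it is, we have found the best unbiased estimator. In the next section we shall show that the maximum likelihood estimator of the regression parameter attains the lower bound in Model 1 with normality.

Assumptions A, B, and D seem arbitrary at first glance. We shall now inquire into the significance of these assumptions. Assumptions A and B are equivalent to

$$
\begin{aligned}
&\text{(A}'\text{)} \quad \int \frac{\partial L}{\partial \theta}\,dz = 0, \\
&\text{(B}'\text{)} \quad \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\,dz = 0,
\end{aligned}
$$

respectively. The equivalence between assumptions A and A' follows from

$$
\begin{aligned}
E\,\frac{\partial \log L}{\partial \theta} &= E\left[\frac{1}{L}\,\frac{\partial L}{\partial \theta}\right] \\
&= \int \left[\frac{1}{L}\,\frac{\partial L}{\partial \theta}\right] L\,dz \\
&= \int \frac{\partial L}{\partial \theta}\,dz.
\end{aligned} \tag{1.3.12}
$$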

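The equivalence between assumptions B and B' follows from

$$
\begin{aligned}
E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'} &= E\left\{\frac{\partial}{\partial \theta}\left[\frac{1}{L}\,\frac{\partial L}{\partial \theta'}\right]\right\} \\
&= E\left[-\frac{1}{L^2}\,\frac{\partial L}{\partial \theta}\,\frac{\partial L}{\partial \theta'} + \frac{1}{L}\,\frac{\partial^2 L}{\partial \theta\,\partial \theta'}\right] \\
&= -E\left[\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'}\right] + \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\,dz.
\end{aligned} \tag{1.3.13}
$$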

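Furthermore, assumptions A', B', and D are equivalent to

$$
\begin{aligned}
&\text{(A}''\text{)} \quad \int \frac{\partial L}{\partial \theta}\,dz = \frac{\partial}{\partial \theta} \int L\,dz, \\
&\text{(B}''\text{)} \quad \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\,dz = \frac{\partial}{\partial \theta} \int \frac{\partial L}{\partial \theta'}\,dz, \\
&\text{(D}'\text{)} \quad \int \frac{\partial L}{\partial \theta}\,\hat\theta'\,dz = \frac{\partial}{\partial \theta} \int L\,\hat\theta'\,dz,
\end{aligned}
$$

because the right-hand sides of assumptions A'', B'', and D' are 0, 0, and $I$, respectively. Written thus, the three assumptions are all of the form

$$
\frac{\partial}{\partial \theta} \int f(z, \theta)\,dz = \int \frac{\partial f(z, \theta)}{\partial \theta}\,dz; \tag{1.3.14}
$$

in other words, the operations of differentiation and integration can be interchanged. In the next theorem we shall give sufficient conditions for (1.3.14) to hold in the case where $\theta$ is a scalar. The case for a vector $\theta$ can be treated in essentially the same manner, although the notation gets more complex.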
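
The interchange of differentiation and integration can be checked numerically in a simple case. The sketch below is an added illustration with hypothetical function names: it assumes a single observation $z$ from $N(\theta, 1)$ and takes $f(z, \theta) = w(z)L(z, \theta)$, where $w(z) = 1$ corresponds to the density itself and $w(z) = z$ corresponds to the density multiplied by the estimator $\hat\theta(z) = z$. The left-hand side is approximated by a central difference of the integral, the right-hand side by integrating the analytic derivative; both integrals use a trapezoidal rule on a truncated range.

```python
import math


def density(z, theta):
    """N(theta, 1) density evaluated at z."""
    return math.exp(-0.5 * (z - theta) ** 2) / math.sqrt(2.0 * math.pi)


def d_density(z, theta):
    """Analytic derivative of the density with respect to theta."""
    return (z - theta) * density(z, theta)


def integrate(g, lo=-12.0, hi=12.0, steps=4800):
    """Composite trapezoidal rule on [lo, hi]; the tails beyond are negligible here."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h


def both_sides(weight, theta=0.7, eps=1e-4):
    """Return (d/d theta of the integral, integral of d f/d theta) for f = weight * density."""
    upper = integrate(lambda z: weight(z) * density(z, theta + eps))
    lower = integrate(lambda z: weight(z) * density(z, theta - eps))
    lhs = (upper - lower) / (2.0 * eps)  # central difference in theta
    rhs = integrate(lambda z: weight(z) * d_density(z, theta))
    return lhs, rhs


if __name__ == "__main__":
    print("w(z) = 1:", both_sides(lambda z: 1.0))
    print("w(z) = z:", both_sides(lambda z: z))
```

With $w(z) = 1$ both sides should be numerically zero, and with $w(z) = z$ both should be close to one, which are the scalar versions of the values 0 and $I$ in this example.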

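Theorem 1.3.2. If (i) $\partial f(z, \theta)/\partial \theta$ is continuous in $\theta \in \Theta$ and $z$, where $\Theta$ is an open set, (ii) $\int f(z, \theta)\,dz$ exists, and (iii) $\int |\partial f(z, \theta)/\partial \theta|\,dz < M < \infty$ for all $\theta \in \Theta$, then (1.3.14) holds.

Proof. We have, using assumptions (i) and (ii),

$$
\begin{aligned}
\left|\int \left[\frac{f(z, \theta + h) - f(z, \theta)}{h} - \frac{\partial f}{\partial \theta}(z, \theta)\right] dz\right|
&\leq \int \left|\frac{f(z, \theta + h) - f(z, \theta)}{h} - \frac{\partial f}{\partial \theta}(z, \theta)\right| dz \\
&= \int \left|\frac{\partial f}{\partial \theta}(z, \theta^*) - \frac{\partial f}{\partial \theta}(z, \theta)\right| dz,
\end{aligned}
$$

where $\theta^*$ is between $\theta$ and $\theta + h$. Next, we can write the last integral as $\int = \int_A + \int_{\bar A}$, where $A$ is a sufficiently large compact set in the domain of $z$ and $\bar A$ is its complement. But we can make $\int_A$ sufficiently small because of (i) and $\int_{\bar A}$ sufficiently small because of (iii).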