Cramér-Rao Lower Bound

The Cramér-Rao lower bound gives a useful lower bound (in the matrix sense) for the variance-covariance matrix of an unbiased vector estimator.4 In this section we shall prove a general theorem that will be applied to Model 1 with normality in the next two subsections.

Theorem 1.3.1 (Cramér-Rao). Let $z$ be an $n$-component vector of random variables (not necessarily independent) the joint density of which is given by $L(z, \theta)$, where $\theta$ is a $K$-component vector of parameters in some parameter space $\Theta$. Let $\hat\theta(z)$ be an unbiased estimator of $\theta$ with a finite variance-covariance matrix. Furthermore, assume that $L(z, \theta)$ and $\hat\theta(z)$ satisfy

$$(\mathrm{A})\qquad E\,\frac{\partial \log L}{\partial \theta} = 0,$$

$$(\mathrm{B})\qquad E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'} = -E\,\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'},$$

$$(\mathrm{C})\qquad E\,\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'} > 0,$$

$$(\mathrm{D})\qquad \int \hat\theta\,\frac{\partial L}{\partial \theta'}\, dz = I.$$

Note: $\partial \log L/\partial \theta$ is a $K$-vector, so that $0$ in assumption A is a vector of $K$ zeroes. Assumption C means that the left-hand side is a positive definite matrix. The integral in assumption D is an $n$-tuple integral over the whole domain of the Euclidean $n$-space because $z$ is an $n$-vector. Finally, $I$ in assumption D is the identity matrix of size $K$.

Then we have for $\theta \in \Theta$

$$E(\hat\theta - \theta)(\hat\theta - \theta)' \ge \left[-E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'}\right]^{-1}. \tag{1.3.8}$$
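As a simple scalar illustration of (1.3.8), let $z_1, \ldots, z_n$ be i.i.d. $N(\mu, \sigma^2)$ with $\sigma^2$ known, so that $\theta = \mu$ and $K = 1$. Then

$$\log L = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (z_i - \mu)^2, \qquad -E\,\frac{\partial^2 \log L}{\partial \mu^2} = \frac{n}{\sigma^2},$$

so the lower bound is $\sigma^2/n$. The sample mean $\bar z$ is unbiased with variance exactly $\sigma^2/n$, so the bound is attained in this case.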

Proof. Define $P = E(\hat\theta - \theta)(\hat\theta - \theta)'$, $Q = E(\hat\theta - \theta)\dfrac{\partial \log L}{\partial \theta'}$, and $R = E\dfrac{\partial \log L}{\partial \theta}\dfrac{\partial \log L}{\partial \theta'}$. Then

$$\begin{bmatrix} P & Q \\ Q' & R \end{bmatrix} \ge 0 \tag{1.3.9}$$

because the left-hand side is the variance-covariance matrix of the stacked vector $[(\hat\theta - \theta)',\ \partial \log L/\partial \theta']'$ (recall assumption A and the unbiasedness of $\hat\theta$). Premultiply both sides of (1.3.9) by $[I, -QR^{-1}]$ and postmultiply by $[I, -QR^{-1}]'$, where $R^{-1}$ can be defined because of assumption C. Then we get

$$P - QR^{-1}Q' \ge 0. \tag{1.3.10}$$
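Carrying out the premultiplication and postmultiplication explicitly shows why (1.3.10) holds: for any conformable matrix $F$, $FMF' \ge 0$ whenever $M \ge 0$, and here (using the symmetry of $R$)

$$\begin{bmatrix} I & -QR^{-1} \end{bmatrix} \begin{bmatrix} P & Q \\ Q' & R \end{bmatrix} \begin{bmatrix} I \\ -R^{-1}Q' \end{bmatrix} = P - QR^{-1}Q' - QR^{-1}Q' + QR^{-1}RR^{-1}Q' = P - QR^{-1}Q'.$$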

But we have

$$\begin{aligned}
Q &= E\,\hat\theta\,\frac{\partial \log L}{\partial \theta'} - \theta\, E\,\frac{\partial \log L}{\partial \theta'} = E\,\hat\theta\,\frac{\partial \log L}{\partial \theta'} \quad \text{by assumption A} \\
&= \int \hat\theta\,\frac{\partial L}{\partial \theta'}\, dz = I \quad \text{by assumption D.}
\end{aligned} \tag{1.3.11}$$

Therefore (1.3.10) and (1.3.11) give $P \ge R^{-1}$, and (1.3.8) follows because assumption B implies $R = -E\,\partial^2 \log L/\partial \theta\,\partial \theta'$.

The inverse of the lower bound, namely, $-E\,\partial^2 \log L/\partial \theta\,\partial \theta'$, is called the information matrix. R. A. Fisher first introduced this term as a certain measure of the information contained in the sample (see Rao, 1973, p. 329).
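For example, for an i.i.d. sample $z_1, \ldots, z_n$ from $N(\mu, \sigma^2)$ with $\theta = (\mu, \sigma^2)'$, a direct calculation gives the information matrix

$$-E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'} = \begin{bmatrix} n/\sigma^2 & 0 \\ 0 & n/(2\sigma^4) \end{bmatrix},$$

so the Cramér-Rao bound for an unbiased estimator of $\mu$ is $\sigma^2/n$ and for an unbiased estimator of $\sigma^2$ is $2\sigma^4/n$.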

The Cramér-Rao lower bound may not be attained by any unbiased estimator. If it is, we have found the best unbiased estimator. In the next section we shall show that the maximum likelihood estimator of the regression parameter attains the lower bound in Model 1 with normality.
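As a quick numerical check of attainment in the scalar example above (a minimal sketch, not from the original text; the sample size, parameter values, and replication count are arbitrary illustrative choices), one can simulate the variance of the sample mean and compare it with the bound $\sigma^2/n$:

```python
import numpy as np

# Monte Carlo check: for i.i.d. N(mu, sigma^2) data with sigma^2 known,
# the information for mu is n/sigma^2, so the Cramer-Rao bound is sigma^2/n.
# The sample mean is unbiased for mu and should attain this bound.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 1.0, 2.0, 50, 100_000  # arbitrary illustrative values

# Each row is one sample of size n; take the sample mean of each row.
estimates = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print("simulated variance of the sample mean:", estimates.var())
print("Cramer-Rao lower bound sigma^2 / n:   ", sigma**2 / n)
```

The two printed numbers should agree up to simulation noise (both near $0.08$ for these values).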

Assumptions A, B, and D seem arbitrary at first glance. We shall now inquire into the significance of these assumptions. Assumptions A and B are equivalent to

$$(\mathrm{A}')\qquad \int \frac{\partial L}{\partial \theta}\, dz = 0,$$

$$(\mathrm{B}')\qquad \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\, dz = 0,$$

respectively. The equivalence between assumptions A and A′ follows from

$$E\,\frac{\partial \log L}{\partial \theta} = \int \frac{\partial \log L}{\partial \theta}\, L\, dz = \int \frac{1}{L}\,\frac{\partial L}{\partial \theta}\, L\, dz = \int \frac{\partial L}{\partial \theta}\, dz. \tag{1.3.12}$$

The equivalence between assumptions B and B′ follows from

$$E\,\frac{\partial^2 \log L}{\partial \theta\,\partial \theta'} = \int \left[\frac{1}{L}\,\frac{\partial^2 L}{\partial \theta\,\partial \theta'} - \frac{1}{L^2}\,\frac{\partial L}{\partial \theta}\,\frac{\partial L}{\partial \theta'}\right] L\, dz = \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\, dz - E\,\frac{\partial \log L}{\partial \theta}\,\frac{\partial \log L}{\partial \theta'}. \tag{1.3.13}$$

Furthermore, assumptions A′, B′, and D are equivalent to

$$(\mathrm{A}'')\qquad \int \frac{\partial L}{\partial \theta}\, dz = \frac{\partial}{\partial \theta} \int L\, dz,$$

$$(\mathrm{B}'')\qquad \int \frac{\partial^2 L}{\partial \theta\,\partial \theta'}\, dz = \frac{\partial^2}{\partial \theta\,\partial \theta'} \int L\, dz,$$

$$(\mathrm{D}')\qquad \int \hat\theta\,\frac{\partial L}{\partial \theta'}\, dz = \frac{\partial}{\partial \theta'} \int \hat\theta\, L\, dz,$$

because the right-hand sides of assumptions A″, B″, and D′ are 0, 0, and I, respectively ($\int L\, dz = 1$, and $\int \hat\theta\, L\, dz = \theta$ by unbiasedness). Written thus, the three assumptions are all of the form

$$\int \frac{\partial f(z, \theta)}{\partial \theta}\, dz = \frac{\partial}{\partial \theta} \int f(z, \theta)\, dz; \tag{1.3.14}$$

in other words, the operations of differentiation and integration can be interchanged. In the next theorem we shall give sufficient conditions for (1.3.14) to hold in the case where $\theta$ is a scalar. The case for a vector $\theta$ can be treated in essentially the same manner, although the notation gets more complex.
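To make (1.3.14) concrete, take $f(z, \theta) = \phi(z - \theta)$, the $N(\theta, 1)$ density, with $n = 1$. Then

$$\int \frac{\partial f(z, \theta)}{\partial \theta}\, dz = \int (z - \theta)\,\phi(z - \theta)\, dz = 0 = \frac{\partial}{\partial \theta} \int f(z, \theta)\, dz,$$

since $\int f(z, \theta)\, dz = 1$ for every $\theta$; this is exactly assumption A′ for this density.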

Theorem 1.3.2. If (i) $\partial f(z, \theta)/\partial \theta$ is continuous in $\theta \in \Theta$ and $z$, where $\Theta$ is an open set, (ii) $\int f(z, \theta)\, dz$ exists, and (iii) $\int |\partial f(z, \theta)/\partial \theta|\, dz < M < \infty$ for all $\theta \in \Theta$, then (1.3.14) holds.

Proof. We have, using assumptions (i) and (ii),

$$\frac{1}{h}\left[\int f(z, \theta + h)\, dz - \int f(z, \theta)\, dz\right] = \int \frac{f(z, \theta + h) - f(z, \theta)}{h}\, dz = \int \frac{\partial f(z, \theta^*)}{\partial \theta}\, dz,$$

where $\theta^*$ is between $\theta$ and $\theta + h$. Next, we can write the last integral as $\int = \int_A + \int_{\bar A}$, where $A$ is a sufficiently large compact set in the domain of $z$ and $\bar A$ is its complement. But as $h \to 0$ we can make $\int_A \partial f(z, \theta^*)/\partial \theta\, dz$ arbitrarily close to $\int_A \partial f(z, \theta)/\partial \theta\, dz$ because of (i), and $\int_{\bar A}$ sufficiently small because of (iii).
