Let X be distributed $N_n(0, I_n)$; that is, X is n-variate standard normally distributed. Consider the linear transformations Y = BX, where B is a k × n matrix of constants, and Z = CX, where C is an m × n matrix of constants. It follows from Theorem 5.4 that
$$\begin{pmatrix} Y \\ Z \end{pmatrix} \sim N_{k+m}\left(0, \begin{pmatrix} BB^T & BC^T \\ CB^T & CC^T \end{pmatrix}\right).$$
Then Y and Z are uncorrelated and therefore independent if and only if $CB^T = O$. More generally we have
Theorem 5.6: Let X be distributed $N_n(0, I_n)$, and consider the linear transformations Y = b + BX, where b is a k × 1 vector and B a k × n matrix of constants, and Z = c + CX, where c is an m × 1 vector and C an m × n matrix of constants. Then Y and Z are independent if and only if $BC^T = O$.
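As a rough numerical illustration of Theorem 5.6, the sketch below checks that $BC^T = O$ for one hand-picked pair of matrices and then estimates the cross-covariance of Y and Z by simulation. The matrices, shift vectors, number of draws, and helper names are all assumptions chosen for this example. A simulation of this kind can only show that the sample cross-covariance is close to zero; the step from uncorrelated to independent rests on joint normality, as in the theorem.

```python
import random


def matmul_transpose(B, C):
    """Return B C^T for matrices stored as lists of rows."""
    return [[sum(b * c for b, c in zip(brow, crow)) for crow in C] for brow in B]


def sample_cross_cov(B, C, b, c, draws=20000, seed=0):
    """Estimate Cov(Y, Z) for Y = b + BX, Z = c + CX with X ~ N_n(0, I_n)."""
    rng = random.Random(seed)
    n, k, m = len(B[0]), len(B), len(C)
    ys, zs = [], []
    for _ in range(draws):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys.append([bi + sum(w * xj for w, xj in zip(row, x)) for bi, row in zip(b, B)])
        zs.append([ci + sum(w * xj for w, xj in zip(row, x)) for ci, row in zip(c, C)])
    ybar = [sum(y[i] for y in ys) / draws for i in range(k)]
    zbar = [sum(z[j] for z in zs) / draws for j in range(m)]
    return [
        [sum((y[i] - ybar[i]) * (z[j] - zbar[j]) for y, z in zip(ys, zs)) / draws
         for j in range(m)]
        for i in range(k)
    ]


# Hypothetical example with n = 3, k = 1, m = 2; the rows of B are orthogonal to the rows of C.
B = [[1.0, 1.0, 0.0]]
C = [[1.0, -1.0, 0.0], [0.0, 0.0, 2.0]]

bc_t = matmul_transpose(B, C)                         # exact product B C^T
cov = sample_cross_cov(B, C, b=[1.0], c=[0.0, -1.0])  # simulated Cov(Y, Z)

print("B C^T =", bc_t)
print("sample Cov(Y, Z) =", cov)
```

The shift vectors b and c do not affect the covariance, which is why the condition in the theorem involves only B and C.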
This result can be used to set forth conditions for independence of linear and quadratic transformations of standard normal random vectors.
The following conditions guarantee that the first- and second-order conditions for a maximum hold.
Assumption 8.1: The parameter space $\Theta$ is convex and $\theta_0$ is an interior point of $\Theta$. The likelihood function $L_n(\theta)$ is, with probability 1, twice continuously differentiable in an open neighborhood $\Theta_0$ of $\theta_0$, and, for $i_1, i_2 = 1, 2, 3, \ldots, m$,
Theorem 8.2: Under Assumption 8.1,
Proof: For notational convenience I will prove this theorem for the univariate parameter case m = 1 only. Moreover, I will focus on the case that $Z = (Z_1^T, \ldots, Z_n^T)^T$ is a random sample from an absolutely continuous distribution with density $f(z|\theta_0)$.
$$E[\ln(L_n(\theta))/n] = \frac{1}{n}\sum_{j=1}^{n} E[\ln(f(Z_j|\theta))] = \int \ln(f(z|\theta))\, f(z|\theta_0)\,dz,$$
It fol...
For the case $x \in \mathbb{R}$, $|x| < 1$, it follows from Taylor's theorem that ln(1 + x) has the series representation
$$\ln(1 + x) = \sum_{k=1}^{\infty} (-1)^{k-1} x^k / k. \qquad \text{(III.18)}$$
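The series (III.18) can be checked numerically with a short sketch like the one below. The function name `log1p_series`, the number of terms, and the test points are assumptions made for this illustration; the comparison value comes from Python's standard `math.log1p`.

```python
import math


def log1p_series(x, terms=200):
    """Partial sum of sum_{k>=1} (-1)^(k-1) x^k / k, intended for |x| < 1."""
    if not abs(x) < 1:
        raise ValueError("the series is used here only for |x| < 1")
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, terms + 1))


for x in (-0.5, 0.1, 0.9):
    approx = log1p_series(x)
    exact = math.log1p(x)  # ln(1 + x) from the standard library
    print(f"x = {x:5.2f}  series = {approx:.12f}  ln(1+x) = {exact:.12f}")
```

Convergence slows as |x| approaches 1, so more terms are needed near the boundary of the interval.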
I will now address the issue of whether this series representation carries over if we replace x by i·x because this will yield a useful approximation of exp(i·x), which plays a key role in proving central limit theorems for dependent random variables. See Chapter 7.
If (III.18) carries over we can write, for arbitrary integers m,
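$$\begin{aligned}
\log(1 + i \cdot x) &= \sum_{k=1}^{\infty} (-1)^{k-1} i^k x^k / k + i \cdot m\pi \\
&= \sum_{k=1}^{\infty} (-1)^{2k-1} i^{2k} x^{2k} / (2k) + \sum_{k=1}^{\infty} (-1)^{2k-1-1} i^{2k-1} x^{2k-1} / (2k-1) + i \cdot m\pi \\
&= \sum_{k=1}^{\infty} (-1)^{k-1} x^{2k} / (2k) + i \sum_{k=1}^{\infty} (-1)^{k-1} x^{2k-1} / (2k-1) + i \cdot m\pi. \qquad \text{(III.19)}
\end{aligned}$$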
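The last line of (III.19) can be compared numerically with a library implementation of the complex logarithm. The sketch below is only an illustration: the function name, the number of terms, and the test points are assumptions, and the comparison uses Python's `cmath.log`, which returns the principal value, so it corresponds to the choice m = 0.

```python
import cmath
import math


def complex_log_series(x, terms=400):
    """Partial sums of the real and imaginary series in (III.19), taking m = 0."""
    if not abs(x) < 1:
        raise ValueError("the series is used here only for |x| < 1")
    real = sum((-1) ** (k - 1) * x ** (2 * k) / (2 * k) for k in range(1, terms + 1))
    imag = sum((-1) ** (k - 1) * x ** (2 * k - 1) / (2 * k - 1) for k in range(1, terms + 1))
    return complex(real, imag)


for x in (-0.7, 0.2, 0.9):
    series = complex_log_series(x)
    principal = cmath.log(1 + 1j * x)  # principal branch of log(1 + i x)
    print(f"x = {x:5.2f}  series = {series:.10f}  cmath.log = {principal:.10f}")
    # The real part should match (1/2) ln(1 + x^2), the imaginary part arctan(x).
    print("   real:", 0.5 * math.log(1 + x * x), " imag:", math.atan(x))
```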
On the other hand, it follows from (III.17) that
log(1 + i·x) = (1/2) ln(1 + x²)...
Roll a die, and let the outcome be Y. Define the random variable X = 1 if Y is even, and X = 0 if Y is odd. The expected value of Y is E[Y] = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5. But what would the expected value of Y be if it is revealed that the outcome is even, that is, X = 1? This information implies that Y is 2, 4, or 6 with equal probabilities 1/3; hence, the expected value of Y, conditional on the event X = 1, is E[Y|X = 1] = (2 + 4 + 6)/3 = 4. Similarly, if it is revealed that X = 0, then Y is 1, 3, or 5 with equal probabilities 1/3; hence, the expected value of Y, conditional on the event X = 0, is E[Y|X = 0] = (1 + 3 + 5)/3 = 3. Both results can be captured in a single statement:
E [Y | X] = 3 + X. (3.1)
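The arithmetic behind (3.1) can be reproduced by enumerating the six outcomes. The sketch below is a minimal illustration; the function names are hypothetical, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # faces of a fair die


def indicator_even(y):
    """X = 1 if the outcome y is even, X = 0 if it is odd."""
    return 1 if y % 2 == 0 else 0


def expected_y():
    """Unconditional expectation E[Y] over the six equally likely faces."""
    return Fraction(sum(OUTCOMES), len(OUTCOMES))


def expected_y_given_x(x):
    """E[Y | X = x]: average of the equally likely faces consistent with X = x."""
    matching = [y for y in OUTCOMES if indicator_even(y) == x]
    return Fraction(sum(matching), len(matching))


print("E[Y] =", expected_y())
for x in (0, 1):
    print(f"E[Y | X = {x}] =", expected_y_given_x(x), " and 3 + X =", 3 + x)
```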
In this example the conditional probability of Y = y, given X...
Consistency of M-Estimators
Chapter 5 introduced the concept of a parameter estimator and listed two desirable properties of estimators: unbiasedness and efficiency. Another obviously desirable property is that the estimator gets closer to the parameter to be estimated as we use more data. This is the consistency property:
Definition 6.5: An estimator $\hat{\theta}$ of a parameter $\theta$, based on a sample of size n, is called consistent if $\operatorname{plim}_{n \to \infty} \hat{\theta} = \theta$.
Theorem 6.10 is an important tool in proving consistency of parameter estimators. A large class of estimators is obtained by maximizing or minimizing an objective function of the form $(1/n)\sum_{j=1}^{n} g(X_j, \theta)$, where g, $X_j$, and $\theta$ are the same as in Theorem 6.10...
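To make the idea of consistency concrete, the sketch below simulates one simple estimator of this form. It assumes, purely for illustration, that g(x, θ) = (x − θ)² and that the data are independent N(θ₀, 1) draws with θ₀ = 2; under that choice the minimizer of the objective is the sample mean. The names `objective`, `m_estimate`, and `THETA_0` are hypothetical and do not come from the text.

```python
import random

THETA_0 = 2.0  # assumed true parameter value for the simulation


def objective(sample, theta):
    """(1/n) * sum_j g(X_j, theta) with the assumed g(x, theta) = (x - theta)^2."""
    return sum((x - theta) ** 2 for x in sample) / len(sample)


def m_estimate(sample):
    """Minimizer of the objective above; for squared error it is the sample mean."""
    return sum(sample) / len(sample)


rng = random.Random(12345)
for n in (10, 1_000, 100_000):
    sample = [rng.gauss(THETA_0, 1.0) for _ in range(n)]
    theta_hat = m_estimate(sample)
    print(f"n = {n:>6}  estimate = {theta_hat:.4f}  |error| = {abs(theta_hat - THETA_0):.4f}")
```

A single simulated path is not a proof, and the error need not shrink at every step, but it typically becomes small for large n, which is the behavior the definition of consistency describes.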
The Gaussian elimination of a nonsquare matrix is similar to the square case except that in the final result the upper-triangular matrix now becomes an echelon matrix:
Definition I.10: An m × n matrix U is an echelon matrix if, for i = 2, …, m, the first nonzero element of row i is farther to the right than the first nonzero element of the previous row i − 1.
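Definition I.10 translates into a short check, sketched below. The function names are hypothetical, and the treatment of all-zero rows is an assumption not spelled out in the definition: here a zero row is accepted only if every row below it is also zero.

```python
def leading_index(row):
    """Column index of the first nonzero entry of a row, or None for a zero row."""
    for j, value in enumerate(row):
        if value != 0:
            return j
    return None


def is_echelon(U):
    """Check Definition I.10 for a matrix given as a list of rows.

    Assumption: all-zero rows are allowed only at the bottom of the matrix.
    """
    if not U:
        return True
    prev = leading_index(U[0])
    for row in U[1:]:
        cur = leading_index(row)
        if prev is None:
            if cur is not None:
                return False  # a nonzero row below a zero row
        elif cur is not None and cur <= prev:
            return False  # leading entry is not strictly farther right
        prev = cur
    return True


# Illustrative matrices chosen for this sketch, not taken from the text.
print(is_echelon([[2, 1, 0, 5], [0, 0, 3, 1], [0, 0, 0, 0]]))  # leading columns 0, 2, then a zero row
print(is_echelon([[1, 2], [3, 4]]))                            # both rows lead in column 0
```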
For example, the matrix
is an echelon matrix, and so is
Theorem I.8 can now be generalized to
Theorem I.11: For each matrix A there exists a permutation matrix P, possibly equal to the unit matrix I, a lower-triangular matrix L with diagonal elements all equal to 1, and an echelon matrix U such that PA = LU...