# Comparing Biased and Unbiased Estimators

Suppose we are given two estimators $\hat\beta_1$ and $\hat\beta_2$ of $\beta$, where the first is unbiased with a large variance and the second is biased but with a small variance. Which of these two estimators is preferable? Unbiasedness means that if we repeat the sampling procedure many times, we expect $\hat\beta_1$ to be correct on average, whereas $\hat\beta_2$ would on average differ from $\beta$. In real life, however, we observe only one sample. With a large variance for $\hat\beta_1$, there is a great likelihood that the sample drawn results in a $\hat\beta_1$ far away from $\beta$. With a small variance for $\hat\beta_2$, there is a better chance of getting a $\hat\beta_2$ close to $\beta$. If our loss function is $L(\hat\beta, \beta) = (\hat\beta - \beta)^2$, then our risk is

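$$
\begin{aligned}
R(\hat\beta, \beta) &= E[L(\hat\beta, \beta)] = E(\hat\beta - \beta)^2 = \text{MSE}(\hat\beta) \\
&= E[\hat\beta - E(\hat\beta) + E(\hat\beta) - \beta]^2 = \text{var}(\hat\beta) + (\text{Bias}(\hat\beta))^2.
\end{aligned}
$$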
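The trade-off can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration and not taken from the text: the parameter is the mean of a normal population, the unbiased estimator is the sample mean, the biased estimator is the sample mean shrunk toward zero by a hypothetical factor `shrink`, and the function names are made up.

```python
import random
import statistics


def mse_decomposition(estimates, beta):
    """Return (mse, variance, bias) of simulated estimates around the true beta."""
    mean_est = statistics.fmean(estimates)
    bias = mean_est - beta
    # Population variance of the simulated estimates around their own mean.
    variance = statistics.pvariance(estimates, mu=mean_est)
    mse = statistics.fmean([(b - beta) ** 2 for b in estimates])
    return mse, variance, bias


def simulate(beta=1.0, sigma=5.0, n=10, shrink=0.5, reps=20000, seed=0):
    """Draw `reps` samples of size n and compute both estimators on each."""
    rng = random.Random(seed)
    unbiased, biased = [], []
    for _ in range(reps):
        sample = [rng.gauss(beta, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        unbiased.append(xbar)         # unbiased, variance sigma^2 / n
        biased.append(shrink * xbar)  # biased, variance shrink^2 * sigma^2 / n
    return mse_decomposition(unbiased, beta), mse_decomposition(biased, beta)


if __name__ == "__main__":
    results = dict(zip(("unbiased", "biased"), simulate()))
    for name, (mse, var, bias) in results.items():
        print(f"{name}: MSE={mse:.3f}  var={var:.3f}  bias={bias:.3f}")
```

With these illustrative settings the theoretical MSE of the sample mean is $\sigma^2/n = 2.5$, while that of the shrunk estimator is $0.25 \times 2.5 + 0.5^2 = 0.875$, so the biased estimator has the smaller risk under squared-error loss. Other parameter choices can reverse the ranking.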

For the simple regression without a constant $Y_i = \beta X_i + u_i$, with $u_i \sim \text{IID}(0, \sigma^2)$:

(a) Derive the OLS estimator of $\beta$ and find its variance.

(b) What numerical properties of the OLS estimators described in problem 1 still hold for this model?

(c) Derive the maximum likelihood estimator of $\beta$ and $\sigma^2$ under the assumption $u_i \sim \text{IIN}(0, \sigma^2)$.

(d) Assume $\sigma^2$ is known. Derive the Wald, LM and LR tests for $H_0$; $\beta = 1$ versus $H_1$; $\beta \neq 1$.
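Answers to parts (a) and (d) can be checked numerically with a sketch like the one below. It assumes the standard closed forms for regression through the origin, $\hat\beta = \sum X_i Y_i / \sum X_i^2$ with $\text{var}(\hat\beta) = \sigma^2 / \sum X_i^2$, and builds the three test statistics from the normal log-likelihood with known $\sigma^2$. The data, the value of `sigma2`, and all function names are hypothetical.

```python
def ols_no_constant(x, y):
    """OLS slope for Y_i = beta * X_i + u_i (no intercept), plus sum of X_i^2."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx, sxx


def trinity_tests(x, y, sigma2, beta0=1.0):
    """Wald, LM and LR statistics for H0: beta = beta0 with sigma2 known."""
    beta_hat, sxx = ols_no_constant(x, y)

    # Wald: squared distance from beta0, scaled by var(beta_hat) = sigma2 / sxx.
    wald = (beta_hat - beta0) ** 2 * sxx / sigma2

    # LM: squared score at beta0 divided by the information sxx / sigma2.
    score = sum(xi * (yi - beta0 * xi) for xi, yi in zip(x, y)) / sigma2
    lm = score ** 2 / (sxx / sigma2)

    # LR: twice the log-likelihood difference, i.e. (RRSS - URSS) / sigma2.
    rrss = sum((yi - beta0 * xi) ** 2 for xi, yi in zip(x, y))
    urss = sum((yi - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    lr = (rrss - urss) / sigma2
    return wald, lm, lr


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up regressor values
    y = [1.2, 1.9, 3.2, 3.8, 5.1]  # made-up responses
    print(ols_no_constant(x, y)[0])
    print(trinity_tests(x, y, sigma2=0.04))
```

In this model with $\sigma^2$ known, the log-likelihood is exactly quadratic in $\beta$, so the three statistics coincide up to floating-point error.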

The Hildreth-Lu (1960) Search Procedure: $\rho$ is between $-1$ and $1$. Therefore, this procedure searches over this range, i.e., using values of $\rho$, say, between $-0.9$ and $0.9$ in intervals of $0.1$. For each $\rho$, one computes the regression in (5.35) and reports the residual sum of squares corresponding to that $\rho$. The minimum residual sum of squares gives us our choice of $\rho$, and the corresponding regression gives us the estimates of $\alpha$, $\beta$ and $\sigma^2$. One can refine this procedure around the best $\rho$ found in the first stage of the search. For example, suppose that $\rho = 0.6$ gave
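The first-stage grid search can be sketched as follows. Equation (5.35) is not reproduced in this section, so the sketch assumes it is the usual quasi-differenced regression of $Y_t - \rho Y_{t-1}$ on a constant and $X_t - \rho X_{t-1}$, whose intercept equals $\alpha(1-\rho)$. The function names, the simulated data, and the single-regressor setup are illustrative assumptions.

```python
import random


def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, rss)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, rss


def hildreth_lu(x, y, grid=None):
    """Pick the rho on the grid that minimizes the RSS of the transformed regression."""
    if grid is None:
        # -0.9, -0.8, ..., 0.9 as described in the text.
        grid = [round(-0.9 + 0.1 * j, 1) for j in range(19)]
    best = None
    for rho in grid:
        # Quasi-difference the data; the first observation is dropped.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        intercept, slope, rss = simple_ols(xs, ys)
        if best is None or rss < best["rss"]:
            best = {
                "rho": rho,
                "alpha": intercept / (1.0 - rho),  # intercept estimates alpha * (1 - rho)
                "beta": slope,
                "sigma2": rss / (len(xs) - 2),
                "rss": rss,
            }
    return best


if __name__ == "__main__":
    # Simulated data with AR(1) disturbances (made-up parameter values).
    rng = random.Random(0)
    x = [rng.uniform(0, 10) for _ in range(200)]
    u, y = 0.0, []
    for xt in x:
        u = 0.6 * u + rng.gauss(0, 1)
        y.append(1.0 + 2.0 * xt + u)
    print(hildreth_lu(x, y))
```

A second-stage refinement would call `hildreth_lu` again with a finer `grid` centered on the first-stage choice of $\rho$.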

OLS Variance Is Biased Under Heteroskedasticity. For the simple linear regression with heteroskedasticity of the form $E(u_i^2) = \sigma_i^2 = b x_i^2$ where $b > 0$, show that $E(s^2 / \sum_{i=1}^{n} x_i^2)$ understates the variance of $\hat\beta_{OLS}$ which is

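Using the U.S. gasoline data in Chapter 4, problem 15, given in Table 4.2, estimate the following model with a six year lag on prices:

$$
\log\left(\frac{QMG}{CAR}\right)_t = \gamma_1 + \gamma_2 \log\left(\frac{RGNP}{POP}\right)_t + \gamma_3 \log\left(\frac{CAR}{POP}\right)_t + \gamma_4 \sum_{i=0}^{6} w_i \log\left(\frac{PMG}{PGNP}\right)_{t-i}.
$$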

From the first $m$ observations compute the vector of recursive residuals $w_1$ using the base constructed in step 1. Also, compute the vector of recursive residuals $w_2$ from the last $m$ observations. The largest value $m$ can take is $(T - k)/2$.

 Plosser, Schwert and White (1982) (PSW) Differencing Test

The differencing test is a general test for misspecification (like Hausman’s (1978) test, which will be introduced in the simultaneous equation chapter) but for time-series data only. This test compares OLS and First Difference (FD) estimates of $\beta$. Let the differenced model be

$$
\text{URSS}/(T - k) = [(y - x)'(y - x) - (y - x)'X_2(X_2'X_2)^{-1}X_2'(y - x)]/(T - k).
$$

The denominator is the OLS estimate of $\sigma^2$ from (8.76), which tends to $\sigma_0^2$ as $T \to \infty$. Hence ($rF$-statistic $\to \chi_r^2$). In small samples, use the F-statistic.