# The F Test

In this section we shall consider the test of the null hypothesis $Q'\beta = c$ against the alternative hypothesis $Q'\beta \neq c$ when it involves more than one constraint (that is, $q > 1$).

We shall derive the F test as a simple transformation of the likelihood ratio test. Suppose that the likelihood function is in general given by $L(x, \theta)$, where $x$ is a sample and $\theta$ is a vector of parameters. Let the null hypothesis be $H_0\colon \theta \in S_0$, where $S_0$ is a subset of the parameter space $\Theta$, and let the alternative hypothesis be $H_1\colon \theta \in S_1$, where $S_1$ is another subset of $\Theta$. Then the likelihood ratio test is defined by the following procedure:

$$\text{Reject } H_0 \text{ if } \lambda \equiv \frac{\max_{\theta \in S_0} L(x, \theta)}{\max_{\theta \in S_0 \cup S_1} L(x, \theta)} < g, \tag{1.5.5}$$

where $g$ is chosen so as to satisfy $P(\lambda < g) = \alpha$ for a given significance level $\alpha$. The likelihood ratio test may be justified on several grounds: (1) It is intuitively appealing because of its connection with the Neyman-Pearson lemma. (2) It is known to lead to good tests in many cases. (3) It has asymptotically optimal properties, as shown by Wald (1943).

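Now let us obtain the likelihood ratio test of $Q'\beta = c$ against $Q'\beta \neq c$ in Model 1 with normality. When we use the results of Section 1.4.1, the numerator likelihood becomes

$$\begin{aligned} \max_{Q'\beta = c} L &= (2\pi\bar\sigma^2)^{-T/2} \exp\left[-0.5\,\bar\sigma^{-2}(y - X\bar\beta)'(y - X\bar\beta)\right] \\ &= (2\pi\bar\sigma^2)^{-T/2} e^{-T/2}, \end{aligned} \tag{1.5.6}$$

where the second equality holds because $\bar\sigma^2 = T^{-1}(y - X\bar\beta)'(y - X\bar\beta)$.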

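Because in the present case $S_0 \cup S_1$ is the whole parameter space of $\beta$, the maximization in the denominator of (1.5.5) is carried out without constraint. Therefore the denominator likelihood is given by

$$\begin{aligned} \max L &= (2\pi\hat\sigma^2)^{-T/2} \exp\left[-0.5\,\hat\sigma^{-2}(y - X\hat\beta)'(y - X\hat\beta)\right] \\ &= (2\pi\hat\sigma^2)^{-T/2} e^{-T/2}. \end{aligned} \tag{1.5.7}$$

Hence, the likelihood ratio test of $Q'\beta = c$ is

$$\text{Reject } Q'\beta = c \text{ if } \lambda = \left(\frac{\bar\sigma^2}{\hat\sigma^2}\right)^{-T/2} = \left[\frac{S(\bar\beta)}{S(\hat\beta)}\right]^{-T/2} < g, \tag{1.5.8}$$

where $g$ is chosen as in (1.5.5).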
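The reduction of the likelihood ratio to $[S(\bar\beta)/S(\hat\beta)]^{-T/2}$ can be checked numerically. The following is a minimal sketch on simulated data, not part of the original derivation: the design, the seed, the particular hypothesis, and the helper `gaussian_loglik` are all illustrative assumptions, and the constrained estimator is computed with the standard closed form $\bar\beta = \hat\beta - (X'X)^{-1}Q[Q'(X'X)^{-1}Q]^{-1}(Q'\hat\beta - c)$.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 50, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=T)

# Hypothetical null with q = 2 constraints: beta_2 = 0.5 and beta_3 = -0.5.
Q = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # K x q, so Q' is q x K
c = np.array([0.5, -0.5])

def gaussian_loglik(beta, sigma2):
    """Log likelihood of the normal linear model at (beta, sigma2)."""
    resid = y - X @ beta
    return -0.5 * T * np.log(2 * np.pi * sigma2) - 0.5 * resid @ resid / sigma2

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                      # unconstrained estimator
A = Q.T @ XtX_inv @ Q
beta_bar = beta_hat - XtX_inv @ Q @ np.linalg.solve(A, Q.T @ beta_hat - c)

S_hat = np.sum((y - X @ beta_hat) ** 2)           # S(beta_hat)
S_bar = np.sum((y - X @ beta_bar) ** 2)           # S(beta_bar)

# Ratio of the two maximized likelihoods, using sigma^2 = S / T in each case.
log_lambda = gaussian_loglik(beta_bar, S_bar / T) - gaussian_loglik(beta_hat, S_hat / T)
lam_direct = np.exp(log_lambda)
lam_closed = (S_bar / S_hat) ** (-T / 2)
print(lam_direct, lam_closed)
```

The two printed values should agree up to rounding, and both lie between 0 and 1 because the constrained sum of squares can never be smaller than the unconstrained one.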

We actually use the equivalent test

$$\text{Reject } Q'\beta = c \text{ if } \eta \equiv \frac{T-K}{q} \cdot \frac{S(\bar\beta) - S(\hat\beta)}{S(\hat\beta)} > h, \tag{1.5.9}$$

where $h$ is appropriately chosen. Clearly, (1.5.8) and (1.5.9) can be made equivalent by putting $h = [(T-K)/q](g^{-2/T} - 1)$, because $\lambda < g$ holds exactly when $S(\bar\beta)/S(\hat\beta) > g^{-2/T}$. The test (1.5.9) is called the F test because under the null hypothesis $\eta$ is distributed as $F(q, T-K)$, the F distribution with $q$ and $T-K$ degrees of freedom, as we shall show next. From (1.4.3) and (1.4.5) we have

$$S(\bar\beta) - S(\hat\beta) = (Q'\hat\beta - c)'[Q'(X'X)^{-1}Q]^{-1}(Q'\hat\beta - c). \tag{1.5.10}$$
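Identity (1.5.10) means the F statistic can be computed either from the two sums of squared residuals or from the quadratic form in $Q'\hat\beta - c$. The sketch below checks this on simulated data; the data-generating process, the seed, and the choice of hypothesis are assumptions made only for illustration, and the constrained estimator uses the standard closed form $\bar\beta = \hat\beta - (X'X)^{-1}Q[Q'(X'X)^{-1}Q]^{-1}(Q'\hat\beta - c)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
T, K, q = 60, 4, 2
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=T)

# Hypothetical null: the last two coefficients are zero.
Q = np.zeros((K, q))
Q[2, 0] = 1.0
Q[3, 1] = 1.0
c = np.zeros(q)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
d = Q.T @ beta_hat - c                       # Q' beta_hat - c
A = Q.T @ XtX_inv @ Q                        # Q'(X'X)^{-1} Q
beta_bar = beta_hat - XtX_inv @ Q @ np.linalg.solve(A, d)

S_hat = np.sum((y - X @ beta_hat) ** 2)      # unconstrained sum of squares
S_bar = np.sum((y - X @ beta_bar) ** 2)      # constrained sum of squares
quad_form = d @ np.linalg.solve(A, d)        # right-hand side of (1.5.10)

eta_ss = (T - K) / q * (S_bar - S_hat) / S_hat   # form (1.5.9)
eta_qf = (T - K) / q * quad_form / S_hat         # quadratic-form version
p_value = stats.f.sf(eta_ss, q, T - K)           # upper tail of F(q, T-K)
print(eta_ss, eta_qf, p_value)
```

The quadratic-form version needs only the unconstrained fit, which is why it is the form usually implemented in practice.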

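But since (1.5.1) holds even if $Q$ is a matrix, we have by Theorem 1 of Appendix 2

$$\frac{S(\bar\beta) - S(\hat\beta)}{\sigma^2} \sim \chi^2_q. \tag{1.5.11}$$

Because this chi-square variable is independent of the chi-square variable given in (1.5.3) by an application of Theorem 5 of Appendix 2 (or by the independence of $\hat u$ and $\hat\beta$), we have by Theorem 4 of Appendix 2

$$\eta = \frac{T-K}{q} \cdot \frac{(Q'\hat\beta - c)'[Q'(X'X)^{-1}Q]^{-1}(Q'\hat\beta - c)}{\hat u'\hat u} \sim F(q, T-K). \tag{1.5.12}$$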

We shall now give an alternative motivation for the test statistic (1.5.12). For this purpose, consider the general problem of testing the null hypothesis $H_0\colon \theta = \theta_0$, where $\theta = (\theta_1, \theta_2)'$ is a two-dimensional vector of parameters. Suppose we wish to construct a test statistic on the basis of an estimator $\hat\theta$ that is distributed as $N(\theta_0, V)$ under the null hypothesis, where we assume $V$ to be a known diagonal matrix for simplicity. Consider the following two tests of $H_0$:

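$$\text{Reject } H_0 \text{ if } (\hat\theta - \theta_0)'(\hat\theta - \theta_0) > c \tag{1.5.13}$$

and

$$\text{Reject } H_0 \text{ if } (\hat\theta - \theta_0)'V^{-1}(\hat\theta - \theta_0) > d, \tag{1.5.14}$$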

where $c$ and $d$ are determined so as to make the probability of Type I error equal to the same given value for each test. That (1.5.14) is preferred to (1.5.13) can be argued as follows: Let $\sigma_1^2$ and $\sigma_2^2$ be the variances of $\hat\theta_1$ and $\hat\theta_2$, respectively, and suppose $\sigma_1^2 < \sigma_2^2$. Then a deviation of $\hat\theta_1$ from $\theta_{10}$ provides a greater reason for rejecting $H_0$ than the same size deviation of $\hat\theta_2$ from $\theta_{20}$ because the latter could be caused by variability of $\hat\theta_2$ rather than by falseness of $H_0$. The test statistic (1.5.14) precisely incorporates this consideration.
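A small numerical illustration of this argument compares the unweighted statistic $(\hat\theta - \theta_0)'(\hat\theta - \theta_0)$ with the weighted statistic $(\hat\theta - \theta_0)'V^{-1}(\hat\theta - \theta_0)$. The variances and deviations below are hypothetical values chosen only to make the contrast visible, and the function names are not from the text.

```python
import numpy as np

theta0 = np.array([0.0, 0.0])
# Hypothetical known variances: sigma_1^2 = 1 is much smaller than sigma_2^2 = 100.
V = np.diag([1.0, 100.0])

def unweighted_stat(theta_hat):
    """Sum of squared deviations, ignoring the variances."""
    d = theta_hat - theta0
    return d @ d

def weighted_stat(theta_hat):
    """Quadratic form that scales each deviation by its variance."""
    d = theta_hat - theta0
    return d @ np.linalg.solve(V, d)

dev_in_first = np.array([3.0, 0.0])    # deviation of size 3 in the precise component
dev_in_second = np.array([0.0, 3.0])   # same size deviation in the noisy component

print(unweighted_stat(dev_in_first), unweighted_stat(dev_in_second))
print(weighted_stat(dev_in_first), weighted_stat(dev_in_second))
```

The unweighted statistic cannot tell the two cases apart, whereas the weighted statistic is about 9 for the precisely estimated component and about 0.09 for the noisy one, so only the first case counts as strong evidence against $H_0$.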

The test (1.5.14) is commonly used for the general case of a $q$-vector $\theta$ and a general variance-covariance matrix $V$. Another attractive feature of the test is that

$$(\hat\theta - \theta_0)'V^{-1}(\hat\theta - \theta_0) \sim \chi^2_q. \tag{1.5.15}$$

Applying this test to the test of the hypothesis (1.4.1), we readily see that the test should be based on

$$\frac{(Q'\hat\beta - c)'[Q'(X'X)^{-1}Q]^{-1}(Q'\hat\beta - c)}{\sigma^2} \sim \chi^2_q. \tag{1.5.16}$$

This test can be exactly valid only when $\sigma^2$ is known and hence is analogous to the standard normal test based on (1.5.2). Thus the F test statistic (1.5.12) may be regarded as a natural adaptation of (1.5.16) to the situation where $\sigma^2$ must be estimated. (For a rigorous mathematical discussion of the optimal properties of the F test, the reader is referred to Scheffé, 1959, p. 46.)

By comparing (1.5.4) and (1.5.12) we immediately note that if $q = 1$ (and therefore $Q'$ is a row vector) the F statistic is the square of the t statistic. This fact clearly indicates that if $q = 1$ the t test should be used rather than the F test because a one-tail test is possible only with the t test.
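The relation between the two statistics when $q = 1$ can be confirmed with a short computation. This is a sketch on simulated data; the design, the seed, and the tested coefficient are assumptions, and the t ratio is written in its usual form, $(Q'\hat\beta - c)$ divided by its estimated standard error, which is assumed to match (1.5.4).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
T, K = 40, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, K - 1))])
y = X @ np.array([1.0, 0.3, 0.0]) + rng.normal(size=T)

# Hypothetical single constraint (q = 1): the third coefficient equals zero.
Q = np.array([0.0, 0.0, 1.0])
c = 0.0

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
rss = resid @ resid
sigma2_unbiased = rss / (T - K)              # unbiased estimator of sigma^2
w = Q @ XtX_inv @ Q                          # scalar Q'(X'X)^{-1}Q

t_stat = (Q @ beta_hat - c) / np.sqrt(sigma2_unbiased * w)
F_stat = (T - K) * (Q @ beta_hat - c) ** 2 / w / rss   # (1.5.12) with q = 1

p_F = stats.f.sf(F_stat, 1, T - K)                     # F test p-value
p_t_two_sided = 2 * stats.t.sf(abs(t_stat), T - K)     # equivalent two-tail t test
p_t_one_sided = stats.t.sf(t_stat, T - K)              # one-tail test against beta_3 > 0
print(t_stat ** 2, F_stat, p_F, p_t_two_sided, p_t_one_sided)
```

The F test reproduces the two-tail t test exactly, but the sign of the deviation is lost when the statistic is squared, so only the t statistic supports the one-tail p-value on the last line.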

As stated earlier, (1.5.1) holds asymptotically even if $u$ is not normal. Because $\tilde\sigma^2 = \hat u'\hat u/(T-K)$ converges to $\sigma^2$ in probability, the linear hypothesis can be tested without assuming normality of $u$ by using the fact that $q\eta$ is asymptotically distributed as $\chi^2_q$. Some people prefer to use the F test in this situation. The remark in note 6 applies to this practice.
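To see how the two practices differ in finite samples, one can compare the chi-square critical value for $q\eta$ with $q$ times the F critical value for $\eta$. The sketch below uses SciPy's distribution functions; the choices $q = 3$, a 5 percent level, and the listed degrees of freedom are illustrative assumptions.

```python
from scipy import stats

q, alpha = 3, 0.05
chi2_crit = stats.chi2.ppf(1 - alpha, q)     # asymptotic critical value for q * eta

dofs = (10, 30, 100, 1000, 100000)           # hypothetical values of T - K
scaled_f_crits = [q * stats.f.ppf(1 - alpha, q, dof) for dof in dofs]

for dof, crit in zip(dofs, scaled_f_crits):
    print(f"T-K = {dof:>6}: q * F critical value = {crit:.3f}, chi-square = {chi2_crit:.3f}")
```

In this example the scaled F critical value is larger than the chi-square one when $T - K$ is small and approaches it as $T - K$ grows, so using the F distribution amounts to a more conservative test in small samples.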

The F statistic $\eta$ given in (1.5.12) takes on a variety of forms as we insert a variety of specific values into $Q$ and $c$. As an example, consider the case where $\beta$ is partitioned as $\beta' = (\beta_1', \beta_2')$, where $\beta_1$ is a $K_1$-vector and $\beta_2$ is a $K_2$-vector such that $K_1 + K_2 = K$, and the null hypothesis specifies $\beta_2 = \bar\beta_2$ and leaves $\beta_1$ unspecified. This hypothesis can be written in the form $Q'\beta = c$ by putting $Q' = (0, I)$, where $0$ is the $K_2 \times K_1$ matrix of zeros and $I$ is the identity matrix of size $K_2$, and by putting $c = \bar\beta_2$. Inserting these values into (1.5.12) yields

$$\eta = \frac{T-K}{K_2} \cdot \frac{(\hat\beta_2 - \bar\beta_2)'[(0, I)(X'X)^{-1}(0, I)']^{-1}(\hat\beta_2 - \bar\beta_2)}{\hat u'\hat u} \sim F(K_2, T-K). \tag{1.5.17}$$

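We can write (1.5.17) in a more suggestive form. Partition $X$ as $X = (X_1, X_2)$ conformably with the partition of $\beta$ and define $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$. Then by Theorem 13 of Appendix 1 we have

$$[(0, I)(X'X)^{-1}(0, I)']^{-1} = X_2'M_1X_2. \tag{1.5.18}$$

Therefore, using (1.5.18), we can rewrite (1.5.17) as

$$\eta = \frac{T-K}{K_2} \cdot \frac{(\hat\beta_2 - \bar\beta_2)'X_2'M_1X_2(\hat\beta_2 - \bar\beta_2)}{\hat u'\hat u} \sim F(K_2, T-K). \tag{1.5.19}$$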

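Of particular interest is the further special case of (1.5.19), where $K_1 = 1$, so that $\beta_1$ is a scalar coefficient on the first column of $X$, which we assume to be the vector of ones (denoted by $\mathbf{1}$), and where $\bar\beta_2 = 0$. Then $M_1$ becomes $L = I - T^{-1}\mathbf{1}\mathbf{1}'$. Therefore, using (1.2.13), we have

$$\hat\beta_2 = (X_2'LX_2)^{-1}X_2'Ly, \tag{1.5.20}$$

so (1.5.19) can now be written as

$$\eta = \frac{T-K}{K-1} \cdot \frac{y'LX_2(X_2'LX_2)^{-1}X_2'Ly}{\hat u'\hat u} \sim F(K-1, T-K). \tag{1.5.21}$$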

Using the definition of $R^2$ given in (1.2.9), we can rewrite (1.5.21) as

$$\eta = \frac{T-K}{K-1} \cdot \frac{R^2}{1-R^2} \sim F(K-1, T-K), \tag{1.5.22}$$

since $\hat u'\hat u = y'Ly - y'LX_2(X_2'LX_2)^{-1}X_2'Ly$ because of Theorem 15 of Appendix 1. The value of statistic (1.5.22) usually appears in computer printouts and is commonly interpreted as the test statistic for the hypothesis that the population counterpart of $R^2$ is equal to 0.
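The chain from the partitioned form to the $R^2$ form can be verified numerically. The sketch below uses simulated data with an intercept and three regressors; the design and seed are assumptions, and $R^2$ is taken to be the usual centered measure $1 - \hat u'\hat u / y'Ly$, which is assumed to be the definition in (1.2.9).

```python
import numpy as np

rng = np.random.default_rng(3)
T, K = 80, 4
X2 = rng.normal(size=(T, K - 1))             # regressors other than the constant
X = np.column_stack([np.ones(T), X2])        # first column is the vector of ones
y = X @ np.array([0.5, 1.0, -1.0, 0.0]) + rng.normal(size=T)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
rss = resid @ resid                          # u_hat' u_hat

ones = np.ones(T)
L = np.eye(T) - np.outer(ones, ones) / T     # L = I - T^{-1} 1 1'

# R^2 form of the statistic.
R2 = 1.0 - rss / (y @ L @ y)
eta_r2 = (T - K) / (K - 1) * R2 / (1.0 - R2)

# Form with the explained sum of squares y'LX2 (X2'LX2)^{-1} X2'Ly.
LX2 = L @ X2
explained = (y @ LX2) @ np.linalg.solve(X2.T @ LX2, LX2.T @ y)
eta_ess = (T - K) / (K - 1) * explained / rss

# Check of the matrix identity [(0, I)(X'X)^{-1}(0, I)']^{-1} = X2' M1 X2 with M1 = L.
Qt = np.hstack([np.zeros((K - 1, 1)), np.eye(K - 1)])   # Q' = (0, I)
lhs = np.linalg.inv(Qt @ np.linalg.inv(X.T @ X) @ Qt.T)
rhs = X2.T @ L @ X2
print(eta_r2, eta_ess, np.allclose(lhs, rhs))
```

Both routes give the same value, which is the overall F statistic commonly reported alongside $R^2$ in regression output.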