# Akaike Information Criterion

The Akaike information criterion in the context of the linear regression model was mentioned in Section 2.1.5. Here we shall consider it in a more general setting. Suppose we want to test the hypothesis (4.5.2) on the basis of the likelihood ratio test (4.5.3). It means that we choose $L(\theta)$ over $L(\alpha)$ if

$$\text{LRT} > d, \tag{4.5.28}$$

where $d$ is determined so that $P[\text{LRT} > d \mid L(\alpha)] = c$, a certain prescribed constant such as 5%. However, if we must choose one model out of many competing models $L(\alpha_1), L(\alpha_2), \ldots$, this classical testing procedure is at best time-consuming and at worst inconclusive. Akaike (1973) proposed a simple procedure that enables us to solve this kind of model-selection problem. Here we shall give only a brief account; for the full details the reader is referred to Akaike's original article or to Amemiya (1980a).

We can write a typical member of the many competing models $L(\alpha_1), L(\alpha_2), \ldots$ as $L(\alpha)$. Akaike proposed the loss function

$$W(\theta_0, \hat{\alpha}) = -\frac{2}{T} \int \left[\log \frac{L(\hat{\alpha})}{L(\theta_0)}\right] L(\theta_0)\, dx, \tag{4.5.29}$$

where $\hat{\alpha}$ is treated as a constant in the integration. Because $W(\theta_0, \hat{\alpha}) \geq W(\theta_0, \theta_0) = 0$, (4.5.29) can serve as a reasonable loss function that is to be minimized among the competing models. However, because $W$ depends on the unknown parameters $\theta_0$, a predictor of $W$ must be used instead. After rather complicated algebraic manipulation, Akaike arrived at the following simple predictor of $W$, which he called the Akaike Information Criterion (AIC):
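The nonnegativity of the loss can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not in the text: the sample is i.i.d. $N(\theta_0, 1)$ with $T$ observations, the competing model is $N(a, 1)$ with $a$ held fixed, and the loss is read as $-2/T$ times the expected log likelihood ratio under $\theta_0$, approximated here by Monte Carlo. The names `log_lik` and `loss_w` are hypothetical. In this special case the loss works out analytically to $(a - \theta_0)^2$, which is zero only at $a = \theta_0$.

```python
import numpy as np

def log_lik(x, mean):
    """Log likelihood of each row of x as an i.i.d. N(mean, 1) sample."""
    return -0.5 * np.sum((x - mean) ** 2 + np.log(2 * np.pi), axis=-1)

def loss_w(theta0, a, T=50, reps=20000, seed=0):
    """Monte Carlo estimate of -(2/T) E[log L(a) - log L(theta0)],
    with the expectation taken under the true parameter theta0
    and a treated as a constant (assumed N(., 1) model)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(theta0, 1.0, size=(reps, T))  # reps samples of size T
    return 2.0 / T * (log_lik(x, theta0) - log_lik(x, a)).mean()

w_true = loss_w(0.0, 0.0)   # model coincides with the truth
w_off = loss_w(0.0, 0.5)    # misspecified mean; analytic value is 0.25
```

The loss is exactly zero when the candidate coincides with the true parameter and grows with the squared distance otherwise, which is why minimizing it favors models close to the truth.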

$$\text{AIC} = -\frac{2}{T} \log L(\hat{\alpha}) + \frac{2p}{T}, \tag{4.5.30}$$

where $p$ is the dimension of the vector $\alpha$. The idea is to choose the model for which AIC is smallest. Akaike's explanation regarding why AIC (plus a certain omitted term that does not depend on $\alpha$ or $p$) may be regarded as a good predictor of $W$ is not entirely convincing. Nevertheless, many empirical researchers think that AIC serves as a satisfactory guideline for selecting a model.
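As an illustration of the selection rule, the following sketch computes the criterion in the form $-(2/T)\log L + 2p/T$ for a set of competing polynomial regressions and picks the smallest. Everything specific here is an assumption made for the example: the simulated data, the Gaussian error model, the candidate degrees, and the helper names `aic` and `gaussian_max_log_lik`. The parameter count $p$ includes the error variance alongside the regression coefficients.

```python
import numpy as np

def aic(log_lik, p, T):
    """AIC in the per-observation form -(2/T) log L + 2p/T."""
    return -2.0 / T * log_lik + 2.0 * p / T

def gaussian_max_log_lik(y, X):
    """Maximized log likelihood of y = X b + u, u ~ N(0, sigma^2 I).
    The ML estimates are least squares for b and RSS/T for sigma^2."""
    T = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / T
    return -0.5 * T * (np.log(2 * np.pi) + np.log(sigma2) + 1.0)

# Simulated data (assumed for illustration): the true model is quadratic.
rng = np.random.default_rng(0)
T = 100
x = np.linspace(-2, 2, T)
y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(0, 1, T)

aics = {}
for degree in range(5):
    X = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
    p = degree + 2  # degree + 1 coefficients plus the error variance
    aics[degree] = aic(gaussian_max_log_lik(y, X), p, T)

best_degree = min(aics, key=aics.get)  # model with the smallest AIC
```

Many software packages report $-2\log L + 2p$ instead; that is $T$ times the form used here, so the ranking of models fitted to the same sample is unchanged.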