Statistical Issues: A Practical Approach to Core Questions
The hypothesis testing problem is often presented as one of deciding between two hypotheses: the hypothesis of interest (the null $H_0$) and its complement (the alternative $H_A$). For the purpose of the exposition, consider a test problem pertaining to a parametric model $(\mathcal{Y}, P_\theta)$, i.e. the case where the data generating process (DGP) is determined up to a finite number of unknown real parameters $\theta \in \Theta$, where $\Theta$ refers to the parameter space (usually a vector space), $\mathcal{Y}$ is the sample space, and $P_\theta$ is the family of probability distributions on $\mathcal{Y}$. Furthermore, let $Y$ denote the observations, and $\Theta_0$ the subspace of $\Theta$ compatible with $H_0$.
A statistical test partitions the sample space into two subsets: a set consistent with $H_0$ (the acceptance region), and its complement, whose elements are viewed as inconsistent with $H_0$ (the rejection region, or the critical region). This may be translated into a decision rule based on a test statistic $S(Y)$: the rejection region is defined as the set of numerical values of the test statistic for which the null will be rejected.
Without loss of generality, we suppose the critical region has the form $S(Y) > c$. To obtain a test of level $\alpha$, $c$ must be chosen so that the probability of rejecting the null hypothesis, $P_\theta[S(Y) > c]$, when $H_0$ is true (the probability of a type I error) is not greater than $\alpha$, i.e. we must have:
$$\sup_{\theta \in \Theta_0} P_\theta[S(Y) > c] \le \alpha. \tag{23.1}$$
Further, the test has size $\alpha$ if and only if
$$\sup_{\theta \in \Theta_0} P_\theta[S(Y) > c] = \alpha. \tag{23.2}$$
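To make the size condition concrete, the rejection probability on the left-hand side of the condition can be estimated by simulation when the null is simple. The following is a minimal sketch under assumptions that are not taken from the text: the observations are i.i.d. N(θ, 1), the null is θ = 0, and the statistic is S(Y) = √n |Ȳ|. The function name `rejection_probability` and its defaults are hypothetical.

```python
import math
import random


def rejection_probability(theta, c, n=20, reps=20000, seed=0):
    """Monte Carlo estimate of P_theta[S(Y) > c] for S(Y) = sqrt(n) * |mean(Y)|.

    Assumed design (illustrative only): Y_1, ..., Y_n i.i.d. N(theta, 1).
    """
    rng = random.Random(seed)  # seeded generator, so the estimate is reproducible
    rejections = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        if math.sqrt(n) * abs(sample_mean) > c:
            rejections += 1
    return rejections / reps


# Under the simple null theta = 0, sqrt(n) * mean(Y) is exactly N(0, 1),
# so c = 1.96 should give an estimated rejection frequency close to 0.05.
size_estimate = rejection_probability(theta=0.0, c=1.96)
print(size_estimate)
```

Because the null fixes θ in this design, the supremum is taken over a single point and the estimate approximates the size directly. With a composite null the same estimate would have to be repeated over the whole null parameter set.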
To solve for $c$ in (23.1) or (23.2), it is necessary to derive the finite-sample distribution of $S(Y)$ when the null is true. Typically, $S(Y)$ is a complicated function of the observations and the statistical problem involved is often intractable. More importantly, it is evident from the definitions (23.1)-(23.2) that, in many cases of practical interest, the distribution of $S(Y)$ may be different for different parameter values. When the null hypothesis completely fixes the value of $\theta$ (i.e. $\Theta_0$ is a point), the hypothesis is called a simple hypothesis. Most hypotheses encountered in practice are composite, i.e. the set $\Theta_0$ contains more than one element. The null may uniquely define some parameters, but almost invariably some other parameters are not restricted to a point-set. In the context of composite hypotheses, some unknown parameters may appear in the distribution of $S(Y)$. Such parameters are called nuisance parameters.
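A standard textbook case, offered here only as an illustration and not drawn from the examples of this chapter, shows how a nuisance parameter enters. In the Gaussian location-scale model, the null on the mean leaves the scale unrestricted, so the hypothesis is composite. The null distribution of the unstudentized mean depends on the scale, whereas the Student t statistic is free of it.

```latex
% Assumed illustrative model: i.i.d. Gaussian observations, unknown scale.
Y_1, \dots, Y_n \overset{\text{i.i.d.}}{\sim} N(\mu, \sigma^2), \qquad
\theta = (\mu, \sigma), \qquad
H_0 : \mu = 0, \qquad
\Theta_0 = \{ (0, \sigma) : \sigma > 0 \}.

% Unstudentized statistic: its null distribution involves the nuisance parameter.
\sqrt{n}\, \bar{Y} \sim N(0, \sigma^2) \quad \text{under } H_0 .

% Studentized statistic: its null distribution is nuisance-parameter-free.
t = \frac{\sqrt{n}\, \bar{Y}}{s} \sim t_{n-1} \quad \text{under } H_0,
\qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2 .
```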
When we talk about an exact test, it must be understood that attention is restricted to level-correct critical regions, where (23.1) must hold for a given finite sample size, for all values of the parameter $\theta$ compatible with the null. Consequently, in carrying out an exact test, one may encounter two problems. The first one is to derive the analytic form of the distribution of $S(Y)$. The second one is to maximize the rejection probability over the relevant nuisance parameter space, subject to the level constraint. We will see below that the first problem can easily be solved when Monte Carlo test techniques are applicable. The second one is usually more difficult to tackle, and its importance is not fully recognized in econometric practice.
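The second problem can be sketched numerically in a deliberately simple setting. The design below is an assumption made for illustration, not the chapter's method: the observations are Gaussian with mean zero under the null, the statistic is the unstudentized √n |Ȳ|, and the nuisance parameter σ is restricted to a bounded grid. The names `rejection_prob`, `sup_rejection_prob` and `level_correct_c` are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def rejection_prob(c, sigma):
    """Exact P[sqrt(n)|Ybar| > c] under H0: mu = 0, where sqrt(n)*Ybar ~ N(0, sigma^2)."""
    return 2.0 * (1.0 - normal_cdf(c / sigma))


def sup_rejection_prob(c, sigmas):
    """Largest null rejection probability over the assumed nuisance-parameter grid."""
    return max(rejection_prob(c, s) for s in sigmas)


def level_correct_c(alpha, sigmas, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection for the smallest c whose worst-case rejection probability is <= alpha."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sup_rejection_prob(mid, sigmas) > alpha:
            lo = mid  # level constraint violated: the critical value must increase
        else:
            hi = mid  # level constraint satisfied: try a smaller critical value
    return hi


# Assumed bounded nuisance set: sigma in {0.5, 0.6, ..., 2.0}.
sigma_grid = [k / 10 for k in range(5, 21)]
c_star = level_correct_c(0.05, sigma_grid)
print(c_star, sup_rejection_prob(c_star, sigma_grid))
```

In this design the supremum is attained at the largest σ in the grid, so the resulting test is conservative for every smaller σ. If σ were unbounded, the rejection probability of the unstudentized statistic would approach one for any finite c, which is one reason for preferring statistics whose null distribution does not involve the nuisance parameter.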
A reasonable solution to both problems often exists when one is dealing with large samples. Whereas the null distribution of $S(Y)$ may be complicated and/or may involve unknown parameters, its asymptotic null distribution in many common cases has a known form and is nuisance-parameter-free (e.g., a normal or chi-square distribution). The critical point may then conveniently be obtained using asymptotic arguments. The term approximate critical point is more appropriate here, since we are dealing with asymptotic levels: the critical values which yield the desired size $\alpha$ for a given sample size can be very different from the approximate values obtained through an asymptotic argument. For sufficiently large sample sizes, the standard asymptotic approximations are expected to work well. The question is, and will remain, how large is large? To illustrate this issue, we next consider several examples involving commonly used econometric methods. We will demonstrate, by simulation, that asymptotic procedures may yield highly unreliable decisions with empirically relevant sample sizes. The problem, and our main point, is that finite sample accuracy is not merely a small sample problem.
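The gap between an asymptotic critical point and the actual finite-sample size can be seen in a very simple simulation. The sketch below is not one of the examples considered in this chapter. It assumes Gaussian data and a t-type statistic compared with the asymptotic N(0, 1) critical value 1.96 at nominal level 0.05. The function name `empirical_size` and its defaults are hypothetical.

```python
import math
import random


def empirical_size(n, crit=1.96, reps=10000, seed=0):
    """Estimated null rejection frequency of |t| > crit with N(0, 1) data of size n."""
    rng = random.Random(seed)  # seeded generator, so the estimate is reproducible
    rejections = 0
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(y) / n
        var = sum((v - mean) ** 2 for v in y) / (n - 1)  # unbiased sample variance
        t_stat = math.sqrt(n) * mean / math.sqrt(var)
        if abs(t_stat) > crit:
            rejections += 1
    return rejections / reps


# Rejection frequencies under the null for a few assumed sample sizes.
sizes = {n: empirical_size(n) for n in (5, 20, 100)}
for n, size in sizes.items():
    print(n, size)
```

In this design the exact null distribution is Student t with n - 1 degrees of freedom, so for n = 5 the rejection frequency should be well above the nominal 0.05 (roughly 0.12), and it should approach 0.05 as n grows. This toy case is a small-sample illustration only; it does not by itself show the stronger point made above, that the problem can persist at empirically relevant sample sizes.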