In parametric estimation we can use two methods.
(1) Distribution-specific method. In the distribution-specific method, the distribution is assumed to belong to a class of functions that are characterized by a fixed and finite number of parameters—for example, normal—and these parameters are estimated.
(2) Distribution-free method. In the distribution-free method, the distribution is not specified and the first few moments are estimated.
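The contrast between the two methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `normal_fit` and `sample_moments` are hypothetical, the data are simulated rather than real, and the normal family is used because it is the example named above.

```python
import random
import statistics

def normal_fit(sample):
    """Distribution-specific: assume a normal family and estimate (mu, sigma)."""
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample, mu)  # population form, divisor n
    return mu, sigma

def sample_moments(sample, k=3):
    """Distribution-free: estimate the first k raw moments E[X^j], j = 1..k,
    without assuming any particular family."""
    n = len(sample)
    return [sum(x ** j for x in sample) / n for j in range(1, k + 1)]

# Simulated data (an assumption of this sketch, not data from the text).
random.seed(0)
data = [random.gauss(5.8, 0.25) for _ in range(500)]

mu_hat, sigma_hat = normal_fit(data)
m1, m2, m3 = sample_moments(data)
print("normal fit:", mu_hat, sigma_hat)
print("first three raw moments:", m1, m2, m3)
```

In the first case the two estimated parameters pin down the whole distribution; in the second, the moments are estimated but the distribution itself is left unspecified.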
In nonparametric estimation we attempt to estimate the probability distribution itself. Estimating a probability distribution is simple for a discrete random variable that takes only a small number of values, but it poses problems for a continuous random variable. For example, suppose we want to estimate the density of the height of a Stanford male student, assuming that it is zero outside the interval [4, 7]. We must divide this interval into 3/d small intervals of length d and then estimate the ordinate of the density function over each small interval by the number of students whose height falls into that interval divided by the sample size n. The difficulty of this approach is characterized by a dilemma: if d is large, the approximation of a density by a probability step function cannot be good, but if d is small, many intervals will contain only a small number of observations unless n is very large. Nonparametric estimation for a continuous random variable is therefore useful only when the sample size is very large. In this book we shall discuss only parametric estimation. The reader who wishes to study nonparametric density estimation should consult Silverman (1986).
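The dilemma over the choice of d can be seen in a small simulation. The following is a minimal sketch, not a recommended estimator: `histogram_estimate` is a hypothetical helper, the heights are simulated from a normal distribution chosen only for illustration, and the threshold of five observations per interval is arbitrary. The function returns the relative frequencies described above and, as an addition of this sketch, those frequencies divided by d, which is the height that makes the step function integrate to one.

```python
import random

def histogram_estimate(sample, lo, hi, d):
    """Split [lo, hi] into intervals of width d and return, per interval,
    the count, the relative frequency count/n, and the height count/(n*d)."""
    k = round((hi - lo) / d)           # number of intervals, e.g. 3/d on [4, 7]
    counts = [0] * k
    for x in sample:
        if lo <= x < hi:               # observations outside [lo, hi) are ignored
            counts[min(int((x - lo) / d), k - 1)] += 1
    n = len(sample)
    rel_freq = [c / n for c in counts]
    height = [f / d for f in rel_freq]
    return counts, rel_freq, height

# Simulated heights in feet (an assumption of this sketch).
random.seed(1)
heights = [random.gauss(5.8, 0.25) for _ in range(200)]

for d in (1.0, 0.25, 0.05):
    counts, rel_freq, height = histogram_estimate(heights, 4.0, 7.0, d)
    sparse = sum(1 for c in counts if c < 5)
    print(f"d = {d}: {len(counts)} intervals, {sparse} with fewer than 5 observations")
```

With a wide interval the step function is too coarse to follow the density, while with a narrow one most intervals hold too few observations for their relative frequencies to be reliable.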
Inherent problems exist in ranking estimators, as illustrated by the following example.
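[Figure 7.1: probability step functions of the three estimators, one panel for each of four values of p, from p = 0 to p = 1]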
Population: X = 1 with probability p,
              = 0 with probability 1 - p.
Sample: (X_1, X_2).
Estimators: T = (X_1 + X_2)/2,
            S = X_1,
            W = 1/2.
In Figure 7.1 we show the probability step functions of the three estimators for four different values of the parameter p.
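The distributions plotted in such a figure can be computed exactly by enumerating the four possible samples. The sketch below assumes that the two observations are independent draws from the population; the function name `estimator_pmfs` is hypothetical, and exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def estimator_pmfs(p):
    """Exact distributions of T, S, and W for a sample of two
    independent draws, each equal to 1 with probability p."""
    p = Fraction(p)
    estimators = {
        "T": lambda x1, x2: Fraction(x1 + x2, 2),  # sample mean
        "S": lambda x1, x2: Fraction(x1),          # first observation only
        "W": lambda x1, x2: Fraction(1, 2),        # constant, ignores the data
    }
    pmfs = {name: defaultdict(Fraction) for name in estimators}
    for x1, x2 in product((0, 1), repeat=2):
        prob = (p if x1 else 1 - p) * (p if x2 else 1 - p)
        if prob == 0:
            continue                               # skip impossible samples
        for name, f in estimators.items():
            pmfs[name][f(x1, x2)] += prob
    return {name: dict(sorted(pmf.items())) for name, pmf in pmfs.items()}

for p in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(f"p = {p}")
    for name, pmf in estimator_pmfs(p).items():
        print("  ", name, {str(v): str(q) for v, q in pmf.items()})
```

Each inner dictionary maps a possible value of the estimator to its probability, which is the information a probability step function displays.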
This example shows two kinds of ambiguity that arise when we try to rank the three estimators.
(1) For a particular value of the parameter, say, p = 3/4, it is not clear which of the three estimators is preferred.
(2) T dominates W for p = 0, but W dominates T for p = 1/2.
These ambiguities are inherent in the problem and should not be dismissed lightly. But because we usually must choose one estimator over the others, we shall have to find some way to get around them.
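The second ambiguity can be checked by comparing how far each estimator falls from p. The sketch below uses the absolute error |estimate - p| as one convenient way to make "dominates" concrete; that choice, the independence of the two observations, and the helper name `abs_error_pmf` are assumptions of this illustration rather than definitions taken from the text.

```python
from fractions import Fraction
from itertools import product

def abs_error_pmf(estimator, p):
    """Exact distribution of |estimate - p| over samples of two
    independent draws, each equal to 1 with probability p."""
    p = Fraction(p)
    pmf = {}
    for x1, x2 in product((0, 1), repeat=2):
        prob = (p if x1 else 1 - p) * (p if x2 else 1 - p)
        if prob > 0:
            err = abs(estimator(x1, x2) - p)
            pmf[err] = pmf.get(err, 0) + prob
    return pmf

def T(x1, x2):
    return Fraction(x1 + x2, 2)   # sample mean

def W(x1, x2):
    return Fraction(1, 2)         # constant estimator

for p in (Fraction(0), Fraction(1, 2)):
    t_err = {str(e): str(q) for e, q in abs_error_pmf(T, p).items()}
    w_err = {str(e): str(q) for e, q in abs_error_pmf(W, p).items()}
    print(f"p = {p}: T errors {t_err}, W errors {w_err}")
```

When p = 0 the error of T is always zero while W always misses by 1/2; when p = 1/2 the roles are reversed, since W is always exact and T misses by 1/2 half of the time. No ranking of T and W holds for every value of p.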