MODES OF CONVERGENCE

Let us first review the definition of the convergence of a sequence of real numbers.

DEFINITION 6.1.1 A sequence of real numbers $\{a_n\}$, $n = 1, 2, \ldots$, is said to converge to a real number $a$ if for any $\epsilon > 0$ there exists an integer $N$ such that for all $n > N$ we have

(6.1.1) $|a_n - a| < \epsilon$.

We write $a_n \to a$ as $n \to \infty$ or $\lim_{n \to \infty} a_n = a$ ($n \to \infty$ will be omitted if it is obvious from the context).
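As a small illustration of the definition, consider the sequence $a_n = 1/n$ with limit $a = 0$ (an example chosen here, not taken from the text). The sketch below returns one admissible $N$ for a given $\epsilon$ and spot-checks the defining inequality on a finite stretch of indices beyond $N$; the function name `index_bound` is hypothetical.

```python
import math

def index_bound(eps: float) -> int:
    """Return an N such that |1/n - 0| < eps for every n > N.

    Since 1/n < eps is equivalent to n > 1/eps, any integer N >= 1/eps works.
    """
    return math.ceil(1.0 / eps)

for eps in (0.1, 0.01, 0.001):
    N = index_bound(eps)
    # A finite spot check only; the inequality holds for all n > N by the algebra above.
    ok = all(abs(1.0 / n - 0.0) < eps for n in range(N + 1, N + 1000))
    print(eps, N, ok)
```

A smaller $\epsilon$ forces a larger $N$, which is exactly the dependence of $N$ on $\epsilon$ that the definition allows.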

Now we want to generalize Definition 6.1.1 to a sequence of random variables. If $a_n$ were a random variable, we could not have (6.1.1) exactly, because it would be sometimes true and sometimes false. We could only talk about the probability of (6.1.1) being true. This suggests that we should modify the definition in such a way that the conclusion states that (6.1.1) holds with a probability approaching 1 as $n$ goes to infinity. Thus we have

DEFINITION 6.1.2 (convergence in probability) A sequence of random variables $\{X_n\}$, $n = 1, 2, \ldots$, is said to converge to a random variable $X$ in probability if for any $\epsilon > 0$ and $\delta > 0$ there exists an integer $N$ such that for all $n > N$ we have $P(|X_n - X| < \epsilon) > 1 - \delta$. We write $X_n \xrightarrow{P} X$ as $n \to \infty$ or $\operatorname{plim}_{n \to \infty} X_n = X$. The last equality reads “the probability limit of $X_n$ is $X$.” (Alternatively, the if clause may be paraphrased as follows: if $\lim_{n \to \infty} P(|X_n - X| < \epsilon) = 1$ for any $\epsilon > 0$.)
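To make the definition concrete, the following sketch estimates $P(|X_n - X| < \epsilon)$ by simulation for one particular case that we assume for illustration: $X_n$ is the mean of $n$ independent Uniform(0, 1) draws and $X$ is the constant 0.5. The function name `coverage`, the seed, and the number of replications are hypothetical choices, and the result is a Monte Carlo estimate rather than an exact probability.

```python
import random

def coverage(n: int, eps: float, reps: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(|mean of n Uniform(0,1) draws - 0.5| < eps)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - 0.5) < eps:
            hits += 1
    return hits / reps

# The estimated probability should move toward 1 as n grows.
for n in (10, 100, 1000):
    print(n, coverage(n, eps=0.05))
```

For a fixed $\epsilon$, the printed proportions should rise toward 1 as $n$ increases, which is what the paraphrased form of the definition requires.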

Unlike the case of the convergence of a sequence of constants, for which only one mode of convergence is sufficient, we need two other modes of convergence, convergence in mean square and convergence in distribution, for a sequence of random variables. There is still another mode of convergence, almost sure convergence, but we will not use it here. A definition can be found in any of the aforementioned books.

DEFINITION 6.1.3 (convergence in mean square) A sequence $\{X_n\}$ is said to converge to $X$ in mean square if $\lim_{n \to \infty} E(X_n - X)^2 = 0$. We write $X_n \xrightarrow{M} X$.
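A numerical sketch may help here. We again assume, purely for illustration, that $X_n$ is the mean of $n$ independent Uniform(0, 1) draws and $X = 0.5$. In that case $E(X_n - X)^2$ equals the variance of the sample mean, $1/(12n)$, so the simulated value can be compared with a known target. The function name `mean_squared_error` and the simulation settings are hypothetical.

```python
import random

def mean_squared_error(n: int, reps: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of E(Xbar_n - 0.5)^2 for means of n Uniform(0,1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        total += (xbar - 0.5) ** 2
    return total / reps

# Compare the simulated value with the exact value 1 / (12 n), which tends to 0.
for n in (10, 100, 1000):
    print(n, mean_squared_error(n), 1.0 / (12 * n))
```

Both columns shrink at the rate $1/n$, so in this assumed example the sequence converges to 0.5 in mean square.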

DEFINITION 6.1.4 (convergence in distribution) A sequence $\{X_n\}$ is said to converge to $X$ in distribution if the distribution function $F_n$ of $X_n$ converges to the distribution function $F$ of $X$ at every continuity point of $F$. We write $X_n \xrightarrow{d} X$, and we call $F$ the limit distribution of $\{X_n\}$. If $\{X_n\}$ and $\{Y_n\}$ have the same limit distribution, we write $X_n \overset{LD}{=} Y_n$.
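The restriction to continuity points of $F$ matters. A standard illustration (not taken from the text) is the degenerate sequence $X_n = 1/n$ with limit $X = 0$. The sketch below evaluates both distribution functions at a few points; the function names `F_n` and `F` simply mirror the notation of the definition.

```python
def F_n(x: float, n: int) -> float:
    """Distribution function of the degenerate random variable X_n = 1/n."""
    return 1.0 if x >= 1.0 / n else 0.0

def F(x: float) -> float:
    """Distribution function of the degenerate limit X = 0."""
    return 1.0 if x >= 0.0 else 0.0

n = 10**6
for x in (-0.5, 0.0, 0.001, 0.5):
    # At x = 0 the two values differ for every n, but 0 is not a continuity point of F.
    print(x, F_n(x, n), F(x))
```

At every $x \neq 0$ the values agree once $n$ is large enough, whereas $F_n(0) = 0$ for all $n$ while $F(0) = 1$. Because $x = 0$ is the only discontinuity of $F$, the definition still classifies this sequence as converging in distribution.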

The following two theorems state that convergence in mean square implies convergence in probability, which, in turn, implies convergence in distribution.

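THEOREM 6.1.1 (Chebyshev) $X_n \xrightarrow{M} X \Rightarrow X_n \xrightarrow{P} X$.

THEOREM 6.1.2 $X_n \xrightarrow{P} X \Rightarrow X_n \xrightarrow{d} X$.

Theorem 6.1.1 is deduced from the following inequality due to Chebyshev, which is useful on its own:

(6.1.2) $P[g(X_n) \geq \epsilon^2] \leq \dfrac{E g(X_n)}{\epsilon^2}$,

where $g(\cdot)$ is any nonnegative continuous function. To prove Theorem 6.1.1, take $g(x)$ to be $x^2$ and take $X_n$ of (6.1.2) to be $X_n - X$. Then (6.1.2) gives $P(|X_n - X| \geq \epsilon) \leq E(X_n - X)^2 / \epsilon^2$, and the right-hand side goes to 0 under convergence in mean square. Chebyshev's inequality follows from the simple result:

$\int_{-\infty}^{\infty} g(x) f_n(x)\,dx \geq \epsilon^2 \int_S f_n(x)\,dx$,

where $f_n(x)$ is the density function of $X_n$ and $S = \{x \mid g(x) \geq \epsilon^2\}$. Here we have assumed the existence of the density for simplicity, but inequality (6.1.2) is true for any sequence of random variables, provided that $E g(X_n)$ exists.

The following two theorems are very useful in proving the convergence of a sequence of functions of random variables.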

THEOREM 6.1.3 Let $X_n$ be a vector of random variables with a fixed finite number of elements. Let $g$ be a function continuous at a constant vector point $\alpha$. Then $X_n \xrightarrow{P} \alpha \Rightarrow g(X_n) \xrightarrow{P} g(\alpha)$.
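The content of Theorem 6.1.3 can be checked numerically in a simple assumed case. We take the two-element vector of sample means of independent Uniform(0, 1) draws, which we assume converges in probability to (0.5, 0.5), and the function $g(x, y) = x/y$, which is continuous at that point with value 1. The function name `ratio_coverage` and the simulation settings are hypothetical.

```python
import random

def ratio_coverage(n: int, eps: float, reps: int = 1000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(|g(Xbar_n, Ybar_n) - 1| < eps) for g(x, y) = x / y."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        ybar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar / ybar - 1.0) < eps:
            hits += 1
    return hits / reps

# The estimated probability should move toward 1 as n grows.
for n in (10, 100, 1000):
    print(n, ratio_coverage(n, eps=0.1))
```

The proportions should approach 1 as $n$ grows, in line with the conclusion that $g(X_n)$ converges in probability to $g(\alpha)$.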

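THEOREM 6.1.4 (Slutsky) If $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{P} \alpha$, then

(i) $X_n + Y_n \xrightarrow{d} X + \alpha$,

(ii) $X_n Y_n \xrightarrow{d} \alpha X$,

(iii) $(X_n / Y_n) \xrightarrow{d} X / \alpha$, provided $\alpha \neq 0$.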
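Part (ii) of the Slutsky theorem can be illustrated with a deliberately simple assumed setup: $X_n$ is drawn from the standard normal distribution for every $n$ (so it trivially converges in distribution to $X \sim N(0, 1)$), and $Y_n$ is an independent mean of $n$ Uniform(0, 1) draws with probability limit $\alpha = 0.5$. The theorem then predicts that $P(X_n Y_n \leq t)$ approaches $P(\alpha X \leq t) = \Phi(t / \alpha)$. The names `normal_cdf` and `product_cdf` are hypothetical helpers.

```python
import math
import random

def normal_cdf(z: float) -> float:
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def product_cdf(t: float, n: int, reps: int = 2000, seed: int = 0) -> float:
    """Empirical P(X_n * Y_n <= t): X_n ~ N(0, 1), Y_n a mean of n Uniform(0,1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = rng.gauss(0.0, 1.0)
        y = sum(rng.random() for _ in range(n)) / n
        if x * y <= t:
            hits += 1
    return hits / reps

alpha = 0.5                      # assumed probability limit of Y_n
t = 0.5                          # point at which the distribution functions are compared
limit = normal_cdf(t / alpha)    # P(alpha * X <= t) for X ~ N(0, 1)
for n in (5, 50, 500):
    print(n, product_cdf(t, n), limit)
```

The empirical proportions are subject to simulation error, so agreement with the limiting value should be expected only up to a few hundredths at this number of replications.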

We state without proof the following generalization of the Slutsky theorem. Suppose that $g$ is a continuous function except for finite discontinuities, $\operatorname{plim} Y_{in} = \alpha_i$, $i = 1, 2, \ldots, J$, and $\{X_{in}\}$, $i = 1, 2, \ldots, K$, converge jointly to $\{X_i\}$ in distribution. Then the limit distribution of $g(X_{1n}, X_{2n}, \ldots, X_{Kn}, Y_{1n}, Y_{2n}, \ldots, Y_{Jn})$ is the same as the distribution of $g(X_1, X_2, \ldots, X_K, \alpha_1, \alpha_2, \ldots, \alpha_J)$. Here the joint convergence of $\{X_{in}\}$ to $\{X_i\}$ is an important necessary condition.