# MOMENTS

4.1 EXPECTED VALUE

We shall define the expected value of a random variable, first, for a discrete random variable in Definition 4.1.1 and, second, for a continuous random variable in Definition 4.1.2.

DEFINITION 4.1.1 Let X be a discrete random variable taking the value $x_i$ with probability $P(x_i)$, $i = 1, 2, \ldots$. Then the expected value (expectation or mean) of X, denoted by EX, is defined to be $EX = \sum_{i=1}^{\infty} x_i P(x_i)$ if the series converges absolutely. We can write $EX = \sum_{+} x_i P(x_i) + \sum_{-} x_i P(x_i)$, where in the first summation we sum for i such that $x_i > 0$ and in the second summation we sum for i such that $x_i < 0$. If $\sum_{+} x_i P(x_i) = \infty$ and $\sum_{-} x_i P(x_i) = -\infty$, we say EX does not exist. If $\sum_{+} = \infty$ and $\sum_{-}$ is finite, then we say $EX = \infty$. If $\sum_{-} = -\infty$ and $\sum_{+}$ is finite, then we say $EX = -\infty$.

The expected value has an important practical meaning. If X is the payoff of a gamble (that is, if you gain $x_i$ dollars with probability $P(x_i)$), the expected value signifies the amount you can expect to gain on the average. For example, if a fair coin is tossed and we gain one dollar when a head comes up and nothing when a tail comes up, the expected value of this gamble is 50 cents. It means that if we repeat this gamble many times we will gain 50 cents on the average. We can formalize this statement as follows. Let $X_i$ be the payoff of a particular gamble made for the ith time. Then the average gain from repeating this gamble n times is $n^{-1} \sum_{i=1}^{n} X_i$, and it converges to EX in probability. This is a consequence of Theorem 6.2.1. For the exact definition of convergence in probability, see Definition 6.1.2.

The quantity $n^{-1} \sum_{i=1}^{n} X_i$ is called the sample mean. (More exactly, it is the sample mean based on a sample of size n.) EX is sometimes called the population mean so that it may be distinguished from the sample mean. We shall learn in Chapter 7 that the sample mean is a good estimator of the population mean.
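
The convergence of the sample mean to the population mean can be illustrated with a small simulation of the coin-toss gamble. The sketch below is only an illustration: the function name `sample_mean_of_coin_gamble`, the sample sizes, and the random seed are assumptions made for this example and do not come from the text.

```python
import random


def sample_mean_of_coin_gamble(n, seed=0):
    """Average payoff over n plays: 1 dollar for a head, 0 for a tail.

    The seed is an arbitrary choice that makes the run reproducible.
    """
    rng = random.Random(seed)
    payoffs = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    return sum(payoffs) / n


# The sample mean should settle near EX = 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_coin_gamble(n))
```

For small n the average can be far from 0.5; for large n it is typically within a fraction of a cent of the expected value.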

Coming back to the aforementioned gamble that pays one dollar when a head comes up, we may say that the fair price of the gamble is 50 cents. This does not mean, however, that everybody should pay exactly 50 cents to play. How much this gamble is worth depends upon the subjective evaluation of the risk involved. A risk taker may be willing to pay as much as 90 cents to gamble, whereas a risk averter may pay only 10 cents. The decision to gamble for c cents or not can be thought of as choosing between two random variables $X_1$ and $X_2$, where $X_1$ takes value 1 with probability 1/2 and 0 with probability 1/2 and $X_2$ takes value c with probability 1. More generally, decision making under uncertainty always means choosing one out of a set of random variables X(d) that vary as d varies within the decision set D. Here X(d) is the random gain (in dollars) that results from choosing a decision d.

Choosing the value of d that maximizes EX(d) may not necessarily be a reasonable decision strategy. To illustrate this point, consider the following example. A coin is tossed repeatedly until a head comes up, and $2^i$ dollars are paid if a head comes up for the first time in the ith toss. The payoff of this gamble is represented by the random variable X that takes the value $2^i$ with probability $2^{-i}$. Hence, by Definition 4.1.1, $EX = \infty$. Obviously, however, nobody would pay $\infty$ dollars for this gamble. This example is called the "St. Petersburg Paradox," because the Swiss mathematician Daniel Bernoulli wrote about it while visiting St. Petersburg Academy.

One way to resolve this paradox is to note that what one should maximize is not EX itself but, rather, EU(X), where U denotes the utility function. If, for example, the utility function is logarithmic, the real worth of the St. Petersburg gamble is merely $E \log X = \log 2 \cdot \sum_{i=1}^{\infty} i (1/2)^i = \log 4$, the utility of gaining four dollars with certainty. By changing the utility function, one can represent various degrees of risk-averseness. A good, simple exposition of this and related topics can be found in Arrow (1965).
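
The contrast between the divergent expected payoff and the finite expected log utility can be checked numerically with partial sums. This is a minimal sketch; the function names and the truncation points are assumptions chosen for illustration.

```python
import math


def partial_expected_payoff(n):
    """Sum of 2^i * 2^(-i) for i = 1..n; every term equals 1, so the sum is n."""
    return sum((2 ** i) * (0.5 ** i) for i in range(1, n + 1))


def partial_expected_log_utility(n):
    """Sum of log(2^i) * 2^(-i) for i = 1..n; this converges to log 4."""
    return sum(math.log(2 ** i) * (0.5 ** i) for i in range(1, n + 1))


for n in (10, 50, 200):
    print(n, partial_expected_payoff(n), partial_expected_log_utility(n))

print("log 4 =", math.log(4))
```

The first column of results grows without bound as n increases, while the second settles at log 4, in line with the calculation above.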

Not all the decision strategies can be regarded as the maximization of EU(X) for some U, however. For example, an extremely risk-averse person may choose the d that maximizes $\min X(d)$, where $\min X$ means the minimum possible value X can take. Such a person will not undertake the St. Petersburg gamble for any price higher than two dollars. Such a strategy is called the minimax strategy, because it means minimizing the maximum loss (loss may be defined as negative gain). We can think of many other strategies which may be regarded as reasonable by certain people in certain situations.

FIGURE 4.1 A positively skewed density

DEFINITION 4.1.2 Let X be a continuous random variable with density f(x). Then, the expected value of X, denoted by EX, is defined to be $EX = \int_{-\infty}^{\infty} x f(x)\,dx$ if the integral is absolutely convergent. If $\int_{0}^{\infty} x f(x)\,dx = \infty$ and $\int_{-\infty}^{0} x f(x)\,dx = -\infty$, we say the expected value does not exist. If $\int_{0}^{\infty} x f(x)\,dx = \infty$ and $\int_{-\infty}^{0} x f(x)\,dx$ is finite, we write $EX = \infty$. If $\int_{-\infty}^{0} x f(x)\,dx = -\infty$ and $\int_{0}^{\infty} x f(x)\,dx$ is finite, we write $EX = -\infty$.

Besides having an important practical meaning as the fair price of a gamble, the expected value is a very important characteristic of a probability distribution, being a measure of its central location. The other important measures of central location are mode and median. The mode is a value of x for which f(x) is the maximum, and the median m is defined by the equation $P(X < m) = 1/2$. If the density function f(x) is bell-shaped and symmetric around $x = \mu$, then $\mu = EX = m = \text{mode}(X)$. If the density is positively skewed as in Figure 4.1, then $\text{mode}(X) < m < EX$. The three measures of central location are computed in the following examples.

EXAMPLE 4.1.1 $f(x) = 2x^{-3}$ for $x > 1$. Then $EX = \int_{1}^{\infty} 2x^{-2}\,dx = 2$. The median m must satisfy $1/2 = \int_{1}^{m} 2x^{-3}\,dx = -m^{-2} + 1$. Therefore $m = \sqrt{2}$. The mode is clearly 1.

EXAMPLE 4.1.2 $f(x) = x^{-2}$ for $x > 1$. Then $EX = \int_{1}^{\infty} x^{-1}\,dx = \infty$. Since $1/2 = \int_{1}^{m} x^{-2}\,dx = -m^{-1} + 1$, we have $m = 2$. The mode is again 1. Note that the density of Example 4.1.2 has a fatter tail (that is, the density converges more slowly to 0 in the tail) than that of Example 4.1.1, which has pushed both the mean and the median to the right, affecting the mean much more than the median.
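
The medians and the behavior of the means in Examples 4.1.1 and 4.1.2 can be verified numerically. The sketch below assumes the closed-form distribution functions $1 - x^{-2}$ and $1 - x^{-1}$ obtained by integrating the two densities from 1 to x; the helper names and the bisection bracket [1, 10] are illustrative choices.

```python
import math


def bisect_median(cdf, lo, hi, tol=1e-12):
    """Solve cdf(m) = 1/2 on [lo, hi] by bisection; cdf must be increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def cdf_1(x):
    """Distribution function for f(x) = 2x^(-3), x > 1 (Example 4.1.1)."""
    return 1 - x ** -2


def cdf_2(x):
    """Distribution function for f(x) = x^(-2), x > 1 (Example 4.1.2)."""
    return 1 - x ** -1


def truncated_mean_1(t):
    """Integral of x * 2x^(-3) from 1 to t, which equals 2(1 - 1/t)."""
    return 2 * (1 - 1 / t)


def truncated_mean_2(t):
    """Integral of x * x^(-2) from 1 to t, which equals log t."""
    return math.log(t)


print(bisect_median(cdf_1, 1.0, 10.0), math.sqrt(2))  # median of Example 4.1.1
print(bisect_median(cdf_2, 1.0, 10.0))                # median of Example 4.1.2
for t in (1e2, 1e4, 1e6):
    print(t, truncated_mean_1(t), truncated_mean_2(t))
```

The truncated mean of the first density approaches 2, whereas that of the second keeps growing like log t, which is the numerical counterpart of $EX = \infty$.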

Theorems 4.1.1 and 4.1.2 show a simple way to calculate the expected value of a function of a random variable.

THEOREM 4.1.1 Let X be a discrete random variable taking value $x_i$ with probability $P(x_i)$, $i = 1, 2, \ldots$, and let $\phi(\cdot)$ be an arbitrary function. Then

(4.1.1) $E\phi(X) = \sum_{i=1}^{\infty} \phi(x_i) P(x_i).$

Proof. Define $Y = \phi(X)$. Then Y takes value $\phi(x_i)$ with probability $P(x_i)$. Therefore (4.1.1) follows from Definition 4.1.1. □
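
Formula (4.1.1) translates directly into a short computation. The following sketch uses a fair die as a hypothetical example; the function name `expected_value` and the dictionary representation of the probability function are assumptions made for illustration.

```python
from fractions import Fraction


def expected_value(pmf, phi=lambda x: x):
    """Compute E phi(X) = sum of phi(x_i) * P(x_i) over a finite support.

    pmf maps each value x_i to its probability P(x_i).
    """
    return sum(phi(x) * p for x, p in pmf.items())


# A fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(die))                   # EX = 7/2
print(expected_value(die, lambda x: x * x))  # E(X^2) = 91/6
```

Exact fractions are used so that the results can be compared with hand calculations without rounding error.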

THEOREM 4.1.2 Let X be a continuous random variable with density f(x) and let $\phi(\cdot)$ be a function for which the integral below can be defined. Then

(4.1.2) $E\phi(X) = \int_{-\infty}^{\infty} \phi(x) f(x)\,dx.$

We shall not prove this theorem, because the proof involves a level of analysis more advanced than that of this book. If $\phi(\cdot)$ is continuous, differentiable, and monotonic, then the proof is an easy consequence of approximation (3.6.15). Let $Y = \phi(X)$ and denote the density of Y by g(y). Then we have

(4.1.3) $EY = \int_{-\infty}^{\infty} y g(y)\,dy = \lim \sum y g(y)\,\Delta y = \lim \sum \phi(x) f(x)\,\Delta x = \int_{-\infty}^{\infty} \phi(x) f(x)\,dx.$
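
The two ends of (4.1.3) can be compared numerically in a simple case. The sketch below assumes, purely for illustration, that X is uniform over (0, 1) and $\phi(x) = x^2$, so that $Y = X^2$ has density $g(y) = 1/(2\sqrt{y})$ on (0, 1); the helper `midpoint_sum` and the number of subintervals are also assumptions.

```python
import math


def midpoint_sum(h, a, b, n=100_000):
    """Approximate the integral of h over [a, b] by a midpoint Riemann sum."""
    width = (b - a) / n
    return sum(h(a + (k + 0.5) * width) for k in range(n)) * width


def f(x):
    """Density of X, assumed uniform over (0, 1)."""
    return 1.0


def phi(x):
    """The transformation Y = phi(X)."""
    return x * x


def g(y):
    """Density of Y = X^2 when X is uniform over (0, 1)."""
    return 1.0 / (2.0 * math.sqrt(y))


ey_from_g = midpoint_sum(lambda y: y * g(y), 0.0, 1.0)       # integral of y g(y)
ey_from_f = midpoint_sum(lambda x: phi(x) * f(x), 0.0, 1.0)  # integral of phi(x) f(x)
print(ey_from_g, ey_from_f)
```

Both sums approximate 1/3, so in this case the expected value of Y can be obtained without deriving g(y) at all, which is the practical content of Theorem 4.1.2.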

Theorem 4.1.1 can be easily generalized to the case of a random variable obtained as a function of two other random variables.

THEOREM 4.1.3 Let (X, Y) be a bivariate discrete random variable taking value $(x_i, y_j)$ with probability $P(x_i, y_j)$, $i, j = 1, 2, \ldots$, and let $\phi(\cdot, \cdot)$ be an arbitrary function. Then

(4.1.4) $E\phi(X, Y) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \phi(x_i, y_j) P(x_i, y_j).$

The following is a similar generalization of Theorem 4.1.2, which we state without proof.

THEOREM 4.1.4 Let (X, Y) be a bivariate continuous random variable with joint density function f(x, y), and let $\phi(\cdot, \cdot)$ be an arbitrary function. Then

(4.1.5) $E\phi(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \phi(x, y) f(x, y)\,dx\,dy.$

Note that given f(x, y), $E\phi(X)$ can be computed either directly from (4.1.5) above or by first obtaining the marginal density f(x) by Theorem 3.4.1 and then using Theorem 4.1.2. The same value is obtained by either procedure.

The following three theorems characterize the properties of operator E. Although we have defined the expected value so far only for a discrete or a continuous random variable, the following theorems are true for any random variable, provided that the expected values exist.

THEOREM 4.1.5 If a is a constant, Ea = a.

THEOREM 4.1.6 If X and Y are random variables and $\alpha$ and $\beta$ are constants, $E(\alpha X + \beta Y) = \alpha EX + \beta EY$.

THEOREM 4.1.7 If X and Y are independent random variables, EXY = EXEY.

The proof of Theorem 4.1.5 is trivial. The proofs of Theorems 4.1.6 and 4.1.7 when (X, Y) is either discrete or continuous follow easily from Definitions 4.1.1 and 4.1.2 and Theorems 4.1.3 and 4.1.4.

Theorem 4.1.7 is very useful. For example, let X and Y denote the face numbers showing in two dice rolled independently. Then, since EX = EY = 7/2, we have EXY = 49/4 by Theorem 4.1.7. Calculating EXY directly from Definition 4.1.3 without using this theorem would be quite time-consuming.
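
The dice calculation can be confirmed by brute force over the 36 equally likely outcomes. This is a sketch for checking Theorems 4.1.6 and 4.1.7 in this one case; the variable names and the constants 2 and 3 used in the linear combination are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
joint_prob = Fraction(1, 36)  # independence: P(x, y) = (1/6) * (1/6)

# Direct evaluation of EXY as a double sum over all outcomes.
e_xy = sum(x * y * joint_prob for x, y in product(faces, faces))

# Marginal expectation of a single die.
e_x = sum(x * Fraction(1, 6) for x in faces)

# Direct evaluation of E(2X + 3Y) for comparison with 2EX + 3EY.
e_linear = sum((2 * x + 3 * y) * joint_prob for x, y in product(faces, faces))

print(e_xy, e_x * e_x)             # both 49/4
print(e_linear, 2 * e_x + 3 * e_x)  # both 35/2
```

The double sum has 36 terms, whereas the product EXEY needs only the six-term sum for a single die.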

Theorems 4.1.6 and 4.1.7 may be used to evaluate the expected value of a mixture random variable which is partly discrete and partly continuous. Let X be a discrete random variable taking value $x_i$ with probability $P(x_i)$, $i = 1, 2, \ldots$. Let Y be a continuous random variable with density f(y). Let W be a binary random variable taking two values, 1 and 0, with probability p and $1 - p$, respectively, and, furthermore, assume that W is independent of X and of Y. Define a new random variable $Z = WX + (1 - W)Y$. Another way to define Z is to say that Z is equal to X with probability p and equal to Y with probability $1 - p$. A random variable such as Z is called a mixture random variable. Using Theorems 4.1.6 and 4.1.7, we have $EZ = EW\,EX + E(1 - W)\,EY$. But since EW = p from Definition 4.1.1, we get $EZ = pEX + (1 - p)EY$. We shall write a generalization of this result as a theorem.

THEOREM 4.1.8 Let X be a mixture random variable taking discrete value $x_i$, $i = 1, 2, \ldots, n$, with probability $p_i$ and a continuum of values in interval [a, b] according to density f(x): that is, if $[a, b] \supset [x_1, x_2]$, $P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$. Then $EX = \sum_{i=1}^{n} x_i p_i + \int_{a}^{b} x f(x)\,dx$. (Note that we must have $\sum_{i=1}^{n} p_i + \int_{a}^{b} f(x)\,dx = 1$.)
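
Theorem 4.1.8 can be turned into a small numerical routine. The sketch below is illustrative only: the function `mixture_mean`, the midpoint-rule integration, and the particular mixture (point masses 0.2 at 0 and 0.3 at 1, plus a constant density 0.25 on [2, 4]) are hypothetical choices, not taken from the text.

```python
def mixture_mean(points, density, a, b, n=100_000):
    """EX for a mixture: sum of x_i * p_i plus the integral of x f(x) over [a, b].

    points is a list of (x_i, p_i) pairs; density is f(x) on [a, b].
    The integral is approximated by a midpoint Riemann sum.
    """
    width = (b - a) / n
    mids = [a + (k + 0.5) * width for k in range(n)]

    # Check the side condition: point masses plus continuous mass must equal 1.
    total_mass = sum(p for _, p in points) + sum(density(x) for x in mids) * width
    if abs(total_mass - 1.0) > 1e-6:
        raise ValueError("point masses and density must have total probability 1")

    discrete_part = sum(x * p for x, p in points)
    continuous_part = sum(x * density(x) for x in mids) * width
    return discrete_part + continuous_part


# Hypothetical mixture: P(X = 0) = 0.2, P(X = 1) = 0.3, density 0.25 on [2, 4].
points = [(0.0, 0.2), (1.0, 0.3)]
print(mixture_mean(points, lambda x: 0.25, 2.0, 4.0))
```

For this mixture the hand calculation gives 0.3 from the discrete part and 1.5 from the continuous part, so the printed value should be close to 1.8.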

The following example from economics indicates another way in which a mixture random variable may be generated and its mean calculated.

EXAMPLE 4.1.3 Suppose that in a given year an individual buys a car if and only if his annual income is greater than 10 (ten thousand dollars) and that if he does buy a car, the one he buys costs one-fifth of his income. Assuming that his income is a continuous random variable with uniform density defined over the interval [5, 15], compute the expected amount of money he spends on a car.

Let X be his income and let Y be his expenditure on a car. Then Y is related to X by

(4.1.6) $Y = \begin{cases} 0 & \text{if } 5 \le X \le 10, \\ X/5 & \text{if } 10 < X \le 15. \end{cases}$

Clearly, Y is a mixture random variable that takes 0 with probability 1/2 and a continuum of values in the interval [2, 3] according to the density $f(y) = 1/2$. Therefore, by Theorem 4.1.8, we have

(4.1.7) $EY = 0 \cdot \frac{1}{2} + \int_{2}^{3} \frac{y}{2}\,dy = \frac{5}{4}.$

Alternatively, EY may be obtained directly from Theorem 4.1.2 by taking $\phi$ to be a function defined in (4.1.6). Thus

(4.1.8) $EY = \int_{-\infty}^{\infty} \phi(x) f(x)\,dx = \frac{1}{10} \int_{5}^{15} \phi(x)\,dx$