# Some Basic Probability Concepts

In this chapter, you learned some basic concepts about probability. Since the values that economic variables take on are not known before they are observed, we say that they are random. Probability is the theory that helps us express uncertainty about the possible values of these variables. Each time we observe the outcome of a random variable we obtain an observation. Once observed, its value is known and hence it is no longer random. So there is a distinction to be made between variables whose values are not yet observed (random variables) and those whose values have been observed (observations). Keep in mind, though, that an observation is merely one of many possible values that the variable can take. Another draw will usually result in a different value being observed.

A probability distribution is just a mathematical statement about the possible values that our random variable can take on. The probability distribution tells us the relative frequency (or probability) with which each possible value is observed. In their mathematical form probability distributions can be rather complicated, either because there are too many possible values to describe succinctly or because the formula that describes them is complex. In any event, it is common to summarize this complexity by concentrating on some simple numerical characteristics that they possess. The numerical characteristics of these mathematical functions are often referred to as parameters. Examples are the mean and variance of a probability distribution. The mean of a probability distribution describes the average value of the random variable over all of its possible realizations. Conceptually, there are an infinite number of realizations; therefore parameters are not known to us. As econometricians, our goal is to estimate these parameters using the finite amount of information available to us. We collect a number of realizations (called a sample) and then estimate the unknown parameters using a statistic. Just as a parameter is an unknown numerical characteristic of a probability distribution, a statistic is an observable numerical characteristic of a sample. Since the value of the statistic will be different for each sample drawn, it too is a random variable. The statistic is used to gain information about the parameter.

Expected values are used to summarize various numerical characteristics of a probability distribution. For instance, suppose X is a random variable that can take on the values 0, 1, 2, 3 and that these values occur with probability 1/6, 1/3, 1/3, and 1/6, respectively. The average value or mean of the probability distribution, designated $\mu$, is obtained analytically using its expected value.

$$\mu = E[X] = \sum x f(x) = 0 \cdot \tfrac{1}{6} + 1 \cdot \tfrac{1}{3} + 2 \cdot \tfrac{1}{3} + 3 \cdot \tfrac{1}{6} = \tfrac{3}{2} \tag{B.1}$$

So, $\mu$ is a parameter. Its value can be obtained mathematically if we know the probability density function of the random variable, X. If this probability distribution is known, then there is no reason to take samples or to study statistics! We can ascertain the mean, or average value, of a random variable without ever firing up our calculator. Of course, in the real world we only know that the value of X is not known before drawing it, and we don't know the actual probabilities that make up the density function, f(x). In order to figure out what the value of $\mu$ is, we have to resort to different methods. In this case, we try to infer what it is by drawing a sample and estimating it using a statistic.
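The distinction between the parameter and the statistic can be illustrated with a short gretl script. This is only a sketch: the names `vals`, `probs`, `u`, and `x`, the sample size of 1000, and the seed are illustrative choices, and the uniform-threshold trick is just one convenient way to draw from the distribution in (B.1).

```gretl
# Create an empty dataset first so that series can be generated
nulldata 1000
set seed 12345

# Parameter: the population mean from (B.1), mu = sum of x*f(x)
matrix vals  = {0, 1, 2, 3}
matrix probs = {1/6, 1/3, 1/3, 1/6}
scalar mu = sumr(vals .* probs)
printf "population mean (parameter) = %g\n", mu

# Statistic: draw a sample from the same distribution and average it.
# u is uniform on (0,1); x equals 0, 1, 2, 3 with
# probabilities 1/6, 1/3, 1/3, 1/6 respectively.
series u = uniform()
series x = (u > 1/6) + (u > 1/2) + (u > 5/6)
scalar xbar = mean(x)
printf "sample mean (statistic)     = %g\n", xbar
```

Changing the seed changes the sample mean but not the population mean, which is the sense in which the statistic is random and the parameter is not.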

One of the ways we bridge the mathematical world of probability theory with the observable world of statistics is through the concept of a population. A statistical population is the collection of individuals that you are interested in studying. Since it is normally too expensive to collect information on everyone of interest, the econometrician collects information on a subset of this population; in other words, the econometrician takes a sample.

The population in statistics has an analogue in probability theory. In probability theory one must specify the set of all possible values that the random variable can take. In the example above, the random variable is said to take on 0, 1, 2, or 3. This set must be complete in the sense that the variable cannot take on any other value. In statistics, the population plays a similar role. It consists of the set that is relevant to the purpose of your inquiry and that is possible to observe. Thus it is common to refer to parameters as describing characteristics of populations. Statistics are the analogues to these and describe characteristics of the sample.

This roundabout discussion leads me to an important point. We often use the words mean, variance, covariance, and correlation rather casually in econometrics, but their meanings are quite different depending on whether we are referring to a probability distribution or a sample. When referring to the analytic concepts of mean, variance, covariance, and correlation we are specifically talking about characteristics of a probability distribution; these can only be ascertained through complete knowledge of the probability distribution functions. It is common to refer to them in this sense as population mean, population variance, and so on. These concepts do not have anything to do with samples or observations!

In statistics we attempt to estimate these (population) parameters using samples and explicit formulae. For instance, we might use the average value of a sample to estimate the average value of the population (or probability distribution).

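| | Probability distribution | Sample |
|---|---|---|
| mean | $E[X] = \mu$ | $\frac{1}{n}\sum x_i = \bar{x}$ |
| variance | $E[(X - \mu)^2] = \sigma^2$ | $\frac{1}{n-1}\sum (x_i - \bar{x})^2 = s_x^2$ |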

When you are asked to obtain the mean or variance of random variables, make sure you know whether the person asking wants the characteristics of the probability distribution or of the sample. The former requires knowledge of the probability distribution and the latter requires a sample.
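On the sample side, the formulas for the mean and variance can be typed directly into a script and compared with gretl's built-in `mean` and `var` functions. The following is a minimal sketch using simulated data; the sample size, the seed, and the choice of N(3, 9) draws are assumptions made purely for illustration.

```gretl
nulldata 50
set seed 2718
# normal(mu, sigma) takes the standard deviation, so N(3, 9) is normal(3, 3)
series x = normal(3, 3)

scalar n    = nobs(x)
scalar xbar = sum(x)/n                       # (1/n) * sum of x_i
scalar s2   = sum((x - xbar)^2)/(n - 1)      # (1/(n-1)) * sum of squared deviations

printf "by formula: mean = %.4f, variance = %.4f\n", xbar, s2
printf "built-in:   mean = %.4f, variance = %.4f\n", mean(x), var(x)
```

The two lines of output should agree. Neither needs to equal the population values 3 and 9, since those are parameters and these are statistics.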

In gretl you are given the facility to obtain sample means, variances, covariances and correlations. You are also given the ability to compute tail probabilities using the normal, $t$, $F$ and $\chi^2$ distributions. First we'll examine how to get summary statistics.

Summary statistics usually refers to some basic measures of the numerical characteristics of your sample. In gretl, summary statistics can be obtained in at least two different ways. Once your data are loaded into the program, you can select Data>Summary statistics from the pull-down menu (Figure B.1), which leads to the output in Figure B.2.

Figure B.1: Choosing summary statistics from the pull-down menu

The other way to get summary statistics is from the console or a script. Recall that gretl is really just a language and the GUI is a way of accessing that language. So, to speed things up, load the dataset and open a console window. Then type summary. This produces summary statistics for all variables in memory. If you just want summary statistics for a subset, simply add the variable names after summary; e.g., summary x gives you the summary statistics for the variable x.
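As a rough sketch of the console route, the script below builds a small artificial dataset and then calls `summary` in the forms described above. The series `x` and `y` and the way they are generated are hypothetical stand-ins for whatever data you have loaded.

```gretl
# Artificial data standing in for a loaded dataset
nulldata 100
set seed 42
series x = normal(3, 3)
series y = 2 + 0.5*x + normal()

summary          # every series in the dataset
summary x        # only x
summary x y      # a chosen subset
```

With a real data file you would replace the first four lines with an `open` command for that file.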

Gretl computes the sample mean, median, minimum, maximum, standard deviation (S.D.), coefficient of variation (C.V.), skewness and excess kurtosis for each variable in the data set. You may recall from your introductory statistics courses that there are an equal number of observations in your sample that are larger and smaller in value than the median. The standard deviation is the square root of your sample variance. The coefficient of variation is simply the standard deviation divided by the sample mean. Large values of the C.V. indicate that the variable is widely dispersed relative to its mean, so the mean is not very precisely measured. Skewness is a measure of the degree of symmetry of a distribution. If the left tail (the tail at the small end of the distribution) extends over a relatively larger range of the variable than the right tail, the distribution is negatively skewed. If the right tail covers a larger range of values then it is positively skewed. Normal and t-distributions are symmetric and have zero skewness. The $\chi^2$ distribution is positively skewed. Kurtosis is based on the fourth sample moment about the mean of the distribution. 'Excess' means that it is measured relative to the kurtosis of the normal distribution, which is equal to three. Therefore if the number reported by gretl is positive, the kurtosis is greater than that of the normal; this is usually taken to mean that the distribution is more peaked around the mean than the normal. If excess kurtosis is negative, the distribution is flatter than the normal.

Figure B.2: Choosing summary statistics from the pull-down menu yields these results.

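| Sample statistic | Formula |
|---|---|
| Mean | $\sum x_i / n = \bar{x}$ |
| Variance | $\frac{1}{n-1}\sum (x_i - \bar{x})^2 = s_x^2$ |
| Standard deviation | $s = \sqrt{s_x^2}$ |
| Coefficient of variation | $s / \bar{x}$ |
| Skewness | $\frac{1}{n-1}\sum (x_i - \bar{x})^3 / s^3$ |
| Excess kurtosis | $\frac{1}{n-1}\sum (x_i - \bar{x})^4 / s^4 - 3$ |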
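The formulas in the table can be evaluated by hand in a script, which is a useful way to see what each statistic measures. The sketch below applies them to simulated chi-square draws, so positive skewness is expected. The data, seed, and variable names are illustrative assumptions. The numbers printed by `summary` may differ slightly from these, since software packages do not all use the same divisor for the third and fourth moments.

```gretl
nulldata 200
set seed 99
series x = randgen(X, 4)      # chi-square draws with 4 degrees of freedom

scalar n     = nobs(x)
scalar xbar  = mean(x)
scalar s     = sd(x)                                   # square root of the sample variance
scalar cv    = s/xbar                                  # coefficient of variation
scalar skew  = sum((x - xbar)^3)/((n - 1)*s^3)         # skewness, as in the table
scalar xkurt = sum((x - xbar)^4)/((n - 1)*s^4) - 3     # excess kurtosis, as in the table

printf "mean = %.4f, s.d. = %.4f, C.V. = %.4f\n", xbar, s, cv
printf "skewness = %.4f, excess kurtosis = %.4f\n", skew, xkurt

summary x                     # gretl's own report, for comparison
```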

You can also use gretl to obtain tail probabilities for various distributions. For example, if $X \sim N(3, 9)$ then $P(X > 4)$ is

$$P[X > 4] = P[Z > (4 - 3)/\sqrt{9}] = P[Z > 0.333] = 0.3694 \tag{B.2}$$

To obtain this probability, you can use Tools>P-value finder from the pull-down menu. Then give gretl the value of X, the mean of the distribution, and its standard deviation using the dialog box shown in Figure B.3. The result appears in Figure B.4. Gretl uses the mean and standard deviation to convert the normal to a standard normal (i.e., a z-score).

Figure B.3: Dialog box for finding right hand side tail areas of various probability distributions.

Figure B.4: Results from the p-value finder of P[X > 4] where X ~ N(3, 9). Note, the area in the tail of this distribution to the right of 4 is 0.369441.

As with nearly everything in gretl, you can use a script to do this as well. First, convert 4 from the $X \sim N(3, 9)$ scale to the standard normal, $Z \sim N(0, 1)$. That means subtracting its mean, 3, and dividing by its standard deviation, $\sqrt{9}$. The result is a scalar, so open a script window and type:

```gretl
scalar z1 = (4-3)/sqrt(9)
```

Then use the cdf function to compute the tail probability of z1. For the normal cdf this is

```gretl
scalar c1 = 1-cdf(z, z1)
```

The first argument of the cdf function, z, identifies the probability distribution and the second, z1, the number up to which you want to integrate. So in this case you are integrating the standard normal density from minus infinity to z1 = 0.333. You want the other tail (remember, you want the probability that X is greater than 4, i.e., that Z is greater than 0.333), so subtract this value from 1.
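Gretl also has a `pvalue` function that returns the right-tail area directly, which gives a quick check on the subtraction from 1. The sketch below computes the tail probability both ways; the names `p_cdf` and `p_pv` are illustrative.

```gretl
scalar z1    = (4 - 3)/sqrt(9)       # standardize 4 under N(3, 9)
scalar p_cdf = 1 - cdf(z, z1)        # right tail as 1 minus the left-tail area
scalar p_pv  = pvalue(z, z1)         # right tail computed directly

printf "P(Z > %.4f): 1 - cdf = %.6f, pvalue = %.6f\n", z1, p_cdf, p_pv
```

Both numbers should match the value reported by the p-value finder in Figure B.4.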

Your book gives another example: if $X \sim N(3, 9)$, find $P(4 < X < 6)$, which is

$$P[4 < X < 6] = P[0.333 < Z < 1] = P[Z < 1] - P[Z < 0.333] \tag{B.3}$$

Take advantage of the fact that $P[Z < z] = 1 - P[Z > z]$ and use the p-value finder to obtain:

$$(1 - 0.1587) - (1 - 0.3694) = 0.3694 - 0.1587 = 0.2107 \tag{B.4}$$

Note, this value differs slightly from the one given in your book due to the rounding error that occurs from using the normal probability table. When using the table, the z-value $(4 - 3)/\sqrt{9} = 1/3$ was truncated to 0.33, giving P[Z < 0.33]; this is because your tables are only taken out to two decimal places, and a practical decision was made by the authors of your book to forgo interpolation (contrary to what your Intro to Statistics professor may have told you, it is hardly ever worth the effort to interpolate when you have to do it manually). Gretl, on the other hand, computes this probability out to machine precision as P[Z < 1/3]. Hence, a discrepancy occurs. Rest assured, though, that these results are, aside from rounding error, the same.

Using the cdf function makes this simple and accurate. The script is

```gretl
scalar z1 = (4-3)/sqrt(9)
scalar z2 = (6-3)/sqrt(9)
scalar c1 = cdf(z, z1)
scalar c2 = cdf(z, z2)
scalar area = c2-c1
```
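The size of the rounding discrepancy discussed above can be seen by computing the area twice, once with the exact z-value of 1/3 and once with the two-decimal value 0.33 that a printed table forces on you. This is a minimal sketch; the names `exact` and `tabled` are illustrative.

```gretl
scalar z_lo = (4 - 3)/sqrt(9)
scalar z_hi = (6 - 3)/sqrt(9)

scalar exact  = cdf(z, z_hi) - cdf(z, z_lo)   # uses z = 1/3 at full precision
scalar tabled = cdf(z, z_hi) - cdf(z, 0.33)   # uses z truncated to two decimals

printf "exact area           = %.4f\n", exact
printf "table-style area     = %.4f\n", tabled
printf "difference           = %.4f\n", tabled - exact
```

The difference shows up in the third decimal place, which is consistent with treating the two answers as the same up to rounding.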

Gretl has a handy feature that allows you to plot probability distributions. If you've ever wondered what a Weibull(10, 0.4) looks like, then this is the utility you have waited for. From the main menu choose Tools>Distribution graphs. A dialog box will appear.

You can plot normal, $t$, $\chi^2$, $F$, binomial, Poisson, and Weibull probability density functions. Fill in the desired parameters and click OK. For the normal, you can also tell gretl whether you want the pdf or the cdf. This utility is closely related to another that allows you to plot a curve. The curve plotting dialog is also found in the Tools menu.

The dialog box allows you to specify the range of the graph as well as the formula, which must be a function of x. Once the graph is plotted you can edit it in the usual way and add additional formulae and lines as you wish. Please note that gnuplot uses ** for exponentiation (raising to a power).
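A similar picture can be produced from a script. The following is a minimal sketch and not the method used by the Distribution graphs dialog: it evaluates the N(3, 9) density on a grid with the `pdf` function and hands the result to the `gnuplot` command. The series names `x` and `fx`, the grid limits, and the output file name are all illustrative choices.

```gretl
nulldata 201
# Grid of 201 points from -6 to 12 (the mean plus and minus three standard deviations)
series x  = -6 + 18*(index - 1)/200
# Density of N(3, 9): standard normal density at (x - 3)/3, rescaled by 1/3
series fx = pdf(z, (x - 3)/3)/3

gnuplot fx x --with-lines --output=normal_3_9.png
```

When running inside the gretl GUI, replacing the file name with `display` in the `--output` option should show the graph on screen instead of writing a file.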