Some Statistical Concepts

The hip data are used to illustrate computations for some simple statistics in your text.

C.1 Summary Statistics

Using a script or operating from the console, open the hip data, hip.gdt, and issue the summary command. This yields the results shown in Table C.1.

Summary Statistics, using the observations 1-50
for the variable 'y' (50 valid observations)

Mean    Median    Minimum    Maximum

Standard deviation    C.V.    Skewness    Ex. kurtosis

Table C.1: Summary statistics from the hip data

This gives you the mean, median, minimum, maximum, standard deviation, coefficient of variation, skewness and excess kurtosis of your variable(s). Once the data are loaded, you can use gretl's language to generate these as well. For instance, scalar y_bar = mean(y) yields the mean of the variable y. To obtain the sample variance use scalar y_var = sum((y-y_bar)^2)/($nobs-1). The script below can be used to compute other summary statistics as discussed in your text.

open "@gretldir\data\poe\hip.gdt"
summary
# sample mean, sample variance (divisor n-1), std deviation, std error of the mean
scalar y_bar = mean(y)
scalar y_var = sum((y-y_bar)^2)/($nobs-1)
scalar y_se = sqrt(y_var)
scalar se_ybar = sqrt(y_var/$nobs)

# second, third and fourth central moments (divisor n)
scalar mu2 = sum((y-y_bar)^2)/($nobs)
scalar mu3 = sum((y-mean(y))^3)/($nobs)
scalar mu4 = sum((y-mean(y))^4)/($nobs)
printf "\n mean = %5.4f\n sample variance = %5.4f\n sample std deviation = %5.4f\n",y_bar, y_var, y_se
printf "\n mu2 = %5.4f\n mu3 = %5.4f\n mu4 = %5.4f\n",mu2,mu3,mu4
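As a quick sanity check, the hand-computed variance and standard deviation can be compared with gretl's built-in var() and sd() functions, which also use the n-1 divisor. The sketch below is only illustrative: it assumes a synthetic series drawn with normal() in place of the actual hip data, and the seed value is arbitrary.

```hansl
# Synthetic stand-in for the hip series (assumption: NOT the real hip.gdt values)
nulldata 50
set seed 12345
series y = 17.158 + sqrt(3.265)*normal()

# Hand-computed sample variance and standard deviation (divisor n-1)
scalar y_bar = mean(y)
scalar y_var = sum((y - y_bar)^2)/($nobs - 1)
scalar y_sd = sqrt(y_var)

# Built-in equivalents
scalar v_builtin = var(y)
scalar s_builtin = sd(y)

printf "variance: manual = %.6f, var() = %.6f\n", y_var, v_builtin
printf "std dev:  manual = %.6f, sd()  = %.6f\n", y_sd, s_builtin
```

The two columns should agree to the printed precision. With the real data, replace the first three commands with the open command for hip.gdt.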

Then, to estimate skewness, $S = \tilde{\mu}_3/\tilde{\sigma}^3$, and excess kurtosis, $K = \tilde{\mu}_4/\tilde{\sigma}^4 - 3$:

# sig_tild is the standard deviation based on the divisor-n second moment
scalar sig_tild = sqrt(mu2)
scalar skew = mu3/sig_tild^3
scalar ex_kurt = mu4/sig_tild^4 - 3
printf "\n std dev. of the mean = %5.4f\n skewness = %5.4f\n excess kurtosis = %5.4f\n",se_ybar, skew, ex_kurt

Note that gretl's built-in summary command reports excess kurtosis. The normal distribution has a theoretical kurtosis equal to 3, and the excess is measured relative to that. Hence, excess kurtosis = $\tilde{\mu}_4/\tilde{\sigma}^4 - 3$.
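For reference, the quantities computed in the scripts can be written out as follows. This is a restatement of what the code does, with $N$ standing for $nobs and $\tilde{\mu}_r$ for the script variables mu2, mu3 and mu4; the notation is an assumption chosen to match the variable names, and your text may use slightly different symbols.

```latex
\tilde{\mu}_r = \frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})^r, \quad r = 2, 3, 4,
\qquad
\tilde{\sigma} = \sqrt{\tilde{\mu}_2}

S = \frac{\tilde{\mu}_3}{\tilde{\sigma}^3},
\qquad
K = \frac{\tilde{\mu}_4}{\tilde{\sigma}^4} - 3
```

Note that $\tilde{\sigma}$ (sig_tild) uses the divisor $N$, whereas the sample standard deviation y_se uses $N-1$.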

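Suppose hip size in inches is normally distributed, $Y \sim N(\mu, \sigma^2)$. Based on our estimates, $Y \sim N(17.158, 3.265)$. The percentage of customers having hips greater than 18 inches can be estimated using

$$P(Y > 18) = P\left(\frac{Y - \mu}{\sigma} > \frac{18 - \mu}{\sigma}\right) \qquad \text{(C.1)}$$

Replacing $\mu$ and $\sigma$ by their estimates yields: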

# z-score for a hip width of 18 inches
scalar zs = (18 - mean(y))/sd(y)
pvalue z zs

The last line computes the p-value associated with the z-score, that is, the area to the right of zs. The pvalue command requests that a p-value be returned; its first argument (z) indicates the distribution to be used (in this case, z indicates the standard normal), and its final argument (zs) is the statistic itself, which is computed in the previous line. The result is 0.3207, indicating that about 32% of the population would not fit into a seat that is 18 inches wide.
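The same tail probability can be stored in a scalar rather than just printed. The sketch below uses the function form pvalue() and, as a cross-check, one minus the standard normal CDF cnorm(). It plugs in the estimates quoted in the text (mean 17.158, variance 3.265) as constants, so no dataset is needed; the variable names m, s, p_tail and p_cdf are illustrative.

```hansl
# Estimates reported in the text: mean 17.158, variance 3.265
scalar m = 17.158
scalar s = sqrt(3.265)
scalar zs = (18 - m)/s

# Right-tail area P(Z > zs), computed two ways
# function form of the pvalue command
scalar p_tail = pvalue(z, zs)
# one minus the standard normal CDF
scalar p_cdf = 1 - cnorm(zs)

printf "z = %.4f, pvalue() = %.4f, 1 - cnorm() = %.4f\n", zs, p_tail, p_cdf
```

Both values should be close to the figure of roughly 0.32 reported above. Small differences in the last digits can arise because the constants are rounded.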

How large would a seat have to be to fit 95% of the population? Find $y^*$ to satisfy

$$P(Y \le y^*) = P\left(Z \le \frac{y^* - \bar{y}}{\hat{\sigma}}\right) = 0.95$$

In gretl you need to find the value of $Z = (y^* - \bar{y})/\hat{\sigma}$ that satisfies the probability. The invcdf function does this. Since Z is standard normal:

# 95th percentile of the standard normal, then convert back to inches
scalar zz = invcdf(n,.95)
scalar ystar = sd(y)*zz+mean(y)
print ystar

The seat width is estimated to be 20.13 inches.
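The arithmetic behind this figure can be checked by hand. The worked step below uses the rounded estimates from the text (mean 17.158, variance 3.265) and the standard normal 95th percentile of about 1.645, so it is approximate.

```latex
\hat{\sigma} = \sqrt{3.265} \approx 1.807

y^* = \bar{y} + z_{0.95}\,\hat{\sigma}
    \approx 17.158 + 1.645 \times 1.807
    \approx 20.13
```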
