# Goodness-of-Fit

Other important output is included in Table 5.1. For instance, you’ll find the sum of squared errors (SSE), which gretl refers to as “Sum squared resid.” In this model SSE = 1718.94. To obtain the estimated variance, $\hat{\sigma}^2$, divide SSE by the available degrees of freedom (here N = 75 observations less K = 3 estimated coefficients) to obtain

$$\hat{\sigma}^2 = \frac{SSE}{N-K} = \frac{1718.94}{75-3} = 23.874$$

The square root of this number is referred to by gretl as the “S. E. of regression” and is reported to be 4.88612. Gretl also reports $R^2$ in this table. If you want to compute your own versions of these statistics using the total sum of squares from the model, you’ll have to use Analysis>ANOVA from the model’s pull-down menu to generate the ANOVA table. Refer to section 4.2 for details.

To compute your own versions from the standard gretl output, recall that

$$\hat{\sigma}_y = \sqrt{\frac{SST}{N-1}}$$

The statistic $\hat{\sigma}_y$ is printed by gretl and referred to as “S. D. of dependent variable,” which is reported to be 6.48854. A little algebra reveals

$$SST = (N-1)\,\hat{\sigma}_y^2 = 74 \times 6.48854^2 = 3115.485$$
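As a quick worked check, assuming the usual definition $R^2 = 1 - SSE/SST$ (which the text above does not restate), the SSE and SST values quoted so far give approximately:

$$R^2 = 1 - \frac{SSE}{SST} = 1 - \frac{1718.94}{3115.485} \approx 0.4483$$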

Otherwise, the goodness-of-fit statistics printed in the gretl regression output or the ANOVA table are perfectly acceptable.
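The hand calculations above can also be sketched in a short script. This is a minimal illustration, assuming the Big Andy data sit in the standard POE data directory and that the model is the one reported in Table 5.1. The scalar names (`sse`, `sig2`, `sst`, `r2`, `r2bar`) are illustrative choices and do not come from the original script.

```
# Sketch only: file location and scalar names are assumptions
open "@gretldir\data\poe\andy.gdt"
ols sales const price advert --quiet

scalar sse   = $ess                     # sum of squared residuals
scalar sig2  = sse/$df                  # estimated error variance
scalar sst   = ($nobs-1)*var(sales)     # total sum of squares from the sample variance
scalar r2    = 1 - sse/sst              # usual R-squared
scalar r2bar = 1 - (sse/$df)/(sst/($nobs-1))   # adjusted R-squared

# compare with the values gretl stores after estimation
printf "sigma2 = %.4f (gretl: %.4f)\n", sig2, $sigma^2
printf "R2     = %.4f (gretl: %.4f)\n", r2, $rsq
printf "adj R2 = %.4f\n", r2bar
```

The accessors `$ess`, `$df`, `$nobs`, `$sigma` and `$rsq` refer to the most recently estimated model, so the scalars must be computed before another model is run.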

Gretl also reports the adjusted $R^2$ in the standard regression output. The adjusted $R^2$ imposes a small penalty on the usual $R^2$ when a variable is added to the model. Adding a variable with any correlation to y always reduces SSE and increases the size of the usual $R^2$. With the adjusted version, the improvement in fit may be outweighed by the penalty, and the adjusted $R^2$ could become smaller as variables are added. The formula, where K is the number of estimated coefficients, is:

$$\bar{R}^2 = 1 - \frac{SSE/(N-K)}{SST/(N-1)}$$

This is sometimes referred to as “R-bar squared” (i.e., $\bar{R}^2$), although in gretl it is called “adjusted R-squared.” For Big Andy’s Burger Barn the adjusted R-squared is equal to 0.4329.
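As a worked check on the reported figure, take the standard definition $\bar{R}^2 = 1 - \frac{SSE/(N-K)}{SST/(N-1)}$ with the values quoted earlier (SSE = 1718.94, SST = 3115.485, N = 75) and assume K = 3, which is consistent with the 72 degrees of freedom used later:

$$\bar{R}^2 = 1 - \frac{1718.94/72}{3115.485/74} \approx 1 - \frac{23.874}{42.101} \approx 0.4329$$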

Once again the printf command gives us some additional control over the format of the output. At the end of line 3 you’ll find an extra `\`. This is gretl’s line continuation command. It tells gretl to continue reading the next line; in effect, it joins lines 3 and 4. The continuation command makes programs easier to print on a page. It looks slightly odd here since it immediately follows the line feed `\n`, but `\n\` actually consists of two commands: a line feed and a continuation.

In this case, the critical values for the $t_{72}$ distribution and the p-values for the two statistics can be easily obtained using the commands

```
scalar c=critical(t,$df,0.025)
pvalue t $df t1
pvalue t $df t2
```

These last three commands produce the output shown below:

```
Generated scalar c (ID 8) = 1.99346
t(72): area to the right of -7.21524 =~ 1
(to the left: 2.212e-010)
(two-tailed value = 4.424e-010; complement = 1)

t(72): area to the right of 1.26257 = 0.105408
(two-tailed value = 0.210817; complement = 0.789183)
```

It is interesting to note that when a negative t-ratio is used in the pvalue command, gretl returns the area to its right, the area to its left, and the two-tailed value (the sum of the two tail areas). So, for the alternative hypothesis that the coefficient on price is less than zero (against the null that it is zero), the p-value is the area to the left of the computed statistic, which in this example is essentially zero.

The pvalue command can also be used as a function to return a scalar that can be stored by gretl.

```
scalar c=critical(t,$df,0.025)
scalar p1=pvalue(t, $df, t1)
scalar p2=pvalue(t, $df, t2)
printf "\nThe .025 critical value from the t with %d degrees of freedom \
is %.3f.\n The pvalue from H0: b2=0 is %.3f and \
from H0: b3=1 is %.3f.\n",$df, c,p1,p2
```

This prints the following to the screen:

```
The .025 critical value from the t with 72 degrees of freedom is 1.993.
 The pvalue from H0: b2=0 is 1.000 and from H0: b3=1 is 0.105.
```
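The value of 1.000 reported for the price coefficient is the area to the right of a large negative t-ratio, so it is not the p-value for the left-tail alternative discussed earlier. One way to obtain the left-tail area directly is sketched below. It assumes the same data file and model as the rest of the section; the names `p_right` and `p_left` are illustrative and not part of the original script.

```
# Sketch only: data location and scalar names are assumptions
open "@gretldir\data\poe\andy.gdt"
ols sales const price advert --quiet
scalar t1 = ($coeff(price)-0)/$stderr(price)

scalar p_right = pvalue(t, $df, t1)   # area to the right of t1
scalar p_left  = cdf(t, $df, t1)      # area to the left of t1
printf "right tail = %.4g, left tail = %.4g\n", p_right, p_left
```

Because the two areas sum to one, `1 - pvalue(t, $df, t1)` is an equivalent way to get the left tail, although it loses precision when the left-tail area is extremely small.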

```
set echo off
open "@gretldir\data\poe\andy.gdt"
# Change the descriptive labels and graph labels
# (recent gretl versions also accept --description= and --graph-name=)
setinfo sales -d "Monthly Sales revenue ($1000)" -n "Monthly Sales ($1000)"
setinfo price -d "$ Price" -n "Price"
setinfo advert -d "Monthly Advertising Expenditure ($1000)" -n \
  "Monthly Advertising ($1000)"

# print the new labels to the screen
labels

# summary statistics
summary sales price advert
```

```
# confidence intervals
ols sales const price advert --vcv
scalar bL = $coeff(price) - critical(t,$df,0.025) * $stderr(price)
scalar bU = $coeff(price) + critical(t,$df,0.025) * $stderr(price)
printf "The lower = %.2f and upper = %.2f confidence limits", bL, bU
```

```
# linear combination of parameters
ols sales const price advert --vcv
scalar chg = -0.4*$coeff(price)+0.8*$coeff(advert)
scalar se_chg=sqrt((-0.4)^2*$vcv[2,2]+(0.8^2)*$vcv[3,3] \
  +2*(-0.4)*(0.8)*$vcv[2,3])
scalar lb = chg-critical(t,$df,.05)*se_chg
scalar ub = chg+critical(t,$df,.05)*se_chg
printf "\nExpected Change = %.4f and SE = %.4f\n",chg, se_chg
printf "\nThe 90%% confidence interval is [%.3f,%.3f]\n",lb, ub
```
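The standard error in the linear-combination block follows the usual variance rule for a weighted sum of two estimators. Writing $b_2$ and $b_3$ for the price and advert coefficients (the second and third rows of the covariance matrix, after the constant), the script's `se_chg` corresponds to the expression below. The symbol $\hat{\lambda}$ for the estimated change is notation introduced here for illustration.

$$\hat{\lambda} = -0.4\,b_2 + 0.8\,b_3$$

$$\widehat{\mathrm{var}}(\hat{\lambda}) = (-0.4)^2\,\widehat{\mathrm{var}}(b_2) + (0.8)^2\,\widehat{\mathrm{var}}(b_3) + 2(-0.4)(0.8)\,\widehat{\mathrm{cov}}(b_2, b_3)$$

$$\hat{\lambda} \pm t_{(0.95,\,N-K)}\;\mathrm{se}(\hat{\lambda}), \qquad \mathrm{se}(\hat{\lambda}) = \sqrt{\widehat{\mathrm{var}}(\hat{\lambda})}$$

The 0.05 passed to `critical` is the right-tail area, which is why the resulting interval has 90% coverage.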

```
# significance tests
ols sales const price advert
scalar t1 = ($coeff(price)-0)/$stderr(price)
scalar t2 = ($coeff(advert)-0)/$stderr(advert)
printf "\n The t-ratio for H0: b2=0 is = %.3f.\n\
 The t-ratio for H0: b3=0 is = %.3f.\n", t1, t2

pvalue t $df t1
scalar t3 = ($coeff(advert)-1)/$stderr(advert)
pvalue t $df t3
```

```
# t-test of linear combination
ols sales const price advert --vcv
scalar chg = -0.2*$coeff(price)-0.5*$coeff(advert)
scalar se_chg=sqrt((-0.2)^2*$vcv[2,2]+((-0.5)^2)*$vcv[3,3] \
  +2*(-0.2)*(-0.5)*$vcv[2,3])
scalar t_ratio = chg/se_chg
pvalue t $df t_ratio
```

```
# interaction creates nonlinearity
series a2 = advert*advert
ols sales const price advert a2 --vcv
scalar me1 = $coeff(advert)+2*(0.5)*$coeff(a2)
scalar me2 = $coeff(advert)+2*2*$coeff(a2)
printf "\nThe marginal effect at $500 (advert=.5) is %.3f and at $2000 is %.3f\n",me1,me2
```

```
# delta method for nonlinear hypotheses
ols sales const price advert a2 --vcv
matrix b = $coeff
matrix cov = $vcv
scalar lambda = (1-b[3])/(2*b[4])
scalar d3 = -1/(2*b[4])
scalar d4 = -1*(1-b[3])/(2*b[4]^2)
matrix d = { 0, 0, d3, d4}
scalar v = d*cov*d'
scalar se = sqrt(v)
scalar lb = lambda - critical(t,$df,.025)*se
scalar ub = lambda + critical(t,$df,.025)*se
printf "\nThe estimated optimal level of advertising is $%.2f.\n",1000*lambda
printf "\nThe 95%% confidence interval is ($%.2f, $%.2f).\n",1000*lb,1000*ub
```
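The delta-method block can be read against the algebra it implements. The sketch below assumes the quadratic model $sales = \beta_1 + \beta_2\,price + \beta_3\,advert + \beta_4\,advert^2 + e$, which matches the regressor order in the script. Setting the marginal effect of advertising, $\beta_3 + 2\beta_4\,advert$, equal to 1 and solving for advert gives the quantity stored in `lambda`:

$$\lambda = \frac{1-\beta_3}{2\beta_4}$$

The derivatives stored in `d3` and `d4` are

$$d_3 = \frac{\partial \lambda}{\partial \beta_3} = -\frac{1}{2\beta_4}, \qquad d_4 = \frac{\partial \lambda}{\partial \beta_4} = -\frac{1-\beta_3}{2\beta_4^2}$$

and the product `d*cov*d'` expands to the approximate variance

$$\widehat{\mathrm{var}}(\hat{\lambda}) \approx d_3^2\,\widehat{\mathrm{var}}(b_3) + d_4^2\,\widehat{\mathrm{var}}(b_4) + 2\,d_3 d_4\,\widehat{\mathrm{cov}}(b_3, b_4)$$

with the derivatives evaluated at the estimates. The zeros in the first two positions of `d` reflect that $\lambda$ does not depend on the constant or the price coefficient.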

```
# interaction and marginal effects
open "@gretldir\data\poe\pizza4.gdt"
series inc_age=income*age
ols pizza const age income inc_age
scalar me1 = $coeff(age)+$coeff(inc_age)*25
scalar me2 = $coeff(age)+$coeff(inc_age)*90
printf "\nThe marginal effect of age for someone with $25,000/year income is %.2f.\n",me1
printf "\nThe marginal effect of age for someone with $90,000/year income is %.2f.\n",me2

open "@gretldir\data\poe\cps4_small.gdt"
logs wage
series ed_exp=educ*exper
ols l_wage const educ exper ed_exp
scalar me8 = $coeff(exper)+8*$coeff(ed_exp)
scalar me16 = $coeff(exper)+16*$coeff(ed_exp)
printf "\nThe marginal effect of exper for someone with 8 years of schooling is %.3f%%.\n",100*me8
printf "\nThe marginal effect of exper for someone with 16 years of schooling is %.3f%%.\n",100*me16
```
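The marginal effects computed in the last two models follow from differentiating a regression that contains an interaction term. As a sketch, with coefficients numbered in the order the regressors appear in each `ols` command (a labeling assumed here for illustration):

$$\frac{\partial E(pizza)}{\partial age} = \beta_2 + \beta_4\,income$$

$$\frac{\partial E[\ln(wage)]}{\partial exper} = \beta_3 + \beta_4\,educ$$

In the pizza model the script evaluates the first expression at income values of 25 and 90, which the printed messages describe as $25,000 and $90,000 per year. In the wage model the dependent variable is in logs, so the script multiplies the effect by 100 to report an approximate percentage change.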
