# Coefficient of Determination

One use of regression analysis is to “explain” variation in the dependent variable as a function of the independent variable. A summary statistic used for this purpose is the coefficient of determination, also known as $R^2$.

There are several ways to obtain $R^2$ in gretl. The simplest is to read it directly from gretl’s regression output, as shown in Figure 4.3. Another way, and probably the most difficult, is to compute it manually using the analysis of variance (ANOVA) table. The ANOVA table can be produced after a regression by choosing Analysis>ANOVA from the model window’s pull-down menu, as shown in Figure 4.1. Alternatively, use the --anova option with ols to produce the table from the console or as part of a script.

ols food_exp const income --anova

The result appears in Figure 4.2.

In the ANOVA table featured in Figure 4.2 the SSR, SSE, and SST can be found. Gretl also does the $R^2$ computation for you, as shown at the bottom of the output. If you want to verify gretl’s computation, then

$$\text{SST} = \text{SSR} + \text{SSE} = 190627 + 304505 = 495132 \tag{4.4}$$

so that $R^2 = \text{SSR}/\text{SST} = 190627/495132 \approx 0.385$.

Figure 4.1: After estimating the regression, select Analysis>ANOVA from the model window’s pull-down menu.

Figure 4.2: The ANOVA table

Different authors refer to the regression sum of squares, residual sum of squares, and total sum of squares by different acronyms, so it pays to be careful when computing $R^2$ manually. POE4 refers to the regression sum of squares as SSR and the residual sum of squares as SSE (sum of squared errors).
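The manual calculation can also be scripted. The following is a minimal sketch, not a definitive implementation: it assumes the food expenditure data are stored in a file named food.gdt (the file name is an assumption) containing the series food_exp and income, and the scalar names SSE, SST, SSR, and R2 are chosen here only to mirror POE4's acronyms. It takes the residual sum of squares from gretl's $ess accessor and compares the hand-computed ratio with $rsq.

```hansl
# Assumption: the data file is named food.gdt and holds food_exp and income
open food.gdt
ols food_exp const income --anova

# POE4 naming: SSE = residual sum of squares, SSR = regression sum of squares
scalar SSE = $ess
scalar SST = sum((food_exp - mean(food_exp))^2)
scalar SSR = SST - SSE
scalar R2 = SSR / SST

printf "SSR = %.0f, SSE = %.0f, SST = %.0f\n", SSR, SSE, SST
printf "R-squared: manual = %.4f, gretl = %.4f\n", R2, $rsq
```

If the two values printed on the last line agree, the manual computation matches gretl's.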

Finally, you can think of $R^2$ as the squared correlation between the observations on your dependent variable, food_exp, and the predicted values from your estimated model, $\widehat{food\_exp}$. A gretl script to compute this version of the statistic is found below in section 4.5.4.

To use the GUI, follow these steps. Estimate the model (equation 2.1) using least squares and add the predicted values from the estimated model, $\widehat{food\_exp}$, to your data set. Then use the gretl correlation matrix to obtain the correlation between food_exp and $\widehat{food\_exp}$.

Adding the fitted values to the data set from the pull-down menu in the model window is illustrated in Figure 4.4 below. Highlight the variables food_exp, income, and yhat1 by holding the control key down and mouse-clicking on each variable in the main gretl window, as seen in Figure 4.5 below. Then View>Correlation Matrix will produce the pairwise correlations between the variables you’ve chosen. These are arranged in a matrix as shown in Figure 4.6.

Figure 4.4: Using the pull-down menu in the Model window to add fitted values to your data set.

Notice that the correlation between food_exp and income is the same as that between food_exp and $\widehat{food\_exp}$ (i.e., 0.6205). As shown in your text, this is no coincidence in the simple linear regression model. Also, squaring this number gives the $R^2$ from your regression: $0.6205^2 = 0.385$.
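The GUI steps above have a script counterpart. The sketch below is only illustrative: it assumes the data file is named food.gdt (an assumption) and saves the fitted values under the name yhat1 to match the series highlighted in the main window. The corr command then prints the matrix of pairwise correlations for the listed series.

```hansl
# Assumption: the data file is named food.gdt and holds food_exp and income
open food.gdt
ols food_exp const income

# Save the fitted values as a series in the data set
series yhat1 = $yhat

# Print the pairwise correlation matrix for the three series
corr food_exp income yhat1
```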

You can generate pairwise correlations from the console using

c1 = corr(food_exp, $yhat)

In yet another example of the ease of using gretl, the usual scalar or genr is not used before c1. Gretl correctly identifies that the result is a scalar, so you can safely omit the command. In longer scripts, however, it’s generally a good idea to tell gretl what type of result you intend to compute; if the result doesn’t match that type, you’ll get an error message.
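A version with the types declared explicitly might look like the sketch below. It assumes the data file is named food.gdt (an assumption), and the names yhat1, c1, and r2_corr are illustrative. It squares the correlation and prints it next to the $rsq accessor so the two can be compared.

```hansl
# Assumption: the data file is named food.gdt and holds food_exp and income
open food.gdt
ols food_exp const income

# Explicit type keywords state what each result is meant to be
series yhat1 = $yhat
scalar c1 = corr(food_exp, yhat1)
scalar r2_corr = c1^2

printf "corr = %.4f, squared = %.4f, R-squared = %.4f\n", c1, r2_corr, $rsq
```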