# Further Inference in the Multiple Regression Model

In this chapter several extensions of the multiple linear regression model are considered. First, we test joint hypotheses about parameters in a model and then learn how to impose linear restrictions on the parameters. A condition called collinearity is also explored.

## 6.1 F-test

An F-statistic can be used to test multiple hypotheses in a linear regression model. In linear regression there are several different ways to derive and compute this statistic, but each yields the same result. The one used here compares the sum of squared errors (SSE) from a regression estimated under the null hypothesis (H0) to the SSE of a model estimated under the alternative (H1). If the sums of squared errors from the two models are similar, then there is not enough evidence to reject the restrictions. On the other hand, if imposing the restrictions implied by H0 alters SSE substantially, then the restrictions don't fit the data and we reject them.

In the Big Andy’s Burger Barn example we estimated the model

sales = β1 + β2 price + β3 advert + β4 advert² + e    (6.1)

Suppose we wish to test the hypothesis that advertising has no effect on average sales against the alternative that it does. Thus, H0: β3 = β4 = 0 and H1: β3 ≠ 0 or β4 ≠ 0. Another way to express this is in terms of the model each hypothesis implies.

H0: sales = β1 + β2 price + e

H1: sales = β1 + β2 price + β3 advert + β4 advert² + e

The model under H0 is restricted compared to the model under H1, since in it β3 = 0 and β4 = 0. The F-statistic used to test H0 versus H1 estimates each model by least squares and compares their respective sums of squared errors using the statistic:

F = [(SSE_R − SSE_U)/J] / [SSE_U/(N − K)] ~ F(J, N−K)    (6.2)

The sum of squared errors from the unrestricted model (H1) is denoted SSE_U and that of the restricted model (H0) is SSE_R. The numerator is divided by the number of hypotheses being tested, J; in this case that is 2, since there are two restrictions implied by H0. The denominator is divided by the total number of degrees of freedom in the unrestricted regression, N − K, where N is the sample size and K is the number of parameters in the unrestricted regression. When the errors of your model are (1) independently and identically distributed (iid) normal with zero mean and constant variance (e_t ~ iid N(0, σ²)) and (2) H0 is true, this statistic has an F distribution with J numerator and N − K denominator degrees of freedom. Choose a significance level and compute the statistic; then compare its value to the appropriate critical value from the F table, or compare its p-value to the chosen significance level.
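As a quick illustration of how the statistic is computed, here is a sketch using made-up numbers; the values below are purely illustrative and are not output from the Big Andy regression:

```
# purely illustrative numbers, not actual regression output
scalar sser = 2000      # SSE from the restricted model
scalar sseu = 1500      # SSE from the unrestricted model
scalar J = 2            # number of restrictions
scalar df = 71          # N - K in the unrestricted model
scalar Fstat = ((sser - sseu)/J)/(sseu/df)   # = 250/21.13, about 11.83
pvalue F J df Fstat
```

The same arithmetic, with the actual SSEs, is what the script below performs.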

The script to estimate the models under H0 and H1 and to compute the test statistic is given below.

```
1 open "@gretldir\data\poe\andy.gdt"
2 square advert
3 ols sales const price advert sq_advert
4 scalar sseu = $ess
5 scalar unrest_df = $df
6 ols sales const price
7 scalar sser = $ess
8 scalar Fstat = ((sser-sseu)/2)/(sseu/unrest_df)
9 pvalue F 2 unrest_df Fstat
```

The first thing to notice is that a gretl function is used to create advert². In line 2 the square command squares any variable or variables that follow it. In doing so, the string sq_ is attached as a prefix to the original variable name, so that squared advertising (advert²) becomes sq_advert.

Gretl refers to the sum of squared residuals (SSE) as the "error sum of squares," and it is retrieved from the regression results using the accessor $ess (i.e., in line 4, scalar sseu = $ess). The accessor $ess points to the error sum of squares computed in the regression that precedes it. You'll also want to save the degrees of freedom of the unrestricted model so that you can use them in the computation of the p-value for the F-statistic. In this case, the F-statistic has 2 known parameters (J=2 and N − K = unrest_df) that are used as arguments in the pvalue command.

There are a number of other ways within gretl to do this test. These are available through scripts, but it may be useful to demonstrate how to access them through the GUI. First, estimate the model using least squares. From the pull-down menu (see Figure 5.1) select Model>Ordinary Least Squares, specify the unrestricted model (Figure 5.2), and run the regression. This yields the result shown in Figure 6.1.

You'll notice that along the menu bar at the top of this window there are a number of options available to you. Choose Tests and the pull-down menu shown in Figure 6.2 will be revealed. The first four options in Figure 6.2 are highlighted, and these are the ones most pertinent to the discussion here. This menu provides an easy way to omit variables from the model, add variables to the model, test a sum of your coefficients, or test arbitrary linear restrictions on the parameters of your model.

[Figure 6.2: Choosing Tests from the pull-down menu of the model window reveals several testing options.]

Since this test involves imposing zero restrictions on the coefficients of advert and sq_advert, we can use the Omit variables option. This brings up the dialog box shown in Figure 6.3. Notice the two radio buttons at the bottom of the window. The first is labeled Estimate reduced model, and this is the one you want to use to compute equation (6.2). If you select the other, no harm is done: it is computed in a different way, but produces the same answer in a linear model. The only advantage of the Wald test (the second option) is that the restricted model does not have to be estimated in order to perform the test. Consequently, when you use the --wald option, the restricted model is not estimated, and the unrestricted model remains in gretl's memory, where its statistics can be accessed.

Select the variables advert and sq_advert and click OK to reveal the result shown in Figure 6.4. The interesting thing about this option is that it mimics your manual calculation of the F-statistic from the script: it computes the sums of squared errors in the unrestricted and restricted models and evaluates equation (6.2) from those regressions. Most pieces of software choose the alternative method (Wald) to compute the test, but you get the same result.1
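The same test can also be run from a script with gretl's omit command, which drops the listed regressors from the model just estimated and reports the corresponding F-test; a minimal sketch:

```
ols sales const price advert sq_advert
omit advert sq_advert            # re-estimates the reduced model and reports the F-test
# omit advert sq_advert --wald   # Wald version: the reduced model is not estimated
```

With --wald, the unrestricted model remains the active model in gretl's memory, as noted above.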

You can also use the Linear restrictions option from the pull-down menu shown in Figure 6.2. This produces a large dialog box that requires a bit of explanation; the box appears in Figure 6.5. The restrictions you want to impose (or test) are entered here. Each restriction in the set should be expressed as an equation, with a linear combination of parameters on the left and a numeric value to the right of the equals sign. Parameters are referenced in the form b[variable number], where variable number represents the position of the regressor in the equation, starting with 1. This means that β3 is equivalent to b[3]. Restricting β3 = 0 is done by issuing b[3]=0, and setting β4 = 0 by b[4]=0 in this dialog. Sometimes you'll want to use a restriction that involves a

1 Of course, if you had used the --robust option with ols, then a Wald calculation is done. This is discussed in chapter 8.

multiple of a parameter, e.g., 3β3 = 2. The basic principle is to place the multiplier first, then the parameter, using * to multiply. So, in this case the restriction in gretl becomes 3*b[3] = 2.

When you use the console or a script instead of the pull-down menu to impose restrictions, you’ll have to tell gretl where the restrictions start and end. The restrictions start with a restrict statement and end with end restrict. The statement will look like this:
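For the two zero restrictions in this example, a sketch of such a block, using the b[number] notation described above:

```
ols sales const price advert sq_advert
restrict
    b[3] = 0
    b[4] = 0
end restrict
```

Run after the unrestricted regression, the restrict block tests the listed restrictions jointly and reports the test statistic.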