Testing Equivalence of Two Regions
The question arises: is the wage equation different for the south than for the rest of the country? There are two ways to test this in gretl. One is very easy; the other is not so easy, but it makes for a useful example of how to use loops to create interactions among variables.
A Chow test is used to test for structural breaks or changes in a regression, that is, for whether one subsample has a different intercept and slopes than another. It can be used to detect structural breaks in time-series models or to determine whether, in our case, the south’s wages are determined differently from those in the rest of the country. The easy method uses gretl’s built-in chow command to test for a change in the regression. It must follow a regression, and you must specify the indicator variable that identifies the two subsets.
To illustrate its use, consider the basic wage model
$$wage = \beta_1 + \beta_2\,educ + \delta_1\,black + \delta_2\,female + \gamma\,(black \times female) + e$$
Now, if wages are determined differently in the south, then the slopes and intercept for southerners will be different. The null hypothesis is that the coefficients of the two subsets are equal and the alternative is that they are not. The gretl commands to perform the test are:
1 list x = const educ black female blk_fem
2 ols wage x
3 chow south --dummy
Since the regressors are going to be used again below, I put them into a list to simplify things later. Line 2 estimates the model using least squares. Line 3 contains the test command. It is initiated by chow followed by the indicator variable that is used to define the subsets, in this case south. The --dummy option tells gretl that south is an indicator rather than a break point in the sample. When the --dummy option is used, chow tests the null hypothesis of structural homogeneity with respect to that dummy. Essentially, gretl creates interaction terms between the indicator and each of the regressors and adds them to the model. We will replicate this below in a script.
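It may help to write out the augmented model that the test is based on. The following is a sketch in which the coefficients \theta_1, ..., \theta_5 on south and its interactions are notation introduced here for illustration; the remaining symbols are those of the basic wage model.

```latex
\[
\begin{aligned}
wage ={}& \beta_1 + \beta_2\,educ + \delta_1\,black + \delta_2\,female
          + \gamma\,(black \times female) \\
        &+ \theta_1\,south + \theta_2\,(south \times educ)
          + \theta_3\,(south \times black) \\
        &+ \theta_4\,(south \times female)
          + \theta_5\,(south \times black \times female) + e
\end{aligned}
\]
\[
H_0:\ \theta_1 = \theta_2 = \theta_3 = \theta_4 = \theta_5 = 0
\qquad\text{versus}\qquad
H_1:\ \text{at least one } \theta_j \neq 0
\]
```

Under the null hypothesis the five added terms drop out and the model collapses to the basic wage equation, which is why the test has five numerator degrees of freedom.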
Figure 7.2: Click Tests>Chow test from a model window to reveal the dialog box for the Chow test. Select an indicator variable or a break point for the sample.
The dialog box to perform the Chow test is found in the model window. After estimating the regression via the GUI the model window appears. Click Tests>Chow test on its menu bar to open the dialog box in Figure 7.2. The results from the test appear below.
Augmented regression for Chow test
OLS, using observations 1-1000
Dependent variable: wage

              coefficient   std. error   t-ratio   p-value
  ---------------------------------------------------------------
  const        -6.60557     2.33663      -2.827    0.0048      ***
  educ          2.17255     0.166464     13.05     4.89e-036   ***
  black        -5.08936     2.64306      -1.926    0.0544      *
  female       -5.00508     0.899007     -5.567    3.33e-08    ***
  blk_fem       5.30557     3.49727       1.517    0.1296
  south         3.94391     4.04845       0.9742   0.3302
  so_educ      -0.308541    0.285734     -1.080    0.2805
  so_black      1.70440     3.63333       0.4691   0.6391
  so_female     0.901120    1.77266       0.5083   0.6113
  so_blk_fem   -2.93583     4.78765      -0.6132   0.5399

Mean dependent var    20.61566   S.D. dependent var    12.83472
Sum squared resid     129984.4   S.E. of regression    11.45851
R-squared             0.210135   Adjusted R-squared    0.202955
F(9, 990)             29.26437   P-value(F)            2.00e-45
Log-likelihood       -3852.646   Akaike criterion      7725.292
Schwarz criterion     7774.369   Hannan-Quinn          7743.944

Chow test for structural difference with respect to south
  F(5, 990) = 0.320278 with p-value 0.9009
Notice that the p-value associated with the test is 0.901. At any conventional significance level we fail to reject the null hypothesis, so there is insufficient evidence to conclude that wages are structurally different in the south.
The other way to do this uses a loop to manually construct the interactions. Though the chow command makes this unnecessary, it is a great exercise that demonstrates how to create more general interactions among variables. The variable south will be interacted with each variable in a list and then added to a new list. The script is:
1 list x = const educ black female blk_fem
2 list dx = null
3 loop foreach i x
4 series south_$i = south * $i
5 list dx = dx south_$i
6 endloop
The first line includes each of the variables in the model that are to be interacted with south. The statement list dx = null creates a new list called dx that is empty (i.e., = null). In line 3 a foreach loop is initiated using the index i, which steps through each element contained in the list x. Line 4 creates a new series named south_varname, constructed by multiplying south by the current variable in x. Line 5 appends this series to the list dx, and line 6 closes the loop.
To make it clear, let’s go through a couple of iterations of the loop:
i=1
   element 1 of x = const
   series south_const = south * const
   dx = dx south_const
   implies dx = null south_const
   so, dx = south_const
   loop ends, increment i
i=2
   element 2 of x = educ
   series south_educ = south * educ
   dx = dx south_educ
   so, dx = south_const south_educ
   loop ends, increment i
i=3
   element 3 of x = black
   series south_black = south * black
   dx = dx south_black
   so, dx = south_const south_educ south_black
   loop ends, increment i
i=4
   and so on…
The interactions are created and a series of regressions are estimated and put into a model table. The remaining script is:
1 ols wage x dx
2 scalar sseu = $ess
3 scalar dfu = $df
4 modeltab add
5
6 smpl south==1 --restrict
7 ols wage x
8 modeltab add
9 smpl full
10
11 smpl south==0 --restrict
12 ols wage x
13 modeltab add
14 smpl full
15 modeltab show
Notice that the smpl command is used to manipulate subsets. The sample is restricted to observations in the south in line 6, restored to the full sample in line 9, and then restricted to nonsouth observations in line 11. Also, the sum of squared errors and the degrees of freedom from the unrestricted model are saved in lines 2 and 3. These will be used to manually construct a Chow test below.
The model table appears below
OLS estimates
Dependent variable: wage

                     (1)          (2)          (3)

const              -6.606**     -2.662       -6.606**
                   (2.337)      (3.420)      (2.302)

educ                2.173**      1.864**      2.173**
                   (0.1665)     (0.2403)     (0.1640)

black              -5.089*      -3.385       -5.089*
                   (2.643)      (2.579)      (2.604)

female             -5.005**     -4.104**     -5.005**
                   (0.8990)     (1.581)      (0.8857)

blk_fem             5.306        2.370        5.306
                   (3.497)      (3.383)      (3.446)

south_const         3.944
                   (4.048)

south_educ         -0.3085
                   (0.2857)

south_black         1.704
                   (3.633)

south_female        0.9011
                   (1.773)

south_blk_fem      -2.936
                   (4.788)

Standard errors in parentheses
* indicates significance at the 10 percent level
** indicates significance at the 5 percent level
The first column contains the results from the model with all of the interactions. The second column is for workers in the south, and the third is for workers elsewhere.
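The link between the three columns can be made explicit. As a sketch, let \theta_1, ..., \theta_5 denote the coefficients on south_const through south_blk_fem in column (1); this notation is introduced here for illustration only. Setting south to 0 or 1 in the fully interacted model gives the two subsample equations:

```latex
\[
\begin{aligned}
south = 0:\quad wage &= \beta_1 + \beta_2\,educ + \delta_1\,black
   + \delta_2\,female + \gamma\,(black \times female) + e \\
south = 1:\quad wage &= (\beta_1 + \theta_1) + (\beta_2 + \theta_2)\,educ
   + (\delta_1 + \theta_3)\,black \\
  &\quad + (\delta_2 + \theta_4)\,female
   + (\gamma + \theta_5)\,(black \times female) + e
\end{aligned}
\]
```

This is why the first five coefficients in column (1) are repeated exactly in column (3), and why each coefficient in column (2) equals the corresponding column (1) coefficient plus its interaction term. The standard errors differ across columns because each subsample regression estimates its own error variance, while the interacted model uses a single pooled estimate.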
The code to perform the Chow test takes the sum of squared errors and degrees of freedom that were saved from the unrestricted estimation and combines them with the sum of squared errors from the restricted regression to compute an F-statistic.
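# restricted model: pooled sample, no interactions with south
smpl full
ols wage x
scalar sser = $ess
# F-statistic for the 5 restrictions, using sseu and dfu saved earlier
scalar fstat = ((sser-sseu)/5)/(sseu/dfu)
# p-value from the F(5, dfu) distribution
pvalue F 5 dfu fstat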
Be sure to restore the full sample before estimating the restricted model. The restricted regression pools observations from the entire country together and estimates them with common coefficients. It is restricted because the parameters are the same in both subsets.
F(5, 990): area to the right of 0.320278 = 0.900945 (to the left: 0.0990553)
These results match those from the built-in version of the test.
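For reference, the statistic computed by the script can be written as follows. The symbols SSE_R, SSE_U, J, N and K are notation assumed here; they correspond to sser, sseu, the 5 restrictions, the 1000 observations, and the 10 coefficients of the unrestricted model, so that N - K equals dfu = 990.

```latex
\[
F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N - K)}
  \;\sim\; F_{(J,\;N-K)} \quad \text{under } H_0,
\qquad J = 5,\quad N - K = 1000 - 10 = 990 .
\]
```

A small value of F, as here, means that imposing common coefficients on the two regions raises the sum of squared errors very little, which is consistent with the large p-value reported above.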