Heteroskedasticity
Violation of assumption 2 means that the disturbances have a varying variance, i.e., $E(u_i^2) = \sigma_i^2$ for $i = 1, 2, \ldots, n$. First, we study the effect of this violation on the OLS estimators. For the simple linear regression, $\hat\beta_{OLS}$ given in equation (5.2) is still unbiased and consistent because these properties depend upon assumptions 1 and 4, and not assumption 2. However, the variance of $\hat\beta_{OLS}$ is now different:

$$\text{var}(\hat\beta_{OLS}) = \text{var}\Big(\sum_{i=1}^n x_i u_i \Big/ \sum_{i=1}^n x_i^2\Big) = \sum_{i=1}^n x_i^2 \sigma_i^2 \Big/ \Big(\sum_{i=1}^n x_i^2\Big)^2 \tag{5.9}$$

where the second equality follows from assumption 3 and the fact that $\text{var}(u_i)$ is now $\sigma_i^2$. Note that if $\sigma_i^2 = \sigma^2$, this reverts back to $\sigma^2/\sum_{i=1}^n x_i^2$, the usual formula for $\text{var}(\hat\beta_{OLS})$ under homoskedasticity. Furthermore, one can show that $E(s^2)$ will involve all of the $\sigma_i^2$'s and not one common $\sigma^2$, see problem 1. This means that the regression package reporting $s^2/\sum_{i=1}^n x_i^2$ as the estimate of the variance of $\hat\beta_{OLS}$ is committing two errors. First, it is not using the right formula for the variance, i.e., equation (5.9). Second, it is using $s^2$ to estimate a common $\sigma^2$ when in fact the $\sigma_i^2$'s are different. The bias from using $s^2/\sum_{i=1}^n x_i^2$ as an estimate of $\text{var}(\hat\beta_{OLS})$ will depend upon the nature of the heteroskedasticity and the regressor. In fact, if $\sigma_i^2$ is positively related to $x_i^2$, one can show that $s^2/\sum_{i=1}^n x_i^2$ understates the true variance. Hence the t-statistic reported for $\beta = 0$ is overblown, and the confidence interval for $\beta$ is tighter than it is supposed to be, see problem 2. This means that the t-statistic in this case is biased towards rejecting $H_0$: $\beta = 0$, i.e., showing significance of the regression slope coefficient, when it may not be significant.
The OLS estimator of $\beta$ is linear, unbiased and consistent, but is it still BLUE? In order to answer this question, we note that the only violation we have is that $\text{var}(u_i) = \sigma_i^2$. Hence, if we divide $u_i$ by $\sigma_i/\sigma$, the resulting $u_i^* = \sigma u_i/\sigma_i$ will have a constant variance $\sigma^2$. It is easy to show that $u_i^*$ satisfies all the classical assumptions including homoskedasticity. The regression model becomes

$$\sigma Y_i/\sigma_i = \alpha\sigma/\sigma_i + \beta\sigma X_i/\sigma_i + u_i^* \tag{5.10}$$

and OLS on this model (5.10) is BLUE. The OLS normal equations on (5.10) are

$$\sum_{i=1}^n (Y_i/\sigma_i^2) = \alpha \sum_{i=1}^n (1/\sigma_i^2) + \beta \sum_{i=1}^n (X_i/\sigma_i^2)$$
$$\sum_{i=1}^n (Y_i X_i/\sigma_i^2) = \alpha \sum_{i=1}^n (X_i/\sigma_i^2) + \beta \sum_{i=1}^n (X_i^2/\sigma_i^2) \tag{5.11}$$
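Note that $\sigma^2$ drops out of these equations. Solving (5.11), see problem 3, one gets

$$\tilde\alpha = \Big[\sum_{i=1}^n (Y_i/\sigma_i^2) \Big/ \sum_{i=1}^n (1/\sigma_i^2)\Big] - \tilde\beta \Big[\sum_{i=1}^n (X_i/\sigma_i^2) \Big/ \sum_{i=1}^n (1/\sigma_i^2)\Big] = \bar Y^* - \tilde\beta \bar X^* \tag{5.12a}$$

with $\bar Y^* = \big[\sum_{i=1}^n (Y_i/\sigma_i^2) / \sum_{i=1}^n (1/\sigma_i^2)\big] = \sum_{i=1}^n w_i^* Y_i / \sum_{i=1}^n w_i^*$ and $\bar X^* = \big[\sum_{i=1}^n (X_i/\sigma_i^2) / \sum_{i=1}^n (1/\sigma_i^2)\big] = \sum_{i=1}^n w_i^* X_i / \sum_{i=1}^n w_i^*$, where $w_i^* = (1/\sigma_i^2)$. Similarly,

$$\tilde\beta = \frac{\big[\sum_{i=1}^n (1/\sigma_i^2)\big]\big[\sum_{i=1}^n (Y_i X_i/\sigma_i^2)\big] - \big[\sum_{i=1}^n (X_i/\sigma_i^2)\big]\big[\sum_{i=1}^n (Y_i/\sigma_i^2)\big]}{\big[\sum_{i=1}^n (X_i^2/\sigma_i^2)\big]\big[\sum_{i=1}^n (1/\sigma_i^2)\big] - \big[\sum_{i=1}^n (X_i/\sigma_i^2)\big]^2}$$
$$= \frac{\big(\sum_{i=1}^n w_i^*\big)\big(\sum_{i=1}^n w_i^* X_i Y_i\big) - \big(\sum_{i=1}^n w_i^* X_i\big)\big(\sum_{i=1}^n w_i^* Y_i\big)}{\big(\sum_{i=1}^n w_i^*\big)\big(\sum_{i=1}^n w_i^* X_i^2\big) - \big(\sum_{i=1}^n w_i^* X_i\big)^2} = \frac{\sum_{i=1}^n w_i^* (X_i - \bar X^*)(Y_i - \bar Y^*)}{\sum_{i=1}^n w_i^* (X_i - \bar X^*)^2} \tag{5.12b}$$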
It is clear that the BLU estimators $\tilde\alpha$ and $\tilde\beta$, obtained from the regression in (5.10), are different from the usual OLS estimators $\hat\alpha_{OLS}$ and $\hat\beta_{OLS}$ since they depend upon the $\sigma_i^2$'s. It is also true that when $\sigma_i^2 = \sigma^2$ for all $i = 1, 2, \ldots, n$, i.e., under homoskedasticity, (5.12) reduces to the usual OLS estimators given by equation (3.4) of Chapter 3. The BLU estimators weight the ith observation by $(1/\sigma_i)$, which is a measure of precision of that observation. The more precise the observation, i.e., the smaller $\sigma_i$, the larger is the weight attached to that observation. $\tilde\alpha$ and $\tilde\beta$ are also known as Weighted Least Squares (WLS) estimators, which are a specific form of Generalized Least Squares (GLS). We will study GLS in detail in Chapter 9, using matrix notation.
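The closed-form WLS expressions in (5.12) are short enough to compute directly. The following is a minimal sketch, not a definitive implementation: the function name `wls_simple` and the data are hypothetical, and the variances $\sigma_i^2$ are assumed known, which is rarely the case in practice.

```python
def wls_simple(x, y, sigma2):
    """WLS estimates (alpha, beta) for y = alpha + beta*x + u, var(u_i) = sigma2[i]."""
    w = [1.0 / s for s in sigma2]                        # w_i* = 1 / sigma_i^2
    sw = sum(w)
    x_star = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of X
    y_star = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of Y
    num = sum(wi * (xi - x_star) * (yi - y_star) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - x_star) ** 2 for wi, xi in zip(w, x))
    beta = num / den                                     # last form of the slope formula
    return y_star - beta * x_star, beta                  # intercept: Y* - beta X*


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.2, 3.9, 6.3, 7.6, 10.9, 11.2]

ols_like = wls_simple(x, y, [1.0] * len(x))        # equal variances: usual OLS
wls_est = wls_simple(x, y, [xi ** 2 for xi in x])  # assumes sigma_i^2 proportional to x_i^2
print("equal variances:", ols_like)
print("variances x_i^2:", wls_est)
```

With equal variances the weights cancel and the function returns the ordinary least squares estimates, which is the homoskedastic special case noted in the text.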
Under heteroskedasticity, OLS loses efficiency in that it is no longer BLUE. However, because it is still unbiased and consistent and because the true $\sigma_i^2$'s are never known, some researchers compute OLS as an initial consistent estimator of the regression coefficients. It is important to emphasize, however, that the standard errors of these estimates as reported by the regression package are biased, and any inference based on these estimated variances, including the reported t-statistics, is misleading. White (1980) proposed a simple procedure that would yield heteroskedasticity-consistent standard errors of the OLS estimators. In equation (5.9), this amounts to replacing $\sigma_i^2$ by $e_i^2$, the square of the ith OLS residual, i.e.,

$$\text{White's var}(\hat\beta_{OLS}) = \sum_{i=1}^n x_i^2 e_i^2 \Big/ \Big(\sum_{i=1}^n x_i^2\Big)^2 \tag{5.13}$$

Note that we cannot consistently estimate $\sigma_i^2$ by $e_i^2$, since there is one observation per parameter estimated. As the sample size increases, so does the number of unknown $\sigma_i^2$'s. What White (1980) consistently estimates is $\text{var}(\hat\beta_{OLS})$, which is a weighted average of the $e_i^2$'s. The same analysis applies to the multiple regression OLS estimates. In this case, White's (1980) heteroskedasticity-consistent estimate of the variance of the kth OLS regression coefficient $\hat\beta_k$ is given by

$$\text{White's var}(\hat\beta_k) = \sum_{i=1}^n \hat\nu_{ki}^2 e_i^2 \Big/ \Big(\sum_{i=1}^n \hat\nu_{ki}^2\Big)^2$$

where $\hat\nu_{ki}^2$ is the squared OLS residual obtained from regressing $X_k$ on the remaining regressors in the equation being estimated, and $e_i$ is the ith OLS residual from this multiple regression equation. Many regression packages provide White's heteroskedasticity-consistent estimates of the variances and their corresponding robust t-statistics. For example, using EViews, one clicks on Quick and chooses Estimate Equation. Clicking on Options brings up a menu where one selects White to obtain the heteroskedasticity-consistent estimates of the variances.
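For the simple regression, White's variance estimate can be placed next to the conventional $s^2/\sum x_i^2$ in a few lines. This is an illustrative sketch only: the function name `slope_variances` is hypothetical, and the data are constructed so that the residuals grow in magnitude with $|x_i|$ and the arithmetic can be followed by hand.

```python
def slope_variances(x, y):
    """Return (beta_hat, conventional variance, White variance) for simple OLS."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dx = [xi - x_bar for xi in x]                       # x_i in deviation form
    sxx = sum(d * d for d in dx)
    beta = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    alpha = y_bar - beta * x_bar
    e = [yi - alpha - beta * xi for xi, yi in zip(x, y)]  # OLS residuals
    s2 = sum(ei * ei for ei in e) / (n - 2)
    conventional = s2 / sxx                             # s^2 / sum x_i^2
    white = sum(d * d * ei * ei for d, ei in zip(dx, e)) / sxx ** 2
    return beta, conventional, white


# Constructed data: y = 1 + 2x + u with u = (3, -2, -1, 0, -1, -2, 3),
# so the OLS fit is exactly alpha = 1, beta = 2 and the residuals equal u.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [-2.0, -5.0, -2.0, 1.0, 2.0, 3.0, 10.0]

beta_hat, conv_var, white_var = slope_variances(x, y)
print(beta_hat, conv_var, white_var)   # conventional 0.2, White 0.25 (up to rounding)
```

In this constructed sample the conventional formula gives 5.6/28 = 0.2 while White's formula gives 196/784 = 0.25, in line with the text's point that the conventional estimate can understate the variance when the error spread rises with $x_i^2$.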
While the regression packages correct for heteroskedasticity in the t-statistics, they do not usually do that for the F-statistics studied, say, in Example 2 in Chapter 4. Wooldridge (1991) suggests a simple way of obtaining a robust LM statistic for $H_0$: $\beta_2 = \beta_3 = 0$ in the multiple regression (4.1). This involves the following steps:
(1) Run OLS on the restricted model without $X_2$ and $X_3$ and obtain the restricted least squares residuals $\tilde u$.
(2) Regress each of the independent variables excluded under the null (i.e., $X_2$ and $X_3$) on all of the other included independent variables (i.e., $X_4, X_5, \ldots, X_K$) including the constant. Get the corresponding residuals $\hat\nu_2$ and $\hat\nu_3$, respectively.
(3) Regress the dependent variable equal to 1 for all observations on $\hat\nu_2\tilde u$ and $\hat\nu_3\tilde u$ without a constant and obtain the robust LM statistic equal to n minus the sum of squared residuals of this regression. This is exactly $nR_u^2$ of this last regression, where $R_u^2$ is the uncentered $R^2$. Under $H_0$ this LM statistic is distributed as $\chi_2^2$.
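The three steps above can be sketched with NumPy as follows. This is a hedged illustration, not Wooldridge's own code: the helper names `ols_resid` and `robust_lm` and the simulated data are assumptions, and the data are generated with the null true and error spread depending on one regressor.

```python
import numpy as np


def ols_resid(y, X):
    """Residuals from an OLS regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


def robust_lm(y, X_kept, X_excluded):
    """Robust LM statistic for H0: coefficients on X_excluded are zero.

    X_kept must contain the constant column; X_excluded is an (n, q) array.
    """
    n = len(y)
    u = ols_resid(y, X_kept)                                  # step 1
    V = np.column_stack([ols_resid(X_excluded[:, j], X_kept)  # step 2
                         for j in range(X_excluded.shape[1])])
    Z = V * u[:, None]                                        # products nu_j * u
    ssr = np.sum(ols_resid(np.ones(n), Z) ** 2)               # step 3, no constant
    return n - ssr


# Simulated, purely illustrative data.
rng = np.random.default_rng(0)
n = 200
x4 = rng.uniform(1.0, 3.0, size=n)
X_excluded = rng.normal(size=(n, 2))                  # plays the role of X2, X3
y = 1.0 + 0.5 * x4 + x4 * rng.normal(size=n)          # heteroskedastic errors
X_kept = np.column_stack([np.ones(n), x4])

lm = robust_lm(y, X_kept, X_excluded)
print("robust LM:", lm, "compare with chi-square, df =", X_excluded.shape[1])
```

Because the dependent variable in step 3 is a column of ones, the statistic always lies between 0 and n.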
Since OLS is no longer BLUE, one should compute $\tilde\alpha$ and $\tilde\beta$. The only problem is that the $\sigma_i$'s are rarely known. One example where the $\sigma_i$'s are known up to a scalar constant is the following simple example of aggregation.
Example 5.1: Aggregation and Heteroskedasticity. Let $Y_{ij}$ be the observation on the jth firm in the ith industry, and consider the following regression:

$$Y_{ij} = \alpha + \beta X_{ij} + u_{ij} \qquad i = 1, 2, \ldots, m;\; j = 1, 2, \ldots, n_i \tag{5.14}$$

If only aggregate observations on each industry are available, then (5.14) is summed over firms, i.e.,

$$Y_i = \alpha n_i + \beta X_i + u_i \qquad i = 1, 2, \ldots, m \tag{5.15}$$

where $Y_i = \sum_{j=1}^{n_i} Y_{ij}$, $X_i = \sum_{j=1}^{n_i} X_{ij}$ and $u_i = \sum_{j=1}^{n_i} u_{ij}$ for $i = 1, 2, \ldots, m$. Note that although the $u_{ij}$'s are IID$(0, \sigma^2)$, by aggregating, we get $u_i \sim (0, n_i\sigma^2)$. This means that the disturbances in (5.15) are heteroskedastic. However, $\sigma_i^2 = n_i\sigma^2$ is known up to a scalar constant. In fact, $\sigma/\sigma_i$ is $1/(n_i)^{1/2}$. Therefore, premultiplying (5.15) by $1/(n_i)^{1/2}$ and performing OLS on the transformed equation results in BLU estimators of $\alpha$ and $\beta$. In other words, BLU estimation reduces to performing OLS of $Y_i/(n_i)^{1/2}$ on $(n_i)^{1/2}$ and $X_i/(n_i)^{1/2}$, without an intercept.
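The transformed regression of Example 5.1 can be written out in a few lines. The sketch below only shows the mechanics: the industry sizes and totals are hypothetical, and the aggregated data are generated without noise so that the known coefficients are recovered.

```python
import numpy as np

# Hypothetical industry sizes n_i and industry totals X_i.
n_i = np.array([2.0, 5.0, 10.0, 20.0, 40.0])
X_i = np.array([3.0, 11.0, 18.0, 45.0, 70.0])
alpha, beta = 1.5, 0.8
Y_i = alpha * n_i + beta * X_i            # equation (5.15) with u_i = 0

# Premultiply by 1/sqrt(n_i), then run OLS without an intercept.
root_n = np.sqrt(n_i)
regressors = np.column_stack([root_n, X_i / root_n])
coef, *_ = np.linalg.lstsq(regressors, Y_i / root_n, rcond=None)
alpha_blu, beta_blu = coef
print(alpha_blu, beta_blu)
```

With real data the disturbances would not be zero, and the same two-column regression would then deliver the BLU estimates rather than the exact coefficients.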
There may be other special cases in practice where $\sigma_i$ is known up to a scalar, but in general, $\sigma_i$ is usually unknown and will have to be estimated. This is hopeless with only n observations, since there are n $\sigma_i$'s, so we either have to have repeated observations, or know more about the $\sigma_i$'s. Let us discuss these two cases.
Case 1: Repeated Observations
Suppose that $n_i$ households are selected randomly with income $X_i$ for $i = 1, 2, \ldots, m$. For each household $j = 1, 2, \ldots, n_i$, we observe its consumption expenditures on food, say $Y_{ij}$. The regression equation is

$$Y_{ij} = \alpha + \beta X_i + u_{ij} \qquad i = 1, 2, \ldots, m;\; j = 1, 2, \ldots, n_i \tag{5.16}$$

where m is the number of income groups selected. Note that $X_i$ has only one subscript, whereas $Y_{ij}$ has two subscripts denoting the repeated observations on households with the same income $X_i$. The $u_{ij}$'s are independently distributed $(0, \sigma_i^2)$, reflecting the heteroskedasticity in consumption expenditures among the different income groups. In this case, there are $n = \sum_{i=1}^m n_i$ observations and m $\sigma_i^2$'s to be estimated. This is feasible, and there are two methods for estimating these $\sigma_i^2$'s. The first is to compute

$$s_i^2 = \sum_{j=1}^{n_i} (Y_{ij} - \bar Y_i)^2 / (n_i - 1)$$

where $\bar Y_i = \sum_{j=1}^{n_i} Y_{ij}/n_i$. The second is to compute $\hat\sigma_i^2 = \sum_{j=1}^{n_i} e_{ij}^2/n_i$ where $e_{ij}$ is the OLS residual given by

$$e_{ij} = Y_{ij} - \hat\alpha_{OLS} - \hat\beta_{OLS} X_i$$

Both estimators of $\sigma_i^2$ are consistent. Substituting either $s_i^2$ or $\hat\sigma_i^2$ for $\sigma_i^2$ in (5.12) will result in feasible estimators of $\alpha$ and $\beta$. However, the resulting estimates are no longer BLUE. The substitution of the consistent estimators of $\sigma_i^2$ is justified on the basis that the resulting $\alpha$ and $\beta$ estimates will be asymptotically efficient, see Chapter 9. Of course, this step could have been replaced by a regression of $Y_{ij}/\hat\sigma_i$ on $(1/\hat\sigma_i)$ and $(X_i/\hat\sigma_i)$ without a constant, or the similar regression in terms of $s_i$. For the residual-based estimate $\hat\sigma_i^2$, one can iterate, i.e., obtain new residuals based on the new regression estimates and therefore new $\hat\sigma_i^2$'s. The process continues until the estimates obtained from the rth iteration do not differ from those of the (r + 1)th iteration in absolute value by more than a small arbitrary positive number chosen as the convergence criterion. Once the estimates converge, the final round estimators are the maximum likelihood estimators, see Oberhofer and Kmenta (1974).
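A minimal sketch of the first method, using the within-group variances $s_i^2$ as weights in (5.12), is given below. The function name `feasible_wls` and the three income groups are hypothetical. The group means are placed exactly on the line $2 + 0.5X_i$ so that the result can be checked by hand.

```python
def feasible_wls(groups):
    """groups: list of (X_i, [Y_i1, ..., Y_in_i]).

    Each group needs n_i >= 2 and some within-group variation so that s_i^2 > 0.
    """
    x, y, w = [], [], []
    for x_i, ys in groups:
        n_i = len(ys)
        y_bar_i = sum(ys) / n_i
        s2_i = sum((v - y_bar_i) ** 2 for v in ys) / (n_i - 1)  # s_i^2
        for v in ys:
            x.append(x_i)
            y.append(v)
            w.append(1.0 / s2_i)          # estimated weight 1 / s_i^2
    sw = sum(w)
    x_star = sum(a * b for a, b in zip(w, x)) / sw
    y_star = sum(a * b for a, b in zip(w, y)) / sw
    beta = (sum(a * (b - x_star) * (c - y_star) for a, b, c in zip(w, x, y))
            / sum(a * (b - x_star) ** 2 for a, b in zip(w, x)))
    return y_star - beta * x_star, beta


# Hypothetical income groups: (income X_i, food expenditures of its households).
groups = [
    (10.0, [6.0, 8.0]),            # mean 7,  s^2 = 2
    (20.0, [9.0, 12.0, 15.0]),     # mean 12, s^2 = 9
    (30.0, [11.0, 17.0, 23.0]),    # mean 17, s^2 = 36
]
alpha_hat, beta_hat = feasible_wls(groups)
print(alpha_hat, beta_hat)         # about 2.0 and 0.5 for this constructed sample
```

The iterated variant described in the text would instead recompute residual-based variances from each new fit and repeat until the coefficients stop changing.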
Case 2: Assuming More Information on the Form of Heteroskedasticity
If we do not have repeated observations, it is hopeless to try to estimate n variances and $\alpha$ and $\beta$ with only n observations. More structure on the form of heteroskedasticity is needed to estimate this model, but not necessarily to test it. Heteroskedasticity is more likely to occur with cross-section data where the observations may be on firms of different size. For example, a regression relating profits to sales might have heteroskedasticity, because larger firms have more resources to draw upon, can borrow more, invest more, and lose or gain more than smaller firms. Therefore, we expect the form of heteroskedasticity to be related to the size of the firm, which is reflected in this case by the regressor, sales, or some other variable that measures size, like assets. Hence, for this regression we can write $\sigma_i^2 = \sigma^2 Z_i^2$, where $Z_i$ denotes the sales or assets of firm i. Once again the form of heteroskedasticity is known up to a scalar constant and the BLU estimators of $\alpha$ and $\beta$ can be obtained from (5.12), assuming $Z_i$ is known. Alternatively, one can run the regression of $Y_i/Z_i$ on $1/Z_i$ and $X_i/Z_i$ without a constant to get the same result. Special cases of $Z_i$ are $X_i$ and $E(Y_i)$.

(i) If $Z_i = X_i$, the regression becomes that of $Y_i/X_i$ on $1/X_i$ and a constant. Note that the regression coefficient of $1/X_i$ is the estimate of $\alpha$, while the constant of the regression is now the estimate of $\beta$. But is it possible to have $u_i$ uncorrelated with $X_i$ when we are assuming $\text{var}(u_i)$ related to $X_i$? The answer is yes, as long as $E(u_i \mid X_i) = 0$, i.e., the mean of $u_i$ is zero for every value of $X_i$, see Figure 3.4 of Chapter 3. This, in turn, implies that the overall mean of the $u_i$'s is zero, i.e., $E(u_i) = 0$, and that $\text{cov}(X_i, u_i) = 0$. If the latter is not satisfied and, say, $\text{cov}(X_i, u_i)$ is positive, then large values of $X_i$ imply large values of $u_i$. This would mean that for these values of $X_i$, we have a nonzero mean for the corresponding $u_i$'s. This contradicts $E(u_i \mid X_i) = 0$. Hence, if $E(u_i \mid X_i) = 0$, then $\text{cov}(X_i, u_i) = 0$.

(ii) If $Z_i = E(Y_i) = \alpha + \beta X_i$, then $\sigma_i$ is proportional to the population regression line, which is a linear function of $\alpha$ and $\beta$. Since the OLS estimates are consistent, one can estimate $E(Y_i)$ by $\hat Y_i = \hat\alpha_{OLS} + \hat\beta_{OLS} X_i$ and use $\hat Z_i = \hat Y_i$ instead of $E(Y_i)$. In other words, run the regression of $Y_i/\hat Y_i$ on $1/\hat Y_i$ and $X_i/\hat Y_i$ without a constant. The resulting estimates are asymptotically efficient, see Amemiya (1973).
One can generalize $\sigma_i^2 = \sigma^2 Z_i^2$ to $\sigma_i^2 = \sigma^2 Z_i^\delta$ where $\delta$ is an unknown parameter to be estimated. Hence rather than estimating n $\sigma_i^2$'s one has to estimate only $\sigma^2$ and $\delta$. Assuming normality, one can set up the likelihood function and derive the first-order conditions by differentiating that likelihood with respect to $\alpha$, $\beta$, $\sigma^2$ and $\delta$. The resulting equations are highly nonlinear. Alternatively, one can search over possible values for $\delta = 0, 0.1, 0.2, \ldots, 4$, and get the corresponding estimates of $\alpha$, $\beta$, and $\sigma^2$ from the regression of $Y_i/Z_i^{\delta/2}$ on $1/Z_i^{\delta/2}$ and $X_i/Z_i^{\delta/2}$ without a constant. This is done for every $\delta$ and the value of the likelihood function is reported. Using this search procedure one can get the maximum value of the likelihood and, corresponding to it, the MLE of $\alpha$, $\beta$, $\sigma^2$ and $\delta$. Note that as $\delta$ increases so does the degree of heteroskedasticity. Problem 4 asks the reader to compute the relative efficiency of the OLS estimator with respect to the BLU estimator for $Z_i = X_i$ for various values of $\delta$. As expected, the relative efficiency of the OLS estimator declines as the degree of heteroskedasticity increases.
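The grid search over $\delta$ can be sketched as follows. Two things here are assumptions rather than part of the text: the concentrated normal log-likelihood used to rank the grid points, $-\frac{n}{2}[\log 2\pi + \log\hat\sigma^2 + 1] - \frac{\delta}{2}\sum \log Z_i$ with $\hat\sigma^2 = \sum e_i^2 Z_i^{-\delta}/n$, and the data, which are hypothetical with $Z_i = X_i$. The names `wls` and `profile_loglik` are likewise illustrative.

```python
import math


def wls(x, y, w):
    """Weighted least squares of y on a constant and x with weights w."""
    sw = sum(w)
    xs = sum(a * b for a, b in zip(w, x)) / sw
    ys = sum(a * b for a, b in zip(w, y)) / sw
    beta = (sum(a * (b - xs) * (c - ys) for a, b, c in zip(w, x, y))
            / sum(a * (b - xs) ** 2 for a, b in zip(w, x)))
    return ys - beta * xs, beta


def profile_loglik(x, y, z, delta):
    """Concentrated normal log-likelihood at a given delta (requires z_i > 0)."""
    n = len(y)
    w = [zi ** (-delta) for zi in z]       # same fit as dividing through by z^(delta/2)
    alpha, beta = wls(x, y, w)
    sigma2 = sum(wi * (yi - alpha - beta * xi) ** 2
                 for wi, xi, yi in zip(w, x, y)) / n
    loglik = (-0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)
              - 0.5 * delta * sum(math.log(zi) for zi in z))
    return loglik, alpha, beta, sigma2


# Hypothetical data: disturbances whose magnitude grows with x.
x = [float(i) for i in range(1, 13)]
signs = [1, -1, -1, 1, 1, -1, 1, -1, -1, 1, -1, 1]
y = [2.0 + 0.5 * xi + 0.3 * xi * s for xi, s in zip(x, signs)]
z = x

grid = [k / 10 for k in range(41)]         # delta = 0, 0.1, ..., 4
best_delta = max(grid, key=lambda d: profile_loglik(x, y, z, d)[0])
print(best_delta, profile_loglik(x, y, z, best_delta))
```

At $\delta = 0$ all weights equal one, so the first grid point reproduces OLS, and every other grid point is a WLS fit with weights $Z_i^{-\delta}$.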
One can also generalize $\sigma_i^2 = \sigma^2 Z_i^\delta$ to include more Z variables. In fact, a general form of this multiplicative heteroskedasticity is

$$\log\sigma_i^2 = \log\sigma^2 + \delta_1 \log Z_{1i} + \delta_2 \log Z_{2i} + \ldots + \delta_r \log Z_{ri} \tag{5.17}$$

with r < n, otherwise one cannot estimate with n observations. $Z_1, Z_2, \ldots, Z_r$ are known variables determining the heteroskedasticity. Note that if $\delta_2 = \delta_3 = \ldots = \delta_r = 0$, we revert back to $\sigma_i^2 = \sigma^2 Z_i^\delta$, where $\delta = \delta_1$. For the estimation of this general multiplicative form of heteroskedasticity, see Harvey (1976).
Another form for heteroskedasticity is the additive form

$$\sigma_i^2 = a + b_1 Z_{1i} + b_2 Z_{2i} + \ldots + b_r Z_{ri} \tag{5.18}$$

where r < n, see Goldfeld and Quandt (1972). Special cases of (5.18) include

$$\sigma_i^2 = a + b_1 X_i + b_2 X_i^2 \tag{5.19}$$

where if a and $b_1$ are zero we have a simple form of multiplicative heteroskedasticity. In order to estimate the regression model with additive heteroskedasticity of the type given in (5.19), one can get the OLS residuals, the $e_i$'s, and run the following regression

$$e_i^2 = a + b_1 X_i + b_2 X_i^2 + v_i \tag{5.20}$$

where $v_i = e_i^2 - \sigma_i^2$. The $v_i$'s are heteroskedastic, and the OLS estimates of (5.20) yield the following estimates of $\sigma_i^2$

$$\hat\sigma_i^2 = \hat a_{OLS} + \hat b_{1,OLS} X_i + \hat b_{2,OLS} X_i^2 \tag{5.21}$$

One can obtain a better estimate of the $\sigma_i^2$'s by computing the following regression, which corrects for the heteroskedasticity in the $v_i$'s

$$(e_i^2/\hat\sigma_i) = a(1/\hat\sigma_i) + b_1(X_i/\hat\sigma_i) + b_2(X_i^2/\hat\sigma_i) + w_i \tag{5.22}$$

The new estimates of $\sigma_i^2$ are

$$\tilde\sigma_i^2 = \tilde a + \tilde b_1 X_i + \tilde b_2 X_i^2 \tag{5.23}$$

where $\tilde a$, $\tilde b_1$ and $\tilde b_2$ are the OLS estimates from (5.22). Using the $\tilde\sigma_i^2$'s one can run the regression of $Y_i/\tilde\sigma_i$ on $(1/\tilde\sigma_i)$ and $X_i/\tilde\sigma_i$ without a constant to get asymptotically efficient estimates of $\alpha$ and $\beta$. These have the same asymptotic properties as the MLE estimators derived in Rutemiller and Bowers (1968), see Amemiya (1977) and Buse (1984). The problem with this iterative procedure is that there is no guarantee that the $\tilde\sigma_i^2$'s are positive, which means that the square root $\tilde\sigma_i$ may not exist. This problem would not occur if $\sigma_i^2 = (a + b_1 X_i + b_2 X_i^2)^2$, because in this case one regresses $|e_i|$ on a constant, $X_i$ and $X_i^2$, and the predicted value from this regression would be an estimate of $\sigma_i$. It would not matter if this predictor is negative, because we do not have to take its square root and because its sign cancels in the OLS normal equations of the final regression of $Y_i/\hat\sigma_i$ on $(1/\hat\sigma_i)$ and $(X_i/\hat\sigma_i)$ without a constant.
Testing for Homoskedasticity
In the repeated observations case, one can perform Bartlett's (1937) test. The null hypothesis is $H_0$: $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_m^2$. Under the null there is one variance $\sigma^2$, which can be estimated by the pooled variance $s^2 = \sum_{i=1}^m \nu_i s_i^2/\nu$ where $\nu = \sum_{i=1}^m \nu_i$ and $\nu_i = n_i - 1$. Under the alternative hypothesis there are m different variances estimated by $s_i^2$ for $i = 1, 2, \ldots, m$. The Likelihood Ratio test, which computes the ratio of the likelihoods under the null and alternative hypotheses, reduces to computing

$$B = \Big[\nu \log s^2 - \sum_{i=1}^m \nu_i \log s_i^2\Big] \Big/ c \tag{5.24}$$

where $c = 1 + \big[\sum_{i=1}^m (1/\nu_i) - 1/\nu\big]/3(m - 1)$. Under $H_0$, B is distributed $\chi_{m-1}^2$. Hence, a large p-value for the B-statistic given in (5.24) means that we do not reject homoskedasticity, whereas a small p-value leads to rejection of $H_0$ in favor of heteroskedasticity.
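The statistic in (5.24) can be computed directly from the grouped observations. The sketch below is illustrative only: `bartlett_statistic` and the two data sets are hypothetical, and the p-value is left to a $\chi^2$ table or statistical package.

```python
import math


def bartlett_statistic(groups):
    """Return (B, degrees of freedom) for a list of samples, one per group."""
    m = len(groups)
    nu_i = [len(g) - 1 for g in groups]                  # nu_i = n_i - 1
    s2_i = []
    for g in groups:
        mean = sum(g) / len(g)
        s2_i.append(sum((v - mean) ** 2 for v in g) / (len(g) - 1))
    nu = sum(nu_i)
    s2 = sum(v * s for v, s in zip(nu_i, s2_i)) / nu     # pooled variance
    c = 1 + (sum(1 / v for v in nu_i) - 1 / nu) / (3 * (m - 1))
    b = (nu * math.log(s2) - sum(v * math.log(s) for v, s in zip(nu_i, s2_i))) / c
    return b, m - 1


# Hypothetical samples: identical spreads, then clearly different spreads.
same_spread = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0], [5.0, 6.0, 7.0]]
diff_spread = [[6.0, 8.0], [9.0, 12.0, 15.0], [11.0, 17.0, 23.0]]

b_same, df = bartlett_statistic(same_spread)
b_diff, _ = bartlett_statistic(diff_spread)
print(b_same, b_diff, df)
```

When all group variances coincide, the pooled variance equals each $s_i^2$ and B is zero; the statistic grows as the group variances spread apart.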
In case of no repeated observations, several tests exist in the literature. Among these are the following:
(1) Glejser's (1969) Test: In this case one regresses $|e_i|$ on a constant and $Z_i^\delta$ for $\delta = 1, -1, 0.5$ and $-0.5$. If the coefficient of $Z_i^\delta$ is significantly different from zero, this would lead to a rejection of homoskedasticity. The power of this test depends upon the true form of heteroskedasticity. One important result, however, is that this power is not seriously impaired if the wrong value of $\delta$ is chosen, see Ali and Giaccotto (1984), who confirmed this result using extensive Monte Carlo experiments.
(2) The Goldfeld and Quandt (1965) Test: This is a simple and intuitive test. One orders the observations according to $X_i$ and omits c central observations. Next, two regressions are run on the two separated sets of observations with (n - c)/2 observations in each. The c omitted observations separate the low value X's from the high value X's, and if heteroskedasticity exists and is related to $X_i$, the estimates of $\sigma^2$ reported from the two regressions should be different. Hence, the test statistic is $s_2^2/s_1^2$ where $s_1^2$ and $s_2^2$ are the Mean Square Error of the two regressions, respectively. Their ratio would be the same as that of the two residual sums of squares because the degrees of freedom of the two regressions are the same. This statistic is F-distributed with ((n - c)/2) - K degrees of freedom in the numerator as well as the denominator. The only remaining question for performing this test is the magnitude of c. Obviously, the larger c is, the more central observations are being omitted and the more confident we feel that the two samples are distant from each other. The loss of c observations should lead to loss of power. However, separating the two samples should give us more confidence that the two variances are in fact the same if we do not reject homoskedasticity. This trade-off in power was studied by Goldfeld and Quandt using Monte Carlo experiments. Their results recommend the use of c = 8 for n = 30 and c = 16 for n = 60. This is a popular test, but it assumes that we know how to order the heteroskedasticity, in this case using $X_i$. But what if there is more than one regressor on the right hand side? In that case one can order the observations using $\hat Y_i$.
(3) Spearman's Rank Correlation Test: This test ranks the $X_i$'s and the absolute values of the OLS residuals, computes the Spearman rank correlation coefficient r between the two rankings, and tests the null hypothesis that this correlation is zero by computing $t = [r^2(n - 2)/(1 - r^2)]^{1/2}$, which is t-distributed with (n - 2) degrees of freedom. If this t-statistic has a large p-value we do not reject homoskedasticity. Otherwise, we reject homoskedasticity in favor of heteroskedasticity.
(4) Harvey's (1976) Multiplicative Heteroskedasticity Test: If heteroskedasticity is related to $X_i$, it looks like the Goldfeld and Quandt test or the Spearman rank correlation test would detect it, and the Glejser test would establish its form. In case the form of heteroskedasticity is of the multiplicative type, Harvey (1976) suggests the following test, which rewrites (5.17) as

$$\log e_i^2 = \log\sigma^2 + \delta_1 \log Z_{1i} + \ldots + \delta_r \log Z_{ri} + v_i \tag{5.25}$$

where $v_i = \log(e_i^2/\sigma_i^2)$. This disturbance term has an asymptotic distribution that is $\log\chi_1^2$. This random variable has mean -1.2704 and variance 4.9348. Therefore, Harvey suggests performing the regression in (5.25) and testing $H_0$: $\delta_1 = \delta_2 = \ldots = \delta_r = 0$ by computing the regression sum of squares divided by 4.9348. This statistic is distributed asymptotically as $\chi_r^2$. This is also asymptotically equivalent to an F-test that tests for $\delta_1 = \delta_2 = \ldots = \delta_r = 0$ in the regression given in (5.25). See the F-test described in example 6 of Chapter 4.
(5) Breusch and Pagan (1979) Test: If one knows that $\sigma_i^2 = f(a + b_1 Z_1 + b_2 Z_2 + \ldots + b_r Z_r)$ but does not know the form of this function f, Breusch and Pagan (1979) suggest the following test for homoskedasticity, i.e., $H_0$: $b_1 = b_2 = \ldots = b_r = 0$. Compute $\hat\sigma^2 = \sum_{i=1}^n e_i^2/n$, which would be the MLE estimator of $\sigma^2$ under homoskedasticity. Run the regression of $e_i^2/\hat\sigma^2$ on the Z variables and a constant, and compute half the regression sum of squares. This statistic is distributed as $\chi_r^2$. This is a more general test than the ones discussed earlier in that f does not have to be specified.
(6) White's (1980) Test: Another general test for homoskedasticity where nothing is known about the form of this heteroskedasticity is suggested by White (1980). This test is based on the difference between the variance of the OLS estimates under homoskedasticity and that under heteroskedasticity. For the case of a simple regression with a constant, White shows that this test compares White's $\text{var}(\hat\beta_{OLS})$ given by (5.13) with the usual $\text{var}(\hat\beta_{OLS}) = s^2/\sum_{i=1}^n x_i^2$ under homoskedasticity. This test reduces to running the regression of $e_i^2$ on a constant, $X_i$ and $X_i^2$ and computing $nR^2$. This statistic is distributed as $\chi_2^2$ under the null hypothesis of homoskedasticity. The degrees of freedom correspond to the number of regressors without the constant. If this statistic is not significant, then $e_i^2$ is not related to $X_i$ and $X_i^2$ and we cannot reject that the variance is constant. Note that if there is no constant in the regression, we run $e_i^2$ on a constant and $X_i^2$ only, i.e., $X_i$ is no longer in this regression and the degree of freedom of the test is 1. In general, White's test is based on running $e_i^2$ on the cross-product of all the X's in the regression being estimated, computing $nR^2$, and comparing it to the critical value of $\chi_r^2$ where r is the number of regressors in this last regression excluding the constant. For the case of two regressors, $X_2$ and $X_3$, and a constant, White's test is again based on $nR^2$ for the regression of $e_i^2$ on a constant, $X_2$, $X_3$, $X_2^2$, $X_2X_3$ and $X_3^2$. This statistic is distributed as $\chi_5^2$. White's test is standard using EViews. After running the regression, click on residuals tests then choose White. This software gives the user a choice between including or excluding the cross-product terms like $X_2X_3$ from the regression. This may be useful when there are many regressors.
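As one illustration of the tests listed above, the Goldfeld and Quandt procedure of test (2) can be sketched for a simple regression with K = 2. The function names and the ten observations are hypothetical, and the statistic is formed as the variance estimate from the high-X sample over that from the low-X sample; the data are built so that the residual sums of squares are 4 and 16.

```python
def rss_simple(x, y):
    """Residual sum of squares from OLS of y on a constant and x."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    sxx = sum((a - xb) ** 2 for a in x)
    b = sum((a - xb) * (c - yb) for a, c in zip(x, y)) / sxx
    a0 = yb - b * xb
    return sum((c - a0 - b * a) ** 2 for a, c in zip(x, y))


def goldfeld_quandt(x, y, c):
    """Return (F statistic, degrees of freedom) after omitting c central points."""
    pairs = sorted(zip(x, y))            # order the observations by X
    k = (len(pairs) - c) // 2            # observations in each sub-sample
    low, high = pairs[:k], pairs[-k:]
    df = k - 2                           # K = 2 estimated coefficients
    s2_low = rss_simple(*zip(*low)) / df
    s2_high = rss_simple(*zip(*high)) / df
    return s2_high / s2_low, df


# Hypothetical data: y = 2x + u, with disturbances twice as large for high x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [3.0, 3.0, 5.0, 9.0, 10.0, 12.0, 16.0, 14.0, 16.0, 22.0]

f_stat, df = goldfeld_quandt(x, y, c=2)
print(f_stat, df)    # (16/2) / (4/2) = 4 with 2 and 2 degrees of freedom
```

The ratio is then compared with an F critical value with `df` degrees of freedom in both numerator and denominator.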
A modified Breusch-Pagan test was suggested by Koenker (1981) and Koenker and Bassett (1982). This attempts to improve the power of the Breusch-Pagan test, and make it more robust to the nonnormality of the disturbances. This amounts to multiplying the Breusch-Pagan statistic (half the regression sum of squares) by $2\hat\sigma^4$, and dividing it by the second sample moment of the squared residuals, i.e., $\sum_{i=1}^n (e_i^2 - \hat\sigma^2)^2/n$, where $\hat\sigma^2 = \sum_{i=1}^n e_i^2/n$. Waldman (1983) showed that if the Z's in the Breusch-Pagan test are in fact the X's and their cross-products, as in White's test, then this modified Breusch-Pagan test is exactly the $nR^2$ statistic proposed by White.
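The relation between the Breusch-Pagan statistic, Koenker's modification and White's $nR^2$ can be checked numerically. The sketch below uses NumPy with simulated data; the function names and the data-generating process are assumptions made only for illustration.

```python
import numpy as np


def fitted(y, X):
    """Fitted values from an OLS regression of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef


def bp_koenker_white(e, Z):
    """e: OLS residuals; Z: variance regressors without the constant column."""
    n = len(e)
    e2 = e ** 2
    sigma2 = e2.mean()                                   # sum e_i^2 / n
    Zc = np.column_stack([np.ones(n), Z])
    g = e2 / sigma2
    reg_ss = np.sum((fitted(g, Zc) - g.mean()) ** 2)     # regression sum of squares
    bp = 0.5 * reg_ss                                    # Breusch-Pagan
    koenker = bp * 2 * sigma2 ** 2 / np.mean((e2 - sigma2) ** 2)
    resid_ss = np.sum((e2 - fitted(e2, Zc)) ** 2)
    r2 = 1 - resid_ss / np.sum((e2 - e2.mean()) ** 2)
    return bp, koenker, n * r2                           # last entry: n R^2


# Simulated simple regression with error spread increasing in x.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(1.0, 5.0, size=n)
y = 1.0 + 2.0 * x + x * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
e = y - fitted(y, X)

# Z holds x and x^2, the regressors of White's test for a simple regression.
bp, koenker, n_r2 = bp_koenker_white(e, np.column_stack([x, x ** 2]))
print(bp, koenker, n_r2)
```

With Z set to $X_i$ and $X_i^2$, the second and third numbers coincide up to rounding, which is the equivalence attributed to Waldman (1983) in the text.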
White's (1980) test for heteroskedasticity without specifying its form led to further work on estimators that are more efficient than OLS while recognizing that the efficiency of GLS may not be achievable, see Cragg (1992). Adaptive estimators have been developed by Carroll (1982) and Robinson (1987). These estimators assume no particular form of heteroskedasticity but nevertheless have the same asymptotic distribution as GLS based on the true $\sigma_i^2$.
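Many Monte Carlo experiments were performed to study the performance of these and other tests of homoskedasticity. One such extensive study is that of Ali and Giaccotto (1984). Six types of heteroskedasticity specifications were considered. These include $\sigma_i^2 = \sigma^2 X_i$, $\sigma_i^2 = \sigma^2 [E(Y_i)]^2$ and, as the sixth specification, $\sigma_i^2 = \sigma^2$ for i < n/2 and $\sigma_i^2 = 2\sigma^2$ for i > n/2.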
Six data sets were considered, the first three were stationary and the last three were nonstationary (stationary versus nonstationary regressors are discussed in Chapter 14). Five models were entertained, starting with a model with one regressor and no intercept and finishing with a model with an intercept and 5 variables. Four types of distributions were imposed on the disturbances. These were normal, t, Cauchy and log normal. The first three are symmetric, but the last one is skewed. Three sample sizes were considered, n = 10, 25, 40. Various correlations between the disturbances were also entertained. Among the tests considered were tests 1, 2, 5 and 6 discussed in this section. The results are too numerous to summarize, but some of the major findings are the following: (1) The power of these tests increased with the sample size and with the trending nature or the variability of the regressors. It also decreased with more regressors and for deviations from the normal distribution. The results were mostly erratic when the errors were autocorrelated. (2) There were ten distributionally robust tests using OLS residuals named TROB, which were variants of Glejser's, White's and Bickel's tests, the last one being a nonparametric test not considered in this chapter. These tests were robust to both long-tailed and skewed distributions. (3) None of these tests has any significant power to detect heteroskedasticity which deviates substantially from the true underlying heteroskedasticity. For example, none of these tests was powerful in detecting heteroskedasticity of the sixth kind, i.e., $\sigma_i^2 = \sigma^2$ for i < n/2 and $\sigma_i^2 = 2\sigma^2$ for i > n/2. In fact, the maximum power was 9%. (4) Ali and Giaccotto (1984) recommend any of the TROB tests for practical use. They note that the similarity among these tests is the use of squared residuals rather than the absolute value of the residuals. In fact, they argue that tests of the same form that use absolute value rather than squared residuals are likely to be nonrobust and lack power.
Empirical Example: For the Cigarette Consumption Data given in Table 3.2, the OLS regression yields:
$$\log C = \underset{(0.91)}{4.30} - \underset{(0.32)}{1.34}\,\log P + \underset{(0.20)}{0.17}\,\log Y \qquad R^2 = 0.27$$
Figure 5.1 Plots of Residuals Versus Log Y 
Suspecting heteroskedasticity, we plotted the residuals from this regression versus logY in Figure 5.1. This figure shows the dispersion of the residuals to decrease with increasing logY. Next, we performed several tests for heteroskedasticity studied in this chapter. The first is Glejser’s (1969) test. We ran the following regressions:
$$|e_i| = \underset{(0.46)}{1.16} - \underset{(0.10)}{0.22}\,\log Y$$
$$|e_i| = \underset{(0.47)}{-0.95} + \underset{(2.23)}{5.13}\,(\log Y)^{-1}$$
$$|e_i| = \underset{(0.93)}{-2.00} + \underset{(2.04)}{4.65}\,(\log Y)^{-0.5}$$
$$|e_i| = \underset{(0.93)}{2.21} - \underset{(0.42)}{0.96}\,(\log Y)^{0.5}$$

The t-statistics on the slope coefficient in these regressions are 2.24, 2.30, 2.29 and 2.26 in absolute value, respectively. All are significant with p-values of 0.03, 0.026, 0.027 and 0.029, respectively, indicating the rejection of homoskedasticity.
The second test is the Goldfeld and Quandt (1965) test. The observations are ordered according to logY and c = 12 central observations are omitted. Two regressions are run on the first and last 17 observations. The first regression yields $s_1^2 = 0.04881$ and the second regression yields $s_2^2 = 0.01554$. This is a test of equality of variances and it is based on the ratio of two $\chi^2$ random variables with 17 - 3 = 14 degrees of freedom. In fact, $s_1^2/s_2^2 = 0.04881/0.01554 = 3.141 \sim F_{14,14}$ under $H_0$. This has a p-value of 0.02 and rejects $H_0$ at the 5% level.

The third test is the Spearman rank correlation test. First one obtains rank$(\log Y_i)$ and rank$(|e_i|)$, and from these the Spearman rank correlation coefficient r. This yields $t = [r^2(n - 2)/(1 - r^2)]^{1/2} = 1.948$. This is distributed as a t with 44 degrees of freedom. This t-statistic has a p-value of 0.058.
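The Spearman statistic used above can be sketched as follows. The rank-difference formula $r = 1 - 6\sum d_i^2/(n^3 - n)$ is the standard expression for untied ranks and is an assumption here, since the section does not display it; the function names and the five observations are hypothetical.

```python
import math


def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman_t(x, abs_resid):
    """Return (r, t) from the ranks of x and of the absolute residuals."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(abs_resid)))
    r = 1 - 6 * d2 / (n ** 3 - n)
    t = math.sqrt(r * r * (n - 2) / (1 - r * r))   # undefined if |r| = 1
    return r, t


# Hypothetical regressor values and absolute OLS residuals.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
abs_e = [0.1, 0.3, 0.2, 0.5, 0.4]

r, t = spearman_t(x, abs_e)
print(r, t)    # r = 0.8, t = sqrt(16/3), about 2.309
```

The resulting t is compared with a t distribution with n - 2 degrees of freedom, as in the empirical example.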
The fourth test is Harvey's (1976) multiplicative heteroskedasticity test, which is based upon regressing $\log e_i^2$ on $\log(\log Y_i)$

$$\log e_i^2 = \underset{(17.25)}{24.85} - \underset{(11.03)}{19.08}\,\log(\log Y)$$

Harvey's (1976) statistic divides the regression sum of squares, which is 14.360, by 4.9348. This yields 2.91, which is asymptotically distributed as $\chi_1^2$ under the null. This has a p-value of 0.088 and does not reject the null of homoskedasticity at the 5% significance level.
The fifth test is the Breusch and Pagan (1979) test, which is based on the regression of $e_i^2/\hat\sigma^2$ (where $\hat\sigma^2 = \sum_{i=1}^{46} e_i^2/46 = 0.024968$) on $\log Y_i$. The test statistic is half the regression sum of squares = (10.971/2) = 5.485. This is distributed as $\chi_1^2$ under the null hypothesis. This has a p-value of 0.019 and rejects the null of homoskedasticity. This can be generated with Stata after running the OLS regression reported above, and then issuing the command estat hettest lnrdi:

    . estat hettest lnrdi

    Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
             Ho: Constant variance
             Variables: lnrdi

             chi2(1)      =     5.49
             Prob > chi2  =   0.0192
Finally, White's (1980) test for heteroskedasticity is performed, which is based on the regression of $e_i^2$ on logP, logY, $(\log P)^2$, $(\log Y)^2$, (logP)(logY) and a constant. This is shown in Table 5.1 using EViews. The test statistic is $nR^2$ = (46)(0.3404) = 15.66, which is distributed as $\chi_5^2$. This has a p-value of 0.008 and rejects the null of homoskedasticity. Except for Harvey's test, all the tests performed indicate the presence of heteroskedasticity. This is true despite the fact that the data are in logs, and both consumption and income are expressed in per capita terms. White's heteroskedasticity-consistent estimates of the variances are as follows:
$$\log C = \underset{(1.10)}{4.30} - \underset{(0.34)}{1.34}\,\log P + \underset{(0.24)}{0.17}\,\log Y$$

These are given in Table 5.2 using EViews. Note that in this case all of the heteroskedasticity-consistent standard errors are larger than those reported using a standard OLS package, but this is not necessarily true for other data sets.
In section 5.4, we described the Jarque and Bera (1987) test for normality of the disturbances. For this cigarette consumption regression, Figure 5.2 gives the histogram of the residuals along with descriptive statistics of these residuals including their mean, median, skewness and kurtosis.
This is done using EViews. The measure of skewness S is estimated to be -0.184 and the measure of kurtosis $\kappa$ is estimated to be 2.875, yielding a Jarque-Bera statistic of

$$JB = 46\left[\frac{(-0.184)^2}{6} + \frac{(2.875 - 3)^2}{24}\right] = 0.29$$

This is distributed as $\chi_2^2$ under the null hypothesis of normality and has a p-value of 0.865. Hence we do not reject that the distribution of the disturbances is symmetric and has a kurtosis of 3.

Table 5.1 White Heteroskedasticity Test

Table 5.2 White Heteroskedasticity-Consistent Standard Errors
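The arithmetic behind the statistics reported in this empirical example can be re-derived from the numbers quoted in the text. The sketch below uses only those quoted inputs; the two p-value helpers rely on closed forms for the $\chi^2$ upper tail with 1 and 2 degrees of freedom, and the Jarque-Bera formula $n[S^2/6 + (\kappa - 3)^2/24]$ is the standard one, stated here as an assumption about what the chapter's earlier section defines.

```python
import math


def chi2_sf_df1(x):
    """P(chi-square with 1 degree of freedom > x)."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_sf_df2(x):
    """P(chi-square with 2 degrees of freedom > x)."""
    return math.exp(-x / 2.0)


n = 46
gq_ratio = 0.04881 / 0.01554                # Goldfeld-Quandt ratio of variances
harvey = 14.360 / 4.9348                    # regression SS over var(log chi2_1)
bp = 10.971 / 2                             # half the regression sum of squares
white = n * 0.3404                          # n R^2
jb = n * ((-0.184) ** 2 / 6 + (2.875 - 3) ** 2 / 24)

print(round(gq_ratio, 3), round(harvey, 2), bp, round(white, 2), round(jb, 2))
print(chi2_sf_df1(harvey), chi2_sf_df1(bp), chi2_sf_df2(jb))
```

The computed values round to the figures reported in the text: 3.141, 2.91, 5.485, 15.66 and 0.29, with p-values close to 0.088, 0.019 and 0.865 for the Harvey, Breusch-Pagan and Jarque-Bera statistics.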