Economics Letters 36 (1991) 175-179 North-Holland
Inappropriate use of seasonal dummies in regression *
Tilak Abeysinghe
National University of Singapore
Received 26 November 1990
Accepted 16 January 1991
Use of seasonal
series may lead to serious
1. Introduction
The underlying assumption for using seasonal dummies in a regression is that the seasonality of the dependent variable is deterministic. Many economic variables show a regular seasonal pattern, with peaks occurring in the same season year after year. Such a regular seasonal pattern can also be the result of an integrated stochastic process. This paper examines some consequences of deterministic modelling of seasonally integrated series.
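The claim that an integrated process can produce a regular seasonal pattern can be illustrated numerically. The following Python sketch is not part of the original analysis. It assumes a quarterly seasonal random walk, $z_t = z_{t-4} + \varepsilon_t$, and the starting values, innovation standard deviations and function names are hypothetical choices made only for illustration. It records the quarter in which each year's peak occurs.

```python
import numpy as np

def seasonal_random_walk(start, sigma, years, seed=0):
    """Simulate z_t = z_{t-4} + e_t with e_t ~ N(0, sigma^2).

    Returns an array of shape (years, 4): one row per year, one column
    per quarter. `start` holds four illustrative starting values.
    """
    rng = np.random.default_rng(seed)
    shocks = sigma * rng.standard_normal((years, 4))
    # Each quarter follows its own random walk around its starting value.
    return np.asarray(start, dtype=float) + np.cumsum(shocks, axis=0)

def peak_quarters(z):
    """Quarter (1..4) in which each year's maximum occurs."""
    return z.argmax(axis=1) + 1

if __name__ == "__main__":
    start = [0.0, 1.0, 4.0, 2.0]  # assumed starting values
    calm = seasonal_random_walk(start, sigma=0.02, years=30)
    noisy = seasonal_random_walk(start, sigma=2.0, years=30)
    print("peak quarters, small innovations:", peak_quarters(calm))
    print("peak quarters, large innovations:", peak_quarters(noisy))
```

When the spread of the starting values is large relative to the innovation standard deviation, the peak stays in the same quarter throughout the sample, although no coefficient of the generating process is a fixed seasonal parameter.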
2. Seasonal non-stationarity
Consider the following seasonal processes:
$z_t = \sum_{i=1}^{s} \lambda_i D_{it} + \varepsilon_t$, (1)
$(1 - L^s) z_t = (1 - L) S(L) z_t = \varepsilon_t$, (2)
where $s$ is the period ($s = 4$ for quarterly data), $L$ is the lag operator, the $D_{it}$ are seasonal dummies, $\varepsilon_t \sim (0, \sigma^2)$ and $S(L) = 1 + L + \dots + L^{s-1}$. $S(L)$ contains $(s - 1)$ seasonal unit roots. The seasonality in model (1) is deterministic (fixed) while that of (2) is integrated and stochastic. A stationary stochastic seasonal component could be added to both (1) and (2). It is not difficult to show that the solution to (2) has a deterministic component similar to that of (1) [Abraham and Box (1978)]. The two components differ, however, in that in (1) the coefficients are fixed parameters, whereas in (2) they are adaptive and determined by the starting values. The presence of the zero-frequency unit root factor $(1 - L)$ in (2) makes the level of the series move in a random manner. Even if $(1 - L)$ is removed from (2) so that the process becomes purely seasonal, the solution to (2) still contains a deterministic component provided that $S(L)$ has at least one unit root. The most important difference between (1) and the solution to (2) is that the innovation process of the latter is an integrated one. [See Hylleberg, Engle, Granger, and Yoo (1990) (henceforth HEGY) for such solutions in the case of quarterly series.]
* I would like to thank ... and Sam Ouliaris ... on an earlier ... of this paper.
0165-1765/91/$03.50 © 1991 - Elsevier Science Publishers
Despite these differences, (2) may still give rise to series with a very regular seasonal pattern which may persist over a long period of time. The requirement for this to happen is that the seasonal variation of the starting values be sufficiently large relative to the variance of the innovation term. Many economic time series meet this requirement. If a series is seasonally integrated like (2), subtraction of seasonal means from the series (i.e., deseasonalizing using a seasonal dummy regression) does not remove the seasonal non-stationarity of the series. Rather, the means-subtraction simply creates an artificial series. Consider, for example, the biannual process,
$z_t = z_{t-2} + \varepsilon_t$. (3a)
The solution to (3a) can be written
$z_t = z_{-i} + \sum_{j=1}^{t'} \varepsilon_{2j-i}$, (3b)
where $t' = \mathrm{int}[(t + 1)/2]$ and $i = 1$ if $t$ is odd and $i = 0$ if $t$ is even. With $N$ years of data the seasonal means of this series are:
$\hat{\lambda}_i = z_{-i} + \frac{1}{N} \sum_{j=1}^{N} (N - j + 1)\, \varepsilon_{2j-i}$, $i = 0, 1$.
Writing the means-subtracted series as $w_t = z_t - \sum_i \hat{\lambda}_i D_{it}$, the series $w_t$ is
$w_t = \sum_{j=1}^{t'} \varepsilon_{2j-i} - \frac{1}{N} \sum_{j=1}^{N} (N - j + 1)\, \varepsilon_{2j-i}$, (3c)
where $t'$ and $i$ are defined as above. This series is a weighted sum of $N$ innovations. It is not difficult to derive $\mathrm{cov}(w_t, w_{t-k})$ from (3c) and it can be shown that $\mathrm{corr}(w_t, w_{t-k}) \approx 1$ for large $N$ relative to $k$. The covariance structure of $w_t$ has changed compared to that of (3b) as a result of the subtraction of seasonal means. The detection of seasonal non-stationarity of the means-subtracted series may become difficult in practice if the generating process also contains a stationary stochastic seasonal component. Simulation results based on data generated from the model
$\Delta_4 z_t = (1 - \theta L^4)\varepsilon_t$, (4)
show that when $\theta$ is neither close to one nor to zero the autocorrelogram of the means-subtracted series, $w_t$, takes the appearance of a stationary series, in contrast to the correlogram of $z_t$ itself. [Details are given in Abeysinghe (1990).] As we shall see later, the assumption that $w_t$ is stationary may lead to spurious regressions. It should be noted, however, that as $\theta \to 1$ near cancellation of factors in (4) takes place, and (4) reduces to a deterministic seasonal model [Abraham and Box (1978)].
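The effect of means-subtraction on the biannual example can be checked numerically. The sketch below is an illustration rather than the paper's own computation. It assumes the biannual process is the seasonal random walk $z_t = z_{t-2} + \varepsilon_t$ with unit innovation variance, and the starting values and function names are hypothetical. It verifies that each means-subtracted value is a fixed weighted sum of the $N$ innovations of its season, with weight $1(j \le k) - (N - j + 1)/N$ on the $j$-th innovation in year $k$, and computes the implied correlation between values of the same season $k$ years apart.

```python
import numpy as np

def seasonal_random_walk(eps, s, start):
    """Solve z_t = z_{t-s} + eps_t forward; start[i] is the starting value of season i."""
    z = np.empty(len(eps))
    for t in range(len(eps)):
        previous = z[t - s] if t >= s else start[t % s]
        z[t] = previous + eps[t]
    return z

def subtract_seasonal_means(z, s):
    """Deseasonalize by removing each season's sample mean (a seasonal dummy regression)."""
    w = np.array(z, dtype=float)
    for i in range(s):
        w[i::s] -= w[i::s].mean()
    return w

def weights(N):
    """W[k-1, j-1] = 1(j <= k) - (N - j + 1)/N, for years k and innovations j = 1..N."""
    j = np.arange(1, N + 1)
    return (j[None, :] <= j[:, None]).astype(float) - (N - j + 1)[None, :] / N

def same_season_corr(N, k=1):
    """corr between means-subtracted values of one season k years apart (unit variance)."""
    W = weights(N)
    cov = W @ W.T
    sd = np.sqrt(np.diag(cov))
    year = np.arange(k, N)
    return cov[year, year - k] / (sd[year] * sd[year - k])

if __name__ == "__main__":
    N, s = 200, 2
    rng = np.random.default_rng(0)
    eps = rng.standard_normal(N * s)
    z = seasonal_random_walk(eps, s, start=[5.0, -5.0])  # assumed starting values
    w = subtract_seasonal_means(z, s)
    print("weighted-sum identity holds:", np.allclose(w[0::s], weights(N) @ eps[0::s]))
    print("smallest one-year-apart correlation:", same_season_corr(N).min())
```

The starting values drop out of $w_t$ entirely, while the correlation between neighbouring years of the same season stays close to one. The subtraction changes the covariance structure but leaves the series strongly persistent.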
3. Spurious regression phenomenon
Often the regression model
$y_t = \beta_0 + \sum_{i=1}^{s-1} \beta_i D_{it} + \gamma x_t + u_t$, (5)
is formulated to test the hypothesis $H_0$: $\gamma = 0$. Lovell (1963) has shown that (5) and the regression
$y'_t = \gamma x'_t + u_t$ (6)
produce identical OLS estimates of $\gamma$ and $u_t$. In (6) $y'_t$ and $x'_t$ are the series deseasonalized by OLS regressions of $y_t$ and $x_t$ on seasonal dummies. Note that if $y_t$ and $x_t$ are seasonally integrated series then $y'_t$ and $x'_t$ will also be seasonally integrated, like the $w_t$ series derived earlier. The regressions (5) or (6) are, therefore, very likely to produce spurious results. Some Monte Carlo results are reported below to highlight the magnitude of the spurious regression problem. The simulation is carried out with quarterly series. As $S(L) = (1 + L^2)(1 + L)$, three seasonal cycles are possible [HEGY (1990)]:
1. Frequency $\pi/2$ (one cycle per year): $(1 + L^2) z_t = \varepsilon_t$. (7)
2. Frequency $\pi$ (two cycles per year): $(1 + L) z_t = \varepsilon_t$. (8)
3. Frequencies of $\pi/2$ and $\pi$: $S(L) z_t = \varepsilon_t$. (9)
These three processes are used to generate $y_t$ and $x_t$ independently of each other using $N(0, 1)$ innovations and zero initial values. As $y_t$ and $x_t$ are independent processes, $\gamma = 0$ in the regressions (5) and (6). Spurious regression exists, therefore, if $H_0$: $\gamma = 0$ is rejected. This hypothesis is tested against $H_1$: $\gamma \neq 0$ by fitting OLS regressions to (5) and computing the standard $t$ ratios. The frequencies of rejecting $H_0$: $\gamma = 0$ at the 5 percent level based on 500 independent replications are reported in table 1. The results in table 1 are only suggestive, since in practice more complicated seasonal structures could occur.
Table 1
Frequencies (%) of rejecting the true hypothesis that $y$ and $x$ are unrelated when OLS regressions are used on seasonally integrated series. Quarterly series, 500 replications. Sample sizes (60, 100). The nominal level of significance used is 5%.
[Table entries not recoverable from the source.]
As can be seen from the table, spurious regression is less likely when $y_t$ and $x_t$ are integrated at different frequencies. This is as expected since innovations in $x_t$ and $u_t$ in (5) occur at different time lags. Similar results should hold when seasonality in $x$ is deterministic. The rejection frequencies are in general higher when the two series are integrated at the same frequency. The rejection frequencies increase as sample size increases. Increasing the sample size, therefore, does not solve the problem of spurious regression.
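A Monte Carlo experiment of the kind described in this section can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code behind table 1. The three processes are taken to be $(1 + L^2) z_t = \varepsilon_t$, $(1 + L) z_t = \varepsilon_t$ and $S(L) z_t = \varepsilon_t$ with $N(0, 1)$ innovations and zero initial values. The regression includes an intercept and three quarterly dummies, and the 5% critical value is approximated by the normal value 1.96. All function and label names are hypothetical.

```python
import numpy as np

# Autoregressive form z_t = sum_k c_k z_{t-k} + e_t of each assumed process.
AR_COEFFS = {
    "half_pi": [0.0, -1.0],       # (1 + L^2) z_t = e_t
    "pi": [-1.0],                 # (1 + L) z_t = e_t
    "both": [-1.0, -1.0, -1.0],   # S(L) z_t = (1 + L + L^2 + L^3) z_t = e_t
}

def simulate(kind, n, rng):
    """Generate n observations with N(0, 1) innovations and zero initial values."""
    coeffs = AR_COEFFS[kind]
    e = rng.standard_normal(n)
    z = np.zeros(n)
    for t in range(n):
        z[t] = e[t] + sum(c * z[t - k]
                          for k, c in enumerate(coeffs, start=1) if t - k >= 0)
    return z

def t_ratio(y, x):
    """OLS t ratio on x in a regression of y on an intercept, 3 quarterly dummies and x."""
    n = len(y)
    quarter = np.arange(n) % 4
    dummies = (quarter[:, None] == np.arange(1, 4)[None, :]).astype(float)
    X = np.column_stack([np.ones(n), dummies, x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (n - X.shape[1])
    var_gamma = s2 * np.linalg.inv(X.T @ X)[-1, -1]
    return coef[-1] / np.sqrt(var_gamma)

def rejection_frequency(kind_y, kind_x, n=100, reps=500, crit=1.96, seed=0):
    """Share of replications in which |t| exceeds crit although y and x are independent."""
    rng = np.random.default_rng(seed)
    rejections = sum(
        abs(t_ratio(simulate(kind_y, n, rng), simulate(kind_x, n, rng))) > crit
        for _ in range(reps)
    )
    return rejections / reps

if __name__ == "__main__":
    for kind_y in AR_COEFFS:
        for kind_x in AR_COEFFS:
            freq = rejection_frequency(kind_y, kind_x)
            print(f"y: {kind_y:8s} x: {kind_x:8s} rejection frequency: {freq:.3f}")
```

Comparing a pair integrated at the same frequency with a pair integrated at different frequencies, for example `rejection_frequency("pi", "pi")` against `rejection_frequency("half_pi", "pi")`, gives a quick check of the qualitative pattern described above. The exact figures depend on the seed and on the assumptions listed.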
4. An application
As an application the U.K. non-durable consumption expenditure series (quarterly, 1955.I-1988.II, at 1987 prices) is examined below. The log consumption shows a very regular seasonal pattern [see HEGY (1990) for a plot]. Despite this regularity HEGY find that the log consumption is integrated at zero as well as seasonal frequencies. An ARIMA model of order $(0, 1, 0)(0, 1, 1)_4$ fits this series well. The ML estimate of $\theta$ is 0.62. Fitting a structural time series model with exact ML yields $\sigma^2(\text{seasonal}) = 3.7 \times 10^{-6}$ with $t$-ratio = 2.54 [see Harvey (1989)]. All these indicate that the seasonality of the log consumption is best modelled as stochastic. Following Pierce (1978) a deterministic seasonal model can be constructed as
[Equation (10) not recoverable from the source.]
where $y_t$ is log consumption. The autocorrelogram of $w_t$ seems to suggest that $w_t$ is stationary, possibly a seasonal AR(2) process. Following the arguments of Section 2 this can be expected as $\theta = 0.62$. To examine the appropriateness of deterministic modelling, an artificial series $x_t$ was added to (10), which was then reestimated assuming two error structures:
W,= @,W,_, +
The $x_t$ series was generated by one of (7) to (9). Based on 50 independent replications of $x_t$ in each case, we observed the following frequencies (%) of rejecting the true hypothesis of no relation at the 5% level. The estimation was carried out with AUTOREG in SAS.
are by means achieve satisfactory
small. indicate Alternatively x,
(ii) These necessary
long lag not be seasonally
for W, integrated
5. Conclusion
In many cases deterministic seasonal models combined with a stationary stochastic seasonal component may provide a good approximation. However, if a series is in fact integrated, starting with a deterministic model is more likely to produce spurious results than to uncover the integratedness of a series.
References
Abeysinghe, T., 1990, Inappropriate use of seasonal dummies in regression, Working paper no. 2 (National University of Singapore, Singapore).
Abraham, B. and G.E.P. Box, 1978, Deterministic and forecast-adaptive time dependent models, Applied Statistics 27, 120-130.
Harvey, A.C., 1989, Forecasting, structural time series models and the Kalman filter (Cambridge University Press, Cambridge).
Hylleberg, S., R.F. Engle, C.W.J. Granger and B.S. Yoo, 1990, Seasonal integration and cointegration, Journal of Econometrics 44, 215-238.
Lovell, M.C., 1963, Seasonal adjustment of economic time series, Journal of the American Statistical Association 58, 993-1010.
Pierce, D.A., 1978, Seasonal adjustment when both deterministic and stochastic seasonality are present, in: A. Zellner, ed., Seasonal analysis of economic time series (U.S. Department of Commerce, Washington, DC) 242-269.