JOURNAL OF
Econometrics Journal
EIXWIER
of Econometrics
68 (1995) 303338
Estimating the asymptotic covariance matrix for quantile regression models A Monte Carlo study Moshe Buchinsky Department
of Economics, Yale University. New Haven, CT 065208264, USA (Received
July 1991; final version
received April 1994)
Abstract This Monte Carlo study examines several estimation procedures of the asymptotic covariance matrix in the quantile and censored quantile regression models: design matrix bootstrap, error bootstrapping, order statistic, sigma bootstrap, homoskedastic kernel. and heteroskedastic kernel. The Monte Carlo samples are drawn from two alternative data sets: (a) the unaltered Current Population Survey (CPS) for 1987 and (b) this CPS data with independence between error term and regressors imposed. This special setup allows one to evaluate the estimators under various realistic scenarios. The results favor the design bootstrap for the general case, but also support the order statistic when the error term is independent of the regressors. Key words: Quantile and censored quantile JEL classification: C13; C14; C15; C24
regression;
Asymptotic
covariance
matrix
1. Introduction
Quantile regression models have gained considerable interest recently, especially in theoretical discussions. The availability of efficient linear programming algorithms together with the rapid development of computers have opened the
This paper is, in part, from chapter 2 of my dissertation ‘The Theory and Practice of Quantile Regression’, Harvard University, 1991. I wish to thank Don Andrews, Joshua Angrist, Chris Cavanagh, Paul Gunther, Bo Honor& three anonymous referees, and especially Jim Powell for many discussions and comments. Comments made by the participants in various universities’ Gary Chamberlain error is, of course,
03044076/95/$09.50 SSDl 030440769401652
mine.
1995 Elsevier Science S.A.
304
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
gates for more common use of such models in empirical applications. Of special interest in estimation of the asymptotic covariance matrix for quantile regression estimators. This paper reports on a Monte Carlo study, using real data, of various order statistic, bootstrap, and kernel estimators for both the quantile and the censored quantile regression models. Much recent literature has been devoted to the theoretical aspects of quantile regression estimators, and in general, robust estimators.’ Nevertheless, little attention has been given to the practical problems of determining the estimators’ precision. In limited experiments relating to median regression Dielman and Pfaffenberger (1986,1988) compared the nominal size and power of a Wald test to the empirical Monte Carlo size and power, while Stangerhaus (1987) investigated aspects related to computation of confidence intervals. A common assumption in these studies in that the error (uO) density function at zero is independent of the covariate vector x (i.e.,fU,(0)x) =f.,(O) for all x). Under this assumption the regressors and the error term for the Monte Carlo samples were drawn independently from hypothetical distributions. In addition, the dependent variable y was constructed from x,
[email protected], and an assumed coefficient vector Be. No investigation have been made into practical problems associated with estimation in the more general case, whenf,,(Olx) #f&(O). In this study the Monte Carlo samples are not drawn from a hypothetical distribution but rather from the actual Current Population Survey (CPS) for 1987. Moreover, samples are drawn from the joint distribution of x and y, so that the distribution of the error term is that which is actually contained in the data. One can thereby investigate several issues significant for empirical application. Various techniques that are consistent in the general case can be examined for an actual important data set. In addition, since the CPS data on wages are available only in a discrete form, one can examine the importance in practice of the continuity assumption for the error term density. Finally, we can also examine practical issues relating, for example, to the bootstrap sample size, the kernel estimators bandwidths, etc. The paper is organized as follows: Section 2 briefly introduces the quantile and censored quantile regression models. The estimators examined in this study are presented in Section 3. The Monte Carlo simulation setup and explanations about some subjective choices for the order statistic and the bootstrap estimators are discussed in Section 4. Section 5 studies the nature of the heteroskedasticity in the CPS data set. The results of the Monte Carlo simulations are presented and discussed in Section 6, while Section 7 discusses the standard errors for the various estimation schemes. Section 8 presents conclusions and remarks.
1 See, for example, studies by Bickel (1973), Carroll (1989), and Newey and Powell (1990).
and Ruppert
(1984), Portnoy
and Koenker
M.
BuchinskylJournal
oj’ Econometrics
68 (1995)
303338
305
2. Quantile and censored quantile regressions
2.1. Quantile reyression The conditional written as
quantile of Koenker
and Bassett’s (1978) model can be
Qe(Yilxi) = X~PO, i = 1, . . . ,n,
(1)
where fi8 and xi are K x 1 vectors and Xii = 1. The error us = y  x’/& is assumed to have a continuously differentiable c.d.f., F,,(  1x), and density functionf,,( . Ix). An estimator for be is obtained from
minf ,C i4dyi 8
.db),
(2)
I1
where
[email protected](I)= (0  I(,? < O)),? is the check function and I(A) is the usual indicator function. The problem in (2) is a linear programming problem (see, e.g.. Koenker and Bassett, 1978). Powell (1984,1986) showed that (2) fits into the generalized method of moments (GMM) framework for the censored quantile regression model, under Huber’s (1967) conditions. In particular one can show that
where /10 = U(1  H)(E[,f,,(0~x)xx’])‘E[.~x’](E[f~,(0~.~).~x’])‘. If the density of ue, at 0, is independent simplifies to A
=
0
et1  0)(E[xx’])’
f’o
= o;(E[xx’])
of x, i.e., jJ0l.x)
(3 =&(O), then &
‘,
U# where fI(l  0) cH2=m.
(5)
2.2. Censored quantile regression The censored quantile regression model, which allows one to deal with censored data, was proposed by Powell (1984,1986) and can be written similarly
M. BuchinskylJournal
306
of Econometrics 68 (1995j 303338
to (1) as
[email protected](ylx) = min(y’, x’Pe), where y” is the censoring value. A consistent estimator rffefor /$ is obtained as a solution to mjn $ ,i
PO(Yi

min(y”,
XXV).
(6)
1l
If follows that
where /I; = 0(1  G)(E[fU,(O(x)1(x’/3, < yO)xx’])’ E[r(x’/?, < y’)xx’] x (ECfue(Olx)l(x’Be <
Y~).=‘I)
‘.
(7)
= af(E[Z(x’/& < y’)xx’])‘.
(8)
If~U,(Olx)=,fu,(O), then n: simplifies to /ip = &l  0) (E[~(x’/$ < yO)xx’])’ 0 f’ 4
Since min (y’, x$) is not linear in /I, the problem in (6) is not an LP problem. Nonetheless, an LP algorithm  which I call the iterative linear programming algorithm (ILPA)  can be used in the following manner: In thejth iteration we solve BP’ using only the observations for which x$r ‘) < Y’.~ The algorithm is terminated when the sets of observations in two consecutive iterations are the same. Convergence to a solution is not guaranteed, but if it is achieved then a local minimum is obtained.3
3. Estimators for the asymptotic covariance matrix Several estimators for the asymptotic covariance matrix for
[email protected] examined: (a) order statistic, (b) design matrix bootstrap, (c) error bootstrapping, (d) sigma bootstrapping, (e) general kernel, and (f) homoskedastic kernel, with several variations for each of these estimators.
’ Dropping the observations for which
[email protected] 2 y0 is of no consequence y,  y’, which does not depend on B8.
since yi  min(ya, xi/&) =
3 For a proof see Buchinsky (1991). Typically, for the CPS data set, convergence has occurred in five to twenty iterations regardless of the starting value pi. The starting value in all the estimations here is the median regression estimate, with the constant properly adjusted. Similar algorithm was suggested by Osborne and Watson (1971). Other algorithms for nonlinear I, regression were suggested by Womersley (1986) and Koenker and Park (1993).
M. Buchinsky/Journal
of Econometrics 68 (1995) 303338
307
3.1. Order statistic estimator (OS) This estimator is valid only under the independence assumption (i.e., for the covariance matrices in (4) and (8)). An estimator for ai in (5) can be obtained by matching an exact confidence interval for the 0th quantile, pe, from a binomial distribution, with a confidence interval using a normal approximation. From the binomial distribution B(n, Q) and its limit for large n we have Pr(U,,, d pe d U,,,) = Pr(s < XB < t)
where X,  B(n,@, U(,,,) denotes the sample mthorder statistic, s = [no  11, t = [nH + 11, and [Ii] is the integer part of A.Denoting by Z, the qth quantile of a standard normal variable, the normal approximation implies that for a symmetric 1  a confidence interval Pr& P,
 (acne lj2Z 1 s/2 @(Z1
a,2)

@(
G
z1
ps
d
Ofj +
o(Jn“2z,
_a,2)
a/2).
(10)
Equating the two confidence intervals from (9) and (10) gives an estimate for 02:4 II s; = n( O#)  0,,,)“/4z:
n
and
I= Z1 _a,2Jm,
(11)
where UCrnjis the mthorder statistic of r&i, . ,u*o,.5 Asymptotically the choice of Zr a,2 does not matter, but it does affect S,’ in small samples. In this study I use three values of Z1_o,2: 1.65, 1.96, and 2.57, corresponding to 1  a confidence intervals of 0.90, 0.95, and 0.99, respectively.
4An estimate & is provided then by A, = c?:((l/n)~“, ,x,x:)) ‘. For the censored quantile regression model an estimate S& for 0: is based only on the observations for which xi/$ i y” and 2,’ = B&((l/n)C:=, I(x{JO < y”)xixl)‘. ‘This estimator (see Huber, 1981) employs differences between two order statistics to estimate the derivative dFU;‘(t)/dt = l/f.,(FU;‘(t)). Estimating this reciprocal density at a point using order statistic was suggested by Siddiqui (1960) and investigated by Bloch and Gastwirth (1968) Bofinger (1975), Sheather and Maritz (1983) and Hall and Sheather (1988). This literature supports an for the estimator used here is r~r’~. optimal bandwidth rate of n“‘, while the rate of the bandwidth Koenker and Bassett (1982b) provide an alternative method for taking a discrete derivative of the empirical quantile function.
308
M. Buchinsky/Journal
of Econometrics
68 (1995) 303338
3.2. Design matrix bootstrap (DMB) estimator
This type of estimator, which was suggested initially by Efron (1979,1982) and attracted many researchers, is valid for the general case (i.e., for covariance matrices given in (3) and (7)).6 In this method an estimate for /ie is computed by
(12) where fly’, . . ,Bf’ are the B bootstrap estimates for be, for the B samples (each of size m) drawn from FnXY,the empirical joint distribution of x and y, and /Ii is some pivotal vector.’ Two alternative pivotal values are considered here: (a) /?;1= PO,,the estimate for the original Monte Carlo sample (denoted by DMBE), and (b) Bi = (l/S)?;= rflg’, the average of the bootstrap estimates (denoted by DMBA). In addttlon, the direct percentile method (referred to as DMBP) is considered. In this method the upper and lower bounds of a 1  5 confidence interval are taken to be  element by element  the [X/2] and [B(l  t/2)] order statistics, respectively, of the bootstrap estimates. 3.3. Error bootstrapping
(EB)
The consistency of this estimation procedure relies on the independence assumption. Instead of resampling fromf,,, it is based on separate resampling of m observations e: and XT (j = 1, . ,m) from the empirical distribution functions of ue, F,,;, (based on GO,, . . . , Con,u*eiE yi  xi/&), and of x, F,,,, respectively. The dependent variable is computed by y; = xf’/$ + e;. Three alternative computation methods, identical to those for the DMB procedures, are employed here as well. These are denoted by EBE (for /I: = Be), EBA (for the average of the bootstrap estimates), and EBP (for the percentile method). In the censored quantile regression model F,,,( .) cannot be estimated in the usual way since some of the residuals u*8iare censored. Instead, I use Kaplan and
‘jThere is little theoretical justification for the use of the bootstrap method in econometric models. The bootstrap method for quantile regression was first implemented by Buchinsky (1994) and was adopted also by Chamberlain (1994). Several authors considered bootstrapping the sample quantile (e.g., Efron, 1979, 1982; Bickel and Freedman, 1981; Singh, 1981; Lo, 1989). Hahn (1992) proved the consistency of the bootstrap estimator using the percentile method described below. Buchinsky (1992) justified the use of the bootstrap method for the special case of discrete x’s, Justification of bootstrap estimator for the mean regression is provided in Freedman (1981), while Yang (1985) considered a general class of differentiable functionals. ‘The resampling schemes for the quantile and censored quantile regression models are essentially the same. The only difference is the manner in which the bootstrap estimates are obtained.
M.
BuchinskylJournal
cf Econometrics
68 (1995)
303338
309
Meier’s (1985) consistent estimator F,;,(t) = 1 
n
(1  Ij),
Uj
where 5 = nj/rj. nj is the number of uncensored observations  possibly only one  with I&, = Uj, and rj is the total number of observations for which tiei d Uj. 3.4, Sigma bootstrap estimator (SBE) This estimator also relies on the independence assumption. Here, ai in (4) and (8) is estimated directly using a bootstrap method. The estimate is given by (13) where & is the 0th quantile of r&r, . . . , lion and Gr), . . . ,Gp’ are B bootstrap estimates from B samples (each of size m) drawn from F,tB.’ 3.5. General kernel (GK) estimator Powell (1986) considered a oneside uniform kernel estimator for the general case covariance matrix. The estimates for the terms of the covariance matrix in (7) are provided by d^, = t ,$ rl
Z(Xifie
< yO)XiXi
(14)
for EII(x’bB CyO)xx’], and Afx =
(C,n)’
i
I(X:bo
<
yO)Z(O< fe, d
C,)XiXi
(15)
i=l
for E[l(x’be < y”)fU,(O1x)xx’], where c, = op(l) is the kernel ‘bandwidth’.’ The present study also considers an alternative normal kernel estimator given by n
d,,
= dLL 2
Z(X$~
<
y’)exp{  U^~,/(~C~)}~(L&,
3 0)X,X;,
(16)
i=l
*Note that by the linear programming representation of quantile regression, Q0 = 0 always. An estimator for A, is provided by /i, = B~~((l/n)C”_,xixl)~‘. For the censord quantile regression model F,, is constructed using the KaplanMeier procedure, and 2: = c?&,((l/n) x ^ I;=, f(x:j?o < yO)xix;) ‘. 9This estimator can be easily modified to accommodate the quantile regression model, merely by dropping the indicator function I(xifiO < y’) from the formulas in (14) and (15).
M. BuchinskylJournal
310
of Econometrics 68 (1995) 303338
where d, = nc,$/2. For both the uniform and normal kernels, twoside estimators are also considered wherein the indicator function in (15) is replaced with I(  42 < &, d c,/2), the indicator function in (16) is dropped, and d, in (16) is now d, = nc,,/‘%. 3.6. Homoskedastic
kernel (HK) estimator
Under the independence assumption one can estimatef,,(O) and substitute into 0; in (5) to get an estimate for As, A, = (tI(1 [email protected]/j;(O)) (6 ’ Cf= 1 Xix;) ‘. Powell (1986) suggested as a consistent oneside uniform kernel estimator for _L,(O),
i,(O)= c, l
ljl m$, < (i$lwB8
yO)Z(O < lie, <
CJ,
for the censored quantile regression model. lo Both one and twoside uniform and normal kernel estimators are investigated in this study.
4. The Monte Carlo simulation
setup
4. I. ‘Population ’ sample
The Monte Carlo samples are drawn from the actual 1987 CPS sample (containing 75,578 observations) of all outgoing rotation groups of adult white males. This special situation enables one to compare small sample estimates with their ‘population’ counterparts. For this study I computed the (K x 1) coefficient vectors /[email protected] a wage equation at the 0.10, 0.25, 0.50, 0.75, and 0.90 quantiles. The log of usual weekly earnings is the dependent variable, and the independent variables consist of: (i) a constant, (ii) education, (iii) potential experience, and (iv) potential experience squared.’ ’ The five ‘population’ coefficient vectors were estimated twice. First a quantile regression was estimated, using the Barrodale and Roberts (1973) algorithm, assuming no censoring problem.” Secondly, in order to study the effect of censoring, an extreme censoring value of $750 was artificially imposed on all
“For
the quantile
regression
model the estimator
is&”
= (c.n)) ‘~~=, I(0 < lie, < c.).
r’ Education = higher grade attended  1  I(last grade was not completed), where indicator function. Potential experience = min(age  education  6, age  18). 12The BarrodaleeRoberts algorithm quantile (e.g., Koenker and D’Orey,
I
is for the median 1987).
regression,
I( .) is the
but is easily modified
for any
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
311
weekly earnings of $750 or more.i3 Censoring then becomes a severe problem, even for low quantiles. The censored quantile regression is estimated using the ILPA. 4.2. Monte
Carlo
simulations
scheme
In this definition of the ‘population’ and the ‘population parameters’ the distribution of the error term is not assumed, but rather is given its actual value. Consequently, the error term ue is not independent, in general, of the regressors x (see the next section). In order to examine the performance of the alternative methods under different assumed relationships between the error term and the regressors, two sets of simulations are carried out. In the first set, the Monte Carlo samples, each of size n, are drawn from the (x, y) pairs of the ‘CPS population’; in this case u0 depends on x. In the second set, the Monte Carlo samples are drawn from the set of x’s and t+,‘s independently, while the y’s are computed from x, uO,and the population parameters. While ug is independe,nt of x, its marginal distribution remains the same as for the first case, i.e., the empirical distribution of the ‘population errors’ r&i, . . . , tie,. Under both schemes a coefficient vector is estimated for the Monte Carlo sample, along with covariance matrices using each of the alternative procedures. I then construct 0.95 confidence intervals for each population parameter using each of the above estimated standard errors and check whether or not it contains the corresponding population parameter. This procedure is repeated N,,,, times. The numbers reported in each of the tables below give the empirical levels of the Monte Carlo simulations, i.e., the fraction of times in which the computed confidence intervals contained a particular population coefficient.14 Since the Monte Carlo repetitions are independent, the variance of an empirical level filk is
var(hk)= j+
plk t1  plk),
(17)
mc
for every estimator I (I = 1, . . . , L) and every population coefficient fiOk(k = 1> ... 3K). Standard errors for typical empirical levels and Monte Carlo samples sizes are reported in Table 12.
I3 The actual censoring value in the CPS data set is $999, with 5,474 observations (7.2% of the total number of observations) having this value as their usual weekly earnings. 12,196 observations (16.1%) take the value $750 after the artificial censoring at that value is imposed. t4The confidence intervals are computed separately regions for the population parameter vector.
for each coefficient;
these are not confidence
M. BuchinskylJournal
312
of Econometrics 68 (1995) 303338
4.3. Choosing o!for the order statistic estimator
To illustrate the impact of the choice of c( on the estimate ag, a sample of 500 observations was randomly drawn from the CPS sample (with a censoring value artificially set at $750). ce is then estimated for values of Z1 _a,2 ranging from 1.5 to 2.7 (corresponding to 1  c( ranging from 0.86 to 0.99), at the 0.10,0.50, and 0.90 quantiles. The estimates of de for 8 = 0.10, 0.50, 0.90 graphed in Figs. lalc show the sensitivity of ciOto the confidence interval level.15 At the 0.10 quantile the 8,‘s range from 1.02 to slightly over 1.22. This variation is even larger at the 0.90 quantile, where the &‘s differ by a factor of about 1.5. Even for confidence intervals of 0.95 to 0.99 (Z, _n,2 = 1.96 to 2.7) the variability in rie is quite significant, especially at the extreme quantiles. While censoring creates a problem at the 0.90 quantile, and to lesser extent at the 0.50 quantile, it causes no problem at the 0.10 quantile. Therefore, the sensitivity of the [email protected] at the 0.90 quantile cannot be solely attributed to censoring, although it is certainly magnified by it. 4.4. Choosing the number of bootstrap repetitions
When using the bootstrap technique one needs to choose: (a) the number of repetitions B and (b) the bootstrap sample size m. In the current study only the effect of the latter choice is examined, while the number of repetitions is based on a preliminary examination of the behavior of the bootstrap estimates. In the following example I use a sample of size 100 drawn from the 1987 annual CPS (with top coding value set at $7501, and I estimate the coefficient vector and a set of 150 bootstrap estimates (each of size 100). Note that with B bootstrap estimates at hand, one can also estimate the covariance matrix for /?e, using the fourthorder moments. Denote each element of the covariance matrix estimate in (12) (divided by n) by Vj, that is,
Let Ujdenote the vector of the stacked columns of the lower triangular matrix of Vj, i.e., Uj= vecl(Vj), and let u = vecl(%(/&)). Then an estimate of the covariance matrix of the covariance matrix estimate is given by
Gi
(vG(be)) =
$
,e  (Uj
U)(Uj
U)‘.
(19)
J1
I5 Note from (11) that for s = [no  I] and t = [no + j] the numerator thresholds as G(increases, while the denominator increases continuously the ‘ratchet effect’ observed in Figs. lalc.
of $8’ changes only at certain as a increases. This leads to
M. Buchinsky/.lournal
of Econometrics
68 (1995) 3035338
a. JO Quantllo ,24Slomo
I 22

1.20 
I ooruILtLLucLuL:“‘~I~‘~~!“~‘:“‘~~’~~~:~~~ I5
lb
17
I8
IV
Siondora
2 norm01
2.1
22
2.3
dlslrtb~llon
24
25
26
25
26
values
b.30 Quantile 0.62
Sigma
0.60 0.58 
Standard
normal
dlstrlbullon
values
o.Jellll~:1111:~~~~~:~~~~:~~~~~ 1.5
I.6
I7
I8
l.V
Slondord
Fig. 1. vg estimates
for censored
2 normal
21
2.2
23
dlnrloutlon
quantile
regression,
24
values
with sample size 500.
313
314
M. BuchinskyJJournalof Econometrics68 (1995) 303338
Since the standard error is a simple function of the variance, one can use the delta method to show that the asymptotic standard error, se,( .), of the estimated standard error,3 (  ), for an element in &, is given by
(20) where va( .) denotes the asymptotic variance and var( .) denotes the variance. Figs. 2a2d depict the standard errors, along with their standard errors, as functions of the number of bootstrap repetitions for two coefficients at the 0.50 and 0.90 quantiles. Two important points are clear: (a) the bootstrap estimates stabilize quickly and have relatively small standard errors, and (b) the bootstrap estimates are smoother and stabilize faster when censoring is not a problem (Figs. 2a and 2c) than when it presents a severe problem (Figs. 2b and 2d). Based on preliminary examination of the bootstrap estimator for various sample sizes, the following rule was adopted in all the Monte Carlo experiments: 100 repetitions are carried out whenever the bootstrap sample size is less than 500 observations, and 50 repetitions for 500 or more observations.
5. Characterization 5.1.
of the heteroskedasticity
in the CPS data
General test for heteroskedasticity
When the error term ue is independent of x, the slope coefficients at different quantiles should be the same. Equality of the slope coefficients at the five estimated quantiles can be tested using the minimum distance (MD) framework. Under the null hypothesis of equality among the slope coefficients, the MD statistic has an asymptotic X2distribution, i.e., X2 = n(fly+
Gfl”)‘Apl(fl
x2((J l)(k
Gf?‘)
1))
(21)
wherep =(/I&, . . . , /I;,)’ is a stacked vector of J unrestricted parameter vectors, fi is its estimate, B” = (G’,?, ‘G)) r G’ ‘fl is the efficient MD estimator for the restricted parameter vector /3” = (/Ior, . . . , &, p2, . . , /Ik)‘, G is a suitable restriction matrix, and A, is a consistent estimator for AP, the covariance matrix of the unrestricted estimate fl. For two alternative order statistic estimates for A,, corresponding to CI= 0.05 and c( = 0.01, the x2 statistics were 2937.6 and 3198.5, respectively, well above any reasonable critical value. This clearly rejects the null hypothesis, indicating
;ia
0.022
22
22
0.027

35
St. Error
46
30
62
54

4b
Error
70
 SI. Error
reoettttons *
78
l
04
St. Errol
.Sb 102
70
quantile
 St. Error
01 re~lllOnl
b2
l
94
IO2
bootstrap
St. Error
Sb
repression.

78
Coefficient, .90 Ghantile
NO

No. 01
54
Fig. 2. Censored
St.
38
c. Experience
30
a. Education Coefficient, .90 Qnantile
estimates
110
110

0.02

38
2.1 Error
46
30
St.
38
62
70
 SI. Error

with sample
 St. Error
52 70 NO. of rq,etfto”s
sic
+
*
75
78
Coefficient,

NO. 01 re*tttlonr
S4
54
coellicients.
Error
46
d. Experience
30
for difkrcnt
22
+
0.04
0.141
22
b. Education Coefficient.
94
102
94
100.
+ St. Error
a5
102
SO Quantilo
f SI. Error
.%,
50 Quantile
110
110
316
M.
BuchinskylJournal of Econometrics68 (1995) 303338
that the distribution regressors.
of the error term ue depends significantly on the set of
5.2. KoenkerBassett
model of heteroskedasticity
In view of this result one would like to obtain information about the degree of heteroskedasticity in the CPS data for a particular model. For this purpose I employ the multiplicative heteroskedasticity model of Koenker and Bassett (1982a) wherein y = x’/I + u = x’fl + (T(X)&,and E is i.i.d. error independent of x.16 For a(x) = ( 1 + x’y) we get y = x’s + (1 + x’y)(uO + Q&)), where ue = E  [email protected](E). The conditional quantile can then be written as Qe(~lx)
= x’P + Qe(ulx)
= x’(D + YQ~(E)) + Qe(E) = x’de,
(23
where 60 = fi + (y + ei)Qe(s) and e; = (l,O, . . . ,O). In this setup /?I and y1 cannot be separately identified, as is also the case with the Qej(s)‘s. I therefore set y1 = 0, Q0.50(~)= 0, and estimate the remaining coefficients using the MD framework. Let 6(p) = (Sk,, . . . ,S&)‘, ,u = (PI, ... ,Pk,~z, ... ,YkYQe,(E), ... 9Qe,(&))‘3 where Q,,50 = 0. An efficient estimator for p is obtained from
min(Bmatrix is given by
As can be clearly seen from Table 1, the estimates large, and undoubtedly significant. This strongly suggests significant linear quantile regression specified
The results
This section evaluates
I6 I wish
M. Buchinsky/Journal Table 1 KoenkerBassett
heteroskedasticity
Coefficient
of Econometrics 68 (1995) 303338
model
Point estimate
Standard
and standard
error
0.912 0.06 1 0.045 0.001 0.072 0.058 0.001 1.338 0.630 0.544 0.925
431.275 9.061 5.646 0.097  1.025 2.261 0.05 1 99.61 I 46.943 39.834 12.660 Note: The coefficients
317
errors
are multiplied
fstatistic 473.16 149.62 126.89  100.09  14.22  38.87 37.34  74.46  74.57 73.19 78.60
by 100.
the first data set, given the heteroskedasticity in the CPS data, only the design matrix and general kernel estimators are consistent. For the second data set all estimators are consistent. The performance of each method is evaluated in terms of the departure of the empirical levels from the 0.95 nominal level. The results for the 0.25 and 0.75 quantiles are intermediate and are therefore omitted. The results at the 0.25 quantile fall between the 0.10 and 0.50 quantiles, but closer to the 0.50 quantile. Similarly, the results at the 0.75 quantile fall between the 0.90 and 0.50 quantile, and again closer to the 0.50 quantile. The values in each table are grouped by quantiles. The four lines in each group report the empirical level for the constant, the coefficient on education, and the coefficients on experience and experience squared, respectively. 6.1. Order statistic (OS) estimator The results for the order statistic estimator are reported in panels A and B of Table 2 for the quantile and censored quantile models using the original CPS data. Similarly, panels C and D report the results for the independent CPS data.’ 7 While the order statistic estimator is not consistent for the former data, its computational simplicity makes it quite attractive. Estimates of 6; based on a larger confidence interval yield larger (and closer to the 0.95 nominal level) empirical levels uniformly across all quantiles and parameter estimates. Nevertheless, the empirical levels for a particular E vary significantly both across quantiles and across the parameters tending to be smaller at the higher quantiles for all sample sizes. “The results for a = 0.05 fall between therefore omitted.
the two reported
results for a = 0.10 and c( = 0.01 and are
M. BuchinskylJournal
318
Table 2 Empirical
levels for order statistic,
Monte Carlo 100 0.10”
Panel A: QR
3000 repetitions
sample size 500
O.Olb
of Econometrics 68 (1995) 303338
0.10”
too
1000 O.Olb
Original
0.10”
O.Olb
sample
0.10”
500 O.Olb
Panel C: QR
0.10”
1000 O.Olb
Independent
0.10”
O.Olb
sample
0.10 quantile 0.852 0.836 0.855 0.835
0.977 0.977 0.972 0.956
0.907 0.912 0.893 0.853
0.929 0.927 0.918 0.879
0.909 0.912 0.883 0.850
0.926 0.931 0.903 0.877
0.868 0.875 0.871 0.88 1
0.977 0.977 0.975 0.974
0.93 1 0.933 0.932 0.932
0.946 0.946 0.947 0.948
0.917 0.92 1 0.925 0.925
0.936 0.944 0.944 0.945
0.910 0.908 0.891 0.862
0.917 0.924 0.896 0.861
0.921 0.915 0.895 0.867
0.921 0.913 0.894 0.867
0.901 0.903 0.918 0.916
0.913 0.913 0.928 0.925
0.934 0.934 0.928 0.925
0.943 0.938 0.935 0.934
0.935 0.935 0.939 0.937
0.932 0.937 0.936 0.938
0.869 0.911 0.841 0.851
0.887 0.925 0.860 0.871
0.835 0.880 0.839 0.850
0.856 0.900 0.859 0.867
0.807 0.804 0.808 0.810
0.926 0.923 0.93 1 0.928
0.918 0.925 0.922 0.922
0.939 0.937 0.932 0.937
0.919 0.921 0.913 0.921
0.934 0.939 0.929 0.937
0.50 quantile 0.892 0.908 0.870 0.843
0.900 0.910 0.893 0.855
0.90 quantile 0.757 0.810 0.727 0.731
0.888 0.922 0.867 0.871
Panel B: CQR
Original
sample
Panel D: CQR ~ Independent
sample
0.10 quantile 0.837 0.864 0.793 0.771
0.973 0.968 0.959 0.936
0.901 0.904 0.911 0.883
0.920 0.902 0.942 0.926
0.911 0.912 0.897 0.874
0.939 0.929 0.917 0.903
0.867 0.860 0.863 0.859
0.971 0.967 0.969 0.964
0.934 0.937 0.933 0.934
0.952 0.953 0.947 0.947
0.923 0.915 0.922 0.920
0.942 0.935 0.938 0.938
0.849 0.846 0.843 0.818
0.873 0.858 0.856 0.833
0.869 0.862 0.858 0.822
0.880 0.870 0.861 0.825
0.891 0.882 0.873 0.867
0.913 0.904 0.905 0.904
0.895 0.889 0.899 0.896
0.906 0.902 0.913 0.910
0.910 0.905 0.892 0.895
0.917 0.913 0.908 0.907
0.774 0.813 0.756 0.776
0.827 0.860 0.826 0.844
0.795 0.852 0.783 0.808
0.842 0.862 0.842 0.849
0.718 0.721 0.744 0.769
0.750 0.745 0.776 0.804
0.792 0.790 0.802 0.807
0.849 0.856 0.861 0.866
0.834 0.839 0.830 0.845
0.868 0.876 0.868 0.870
0.50 quantile 0.849 0.839 0.841 0.8 13
0.868 0.869 0.882 0.846
0.90 quantile 0.727 0.773 0.679 0.737
0.798 0.826 0.758 0.812
Note: The basic population is the 1987 annual CPS for white males (75,578 observations). dependent variable is log usual earnings. The covariates are constant, education, experience, experience squared. ‘LX= 0.10, corresponds ‘x = 0.01, corresponds
to Z1 _ll,z = 1.645. to Z, _z,z = 2.576.
The and
M.
Buchinsky/.lournal
of Econometrics
68 (1995)
303338
319
Comparison of panels A and B shows that the empirical levels for the censored quantile regression are slightly lower than for the quantile regression, mostly at the 0.90 quantile which is affected severely by censoring. In contrast, at the 0.10 quantile, which is unaffected by censoring, the empirical levels are comparable. In general, the empirical levels for the larger sample sizes are closer to the 0.95 nominal level across all quantiles, most noticeably at the 0.90 quantile. The fact that for a given LX the performance of the OS estimator is enhanced only slightly as the sample size increases, suggests that it converges rapidly to its population value. It is not the appropriate covariance matrix, however, for the original CPS data. This is verified by the results reported in panels C and D using the independent CPS data. The performance of the OS improves significantly at all quantiles, for all levels of a and for all sample sizes. Also the OS yields, in general, better results for the education and the constant coefficients than for the two experience coefficients. In summary, the OS performs reasonably well when the independence assumption is satisfied. In fact, as will be clear from the results below, it is the most reliable among the estimators that are valid only under the independence assumption, This is significant since the computation time required for this estimator is a few seconds, while the time for the bootstrap methods is rather lengthy. 6.2. Sigma bootstrapping
(SB) estimator
The performance of the SB estimator is examined for both the original and the independent CPS data, even though it is valid only for the latter. The results are reported in Table 3. The first three columns in each panel pertain to a sample size of 100 with bootstrap sample sizes of 50,100, and 500. The last three columns are for a sample size of 500, with bootstrap sample sizes of 100, 500, and 1000. The table indicates that the SB estimator does not perform well for relatively small samples in either data set. Moreover, it is very sensitive to the size of the bootstrap sample even for the independent sample. Smaller bootstrap sample sizes yield empirical levels closer to the 0.95 nominal level. The SB performance improves significantly when the sample size increases, especially for the independent sample, providing reasonable empirical levels for small bootstrap sample sizes for data sample sizes of 500 (see panels C and D). The consistently low empirical levels imply that the SB estimator yields too small standard errors. Note, however, that the empirical levels are much closer to the 0.95 level for the independent data. The performance of the SB improves for large samples because F,,( .) is estimated more accurately. The reason for the decline in the empirical levels for large bootstrap sample sizes is more difficult to explain. There are two effects working in opposite directions: (a) as m becomes larger the bootstrap estimates
M. BuchinskylJournal
320
Table 3 Empirical
levels for sigma bootstrap,
Monte Carlo
1000 repetitions
sample size
100
500
Bootstrap 50
of Econometrics 68 (1995) 303338
100
500
sample size
100
Panel A: QR
500
100
Original
500
1000
sample
50
100
500
100
Panel C: QR  Independent
500
1000
sample
0.10 quantile 0.882 0.887 0.850 0.822
0.842 0.821 0.807 0.771
0.560 0.572 0.553 0.532
0.911 0.938 0.884 0.852
0.868 0.885 0.857 0.836
0.853 0.854 0.822 0.779
0.851 0.859 0.865 0.866
0.793. 0.786 0.802 0.800
0.546 0.554 0.551 0.562
0.941 0.935 0.931 0.924
0.906 0.898 0.888 0.868
0.867 0.869 0.849 0.842
0.748 0.746 0.688 0.664
0.924 0.926 0.900 0.862
0.920 0.902 0.874 0.833
0.888 0.863 0.854 0.829
0.915 0.920 0.906 0.896
0.883 0.882 0.884 0.866
0.764 0.743 0.746 0.737
0.938 0.937 0.936 0.938
0.912 0.904 0.917 0.915
0.896 0.885 0.903 0.900
0.512 0.569 0.503 0.490
0.852 0.893 0.835 0.830
0.798 0.856 0.801 0.810
0.787 0.832 0.768 0.762
0.825 0.824 0.862 0.864
0.764 0.766 0.803 0.805
0.561 0.569 0.594 0.586
0.910 0.920 0.933 0.928
0.881 0.878 0.902 0.902
0.855 0.855 0.872 0.872
0.50 quantile 0.901 0.891 0.857 0.837
0.865 0.873 0.827 0.810
0.90 quantile 0.792 0.808 0.768 0.757
0.721 0.765 0.724 0.727
Panel B: CQR
Original
sample
Panel D: CQR  Independent
sample
0.10 quantile 0.910 0.929 0.894 0.842
0.854 0.869 0.792 0.771
0.553 0.571 0.569 0.523
0.919 0.918 0.903 0.892
0.881 0.893 0.852 0.832
0.821 0.827 0.839 0.816
0.901 0.895 0.892 0.885
0.857 0.842 0.833 0.832
0.596 0.588 0.58 1 0.589
0.925 0.933 0.939 0.936
0.897 0.898 0.894 0.890
0.871 0.877 0.872 0.870
0.672 0.624 0.657 0.655
0.871 0.856 0.894 0.868
0.824 0.810 0.822 0.787
0.806 0.796 0.827 0.795
0.902 0.911 0.910 0.917
0.879 0.884 0.901 0.898
0.768 0.776 0.778 0.790
0.951 0.947 0.949 0.946
0.937 0.933 0.940 0.930
0.935 0.930 0.937 0.926
0.441 0.462 0.408 0.465
0.718 0.766 0.596 0.650
0.481 0.539 0.401 0.450
0.583 0.643 0.474 0.541
0.835 0.833 0.837 0.855
0.795 0.803 0.795 0.812
0.638 0.644 0.643 0.658
0.920 0.924 0.927 0.935
0.878 0.880 0.884 0.888
0.859 0.852 0.857 0.859
0.50 quantile 0.912 0.905 0.861 0.833
0.875 0.859 0.835 0.802
0.90 quantile 0.839 0.871 0.799 0.814
0.742 0.782 0.709 0.737
Note: See note to Table 2. Also, each bootstrap
estimate
is computed
based on 100 repetitions.
M. BuchinskyjJournal
of Econometrics 68 (1995) 303338
321
are more likely to be closer to each other, consequently leading to a smaller covariance estimate, and (b) larger m tends to increase the covariance matrix estimate since it multiplies the sum in (13). Apparently, the first effect dominates. In summary, the SB estimator performs reasonably well when the independent assumption, under which it is consistent, is satisfied. For data which do not satisfy the independence restriction, it seems to perform poorly. However, the SB estimator is easy to compute  on the order of minutes, significantly less than for the other bootstrap estimators. 6.3. Design matrix bootstrap (DMB) estimator The results for the DMB estimator are summarized in Tables 4 and 5 for sample sizes of 100 and 500 observations, respectively. Panel A of each table reports the results for the quantile regression model, while panel B is for the censored quantile regression model. The first six columns in each panel report the results for the two sample sizes using the original CPS data for the DMBE, DMBA, and DMBP. Similarly, the last six columns report the results for the independent CPS data. The DMB estimators yields empirical levels very close to the 0.95 nominal level, usually within approximately one standard error. The DMBE is the most precise, while the DMBP yields the lowest empirical levels. This suggests that 100 bootstrap repetitions may be insufficient for directly constructing confidence intervals. All three versions of the DMB estimator perform equally well for the original and the independent data, and for both models. This is encouraging as one would like to guard against possible heteroskedasticity, but not at the expense of poor performance when independence is actually satisfied. Another apparent result is the robustness of the DMB estimates to changes in the relative size of bootstrap to data sample size. The empirical levels for DMBE are virtually the same for the two bootstrap sample sizes. The other two estimators yield empirical levels slightly better for the smaller bootstrap sample size. A data sample size of 500 yields even better results (see Table 5). The performance of all three DMB procedures, in both panels of Table 5, is extremely good. At all quantiles and for all coefficients the empirical levels are near the 0.95 nominal level, with the largest deviation being 0.030. All three procedures are insensitive to the bootstrap sample size. This is of signal importance, especially for the censored quantile regression, since using a small bootstrap sample size reduces significantly the computational costs without affecting the covariance matrix estimate. (It takes 10 to 20 seconds to run a quantile regression with 4 variables atid 500 observations.) In simulations with sample sizes larger than 500 observations the DMB methods perform to complete satisfaction. The empirical levels for all the DMB methods are very close to the
M. BuchinskylJournal
322
Table 4 Empirical Original
levels for design matrix
bootstrap,
sample size 100, 1000 repetitions
Independent
CPS sample
sample size 500
100 A
Panel A
68 (1995) 303338
Monte Carlo
CPS sample
Bootstrap
E
qf Econometrics
P Quantile
100
E
500
A
P
E
A
P
E
A
P
regression
0.10 quantile 0.952 0.960 0.957 0.958
0.952 0.943 0.944 0.949
0.919 0.925 0.928 0.934
0.973 0.959 0.963 0.950
0.966 0.938 0.966 0.937
0.955 0.928 0.948 0.937
0.971 0.977 0.961 0.950
0.951 0.966 0.939 0.929
0.946 0.944 0.930 0.916
0.966 0.964 0.951 0.951
0.947 0.945 0.935 0.930
0.936 0.928 0.921 0.920
0.941 0.953 0.959 0.960
0.965 0.962 0.971 0.944
0.951 0.950 0.952 0.935
0.944 0.946 0.946 0.931
0.957 0.963 0.966 0.961
0.943 0.938 (I.951 0.941 0.954 0.951 0.950 0.944
0.963 0.964 0.952 0.950
0.950 0.954 0.943 0.941
0.944 0.934 0.941 0.933
0.941 0.965 0.905 0.929
0.953 0.955 0.947 0.943
0.929 0.940 0.943 0.942
0.909 0.922 0.907 0.922
0.963 0.955 0.948 0.943
0.949 0.939 0.930 0.927
0.941 0.925 0.929 0.920
0.944 0.949 0.940 0.939
0.929 0.921 0.924 0.929
0.906 0.905 0.902 0.904
0.50 quantile 0.972 0.962 0.967 0.969
0.956 0.943 0.964 0.951
0.90 quantile 0.964 0.970 0.950 0.949
0.946 0.956 0.917 0.940
Panel B  Censored
quantile
regression
0.10 quantile 0.961 0.961 0.931 0.942
0.932 0.952 0.909 0.922
0.934 0.937 0.908 0.912
0.931 0.914 0.891 0.909
0.912 0.891 0.892 0.891
0.875 0.850 0.865 0.860
0.958 0.957 0.961 0.968
0.935 0.940 0.941 0.950
0.936 0.942 0.935 0.938
0.967 0.960 0.975 0.966
0.956 0.947 0.962 0.953
0.942 0.932 0.946 0.945
0.908 0.932 0.972 0.947
0.941 0.932 0.956 0.945
0.906 0.924 0.947 0.925
0.893 0.903 0.927 0.902
0.972 0.966 0.970 0.968
0.959 0.957 0.958 0.958
0.953 0.947 0.957 0.953
0.960 0.951 0.959 0.968
0.951 0.940 0.948 0.959
0.943 0.926 0.941 0.949
0.924 0.934 0.891 0.914
0.959 0.968 0.967 0.972
0.924 0.942 0.925 0.940
0.891 0.915 0.895 0.916
0.982 0.980 0.986 0.989
0.951 0.953 0.956 0.967
0.931 0.923 0.935 0.952
0.968 0.964 0.970 0.979
0.950 0.941 0.948 0.958
0.921 0.905 0.926 0.946
0.50 quantile 0.952 0.945 0.971 0.964
0.931 0.946 0.971 0.956
0.90 quantile 0.968 0.969 0.972 0.962
0.954 0.965 0.914 0.923
Note: See note to Table 2. The estimators
are: E = DMBE,
A = DMBA,
P = DMBP.
M. Buchinsky/Journal
Table 5 Empirical Original
levels for design matrix
bootstrap,
Carlo
sample size 500, 1000 repetitions CPS sample
sample size
100
500 A
Panel A
Monte
323
68 (I 995) 303338
Independent
CPS sample
Bootstrap
E
qf’ Econometrics
P Quantile
100
E
500
A
P
E
A
P
E
A
P
regression
0.10 quantile 0.942 0.940 0.955 0.953
0.926 0.930 0.940 0.937
0.922 0.937 0.946 0.930
0.956 0.944 0.946 0.950
0.940 0.946 0.930 0.933
0.936 0.925 0.930 0.933
0.954 0.964 0.948 0.941
0.946 0.954 0.938 0.932
0.947 0.949 0.934 0.927
0.941 0.950 0.942 0.940
0.928 0.937 0.931 0.925
0.929 0.933 0.930 0.919
0.938 0.951 0.929 0.946
0.956 0.947 0.953 0.951
0.939 0.927 0.958 0.944
0.921 0.906 0.944 0.944
0.963 0.955 0.945 0.953
0.956 0.947 0.940 0.948
0.947 0.950 0.940 0.945
0.934 0.946 0.946 0.960
0.930 0.938 0.937 0.950
0.919 0.934 0.939 0.948
0.930 0.944 0.931 0.934
0.950 0.963 0.946 0.943
0.944 0.940 0.935 0.937
0.929 0.938 0.933 0.937
0.945 0.942 0.952 0.948
0.934 0.933 0.942 0.939
0.930 0.926 0.942 0.943
0.946 0.932 0.949 0.946
0.927 0.917 0.937 0.934
0.931 0.915 0.934 0.924
0.50 quantile 0.966 0.969 0.940 0.947
0.967 0.969 0.945 0.945
0.90 quantile 0.948 0.953 0.941 0.948
0.938 0.939 0.937 0.945
Panel B  Censored
quantile
regression
0.10 quantile 0.962 0.942 0.959 0.950
0.941 0.942 0.955 0.941
0.937 0.941 0.947 0.942
0.91 I 0.946 0.937 0.925
0.903 0.930 0.931 0.922
0.904 0.939 0.916 0.926
0.952 0.946 0.957 0.956
0.944 0.941 0.944 0.947
0.943 0.941 0.940 0.942
0.948 0.945 0.938 0.936
0.936 0.934 0.922 0.916
0.926 0.925 0.921 0.916
0.927 0.925 0.944 0.961
0.950 0.941 0.944 0.946
0.937 0.916 0.934 0.935
0.931 0.924 0.951 0.944
0.955 0.945 0.939 0.949
0.943 0.937 0.931 0.943
0.942 0.932 0.936 0.939
0.952 0.946 0.941 0.929
0.944 0.934 0.933 0.916
0.939 0.930 0.930 0.916
0.924 0.919 0.848 0.890
0.967 0.971 0.962 0.981
0.957 0.962 0.962 0.981
0.948 0.943 0.943 0.957
0.956 0.971 0.957 0.964
0.936 0.956 0.939 0.954
0.935 0.934 0.911 0.929
0.963 0.971 0.945 0.942
0.940 0.940 0.926 0.912
0.928 0.930 0.905 0.895
0.50 quantile 0.950 0.950 0.962 0.959
0.933 0.932 0.965 0.950
0.90 quantile 0.967 0.971 0.967 0.971
0.938 0.948 0.895 0.910
Note:
See note to Table 4.
324
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
0.95 nominal level for all coefficients at all quantiles, even for bootstrap sample sizes onetenth the original sample size. The major disadvantage of the DMB method is its computational cost, which is quite high even for relatively small bootstrap sample sizes. When the independence of x and u0 is obvious, the OS estimator might be preferable. 6.4. Error bootstrapping
(EB) estimator
The results for the EB estimator are reported in Table 6 for sample sizes of 100 and 500 observations using only the independent CPS data. Panels A and B report the results for the quantile and censored quantile regression models, respectively. For the 100 sample size two bootstrap sample sizes of 50 and 100 observations were considered. For the 500 sample size the bootstrap samples employed are of 100 and 500 observations. The results for the original CPS data are omitted since the EB is as costly to compute as the DMB, while it is not consistent. The EB estimator performs quite well at the middle quantile when the bootstrap sample size is small relative to the data sample size. When the bootstrap sample size is large relative to the data sample size, the empirical levels fall well below the 0.95 nominal level, especially at quantiles affected by censoring. The empirical levels at the 0.90 quantile where there is censoring are considerably lower than those at the 0.50 quantile. Note also that the empirical levels are not uniform across the three EB estimators. A feature common to the DMB and EB estimators is that for both the quantile and the censored quantile regression models the EBE estimator seems to perform slightly better than the EBA estimator. Both estimators perform significantly better than the EBP estimator. Apparently, more bootstrap repetitions are needed for the percentile method in order to be able to precisely compute confidence intervals for the parameters. The three EB estimators are less sensitive than the SB estimator to bootstrap sample size, but for similar reasons their performance deteriorates significantly for large bootstrap sample sizes. For large sample sizes this becomes less of a problem for the EB estimator than for the SB estimator. The EB estimator results for a sample of 1000 observations are similar to the 500 sample size and are therefore omitted. Nevertheless, it is worth noting that even for this relatively large sample size, the EB methods tend to perform better for relatively small bootstrap sample sizes. 6.5. Homoskedastic
kernel (HK) estimator
Both a uniform and a normal kernel function are considered for the homoskedastic kernel estimator. For each of these functions one and twoside kernel estimators forf,,(O) are examined. The results are reported in Table 7 for the
M. BuchinskylJournal
Table 6 Empirical
levels for error bootstrap,
Monte Carlo Bootstrap
sample
sample
independent
Panel A
1000 repetitions
Monte Carlo
size 100
100 A
CPS data,
325
sample size 500
size
50 E
of Econometrics 68 (1995) 303338
P Quantile
100
E
500
A
P
E
A
P
E
A
P
regression
0.10 quantile 0.924 0.926 0.927 0.931
0.921 0.925 0.924 0.927
0.907 0.915 0.910 0.913
0.890 0.878 0.896 0.898
0.884 0.873 0.890 0.892
0.872 0.872 0.872 0.882
0.944 0.943 0.940 0.945
0.939 0.941 0.937 0.942
0.935 0.933 0.924 0.936
0.891 0.912 0.896 0.908
0.882 0.906 0.890 0.902
0.872 0.900 0.898 0.898
0.938 0.940 0.928 0.932
0.926 0.914 0.920 0.923
0.923 0.912 0.917 0.922
0.913 0.894 0.916 0.914
0.949 0.954 0.938 0.939
0.946 0.950 0.933 0.937
0.943 0.938 0.920 0.933
0.935 0.931 0.918 0.923
0.927 0.920 0.916 0.915
0.935 0.924 0.930 0.915
0.920 0.926 0.932 0.922
0.888 0.864 0.883 0.897
0.881 0.862 0.880 0.891
0.870 0.864 0.878 0.886
0.967 0.977 0.974 0.983
0.959 0.973 0.973 0.983
0.955 0.959 0.959 0.970
0.927 0.938 0.932 0.946
0.91 I 0.904 0.931 0.918 0.928 0.924 0.938 0.930
0.50 quantile 0.943 0.945 0.937 0.943
0.942 0.944 0.935 0.939
0.90 quantile 0.935 0.930 0.933 0.932
0.931 0.926 0.931 0.929
Panel B
Censored
quantile
regression
0.10 quantile 0.921 0.902 0.912 0.928
0.914 0.894 0.909 0.924
0.893 0.872 0.901 0.917
0.865 0.878 0.874 0.882
0.859 0.874 0.870 0.881
0.850 0.860 0.857 0.872
0.944 0.943 0.940 0.945
0.939 0.941 0.937 0.942
0.935 0.933 0.924 0.936
0.891 0.912 0.896 0.908
0.882 0.906 0.890 0.902
0.872 0.900 0.898 0.898
0.927 0.913 0.918 0.926
0.892 0.893 0.880 0.888
0.888 0.886 0.874 0.883
0.881 0.880 0.874 0.877
0.949 0.954 0.938 0.939
0.946 0.950 0.933 0.937
0.943 0.938 0.920 0.933
0.935 0.931 0.918 0.923
0.927 0.920 0.916 0.915
0.935 0.924 0.930 0.915
0.828 0.831 0.850 0.923
0.731 0.745 0.784 0.836
0.728 0.739 0.777 0.833
0.650 0.667 0.708 0.779
0.967 0.977 0.974 0.983
0.959 0.973 0.973 0.983
0.955 0.959 0.959 0.970
0.927 0.938 0.932 0.946
0.911 0.931 0.928 0.938
0.904 0.918 0.924 0.930
0.50 quantile 0.938 0.940 0.935 0.939
0.934 0.936 0.930 0.934
0.90 quantile 0.907 0.926 0.928 0.964
0.900 0.919 0.925 0.963
Note: See note to Table 4. The marginal cumulative distribution function of the error ug is estimated using the KaplanMeier procedure. The estimators are: E = EBE, A = EBA, P = EBP.
M. BuchinskyJJournal
326
Table 7 Empirical levels for homoskedastic repetitions Uniform
kernel
Oneside
(c,)
0.20
kernel, original
Twoside a”
1.5
Panel A  Quantile
0.20
of Econometrics
(c.)
68 (1995) 303338
CPS data, Monte Carlo
Normal
kernel
Oneside
(c.)
sample size 500, 5000
Twoside
(c.)
1.5
aa
0.20
1.5
a’
0.20
1.5
aa
regression
0.10 quantile 0.808 0.806 0.762 0.723
0.629 0.616 0.585 0.538
0.806 0.807 0.782 0.761
0.900 0.899 0.870 0.829
0.840 0.834 0.800 0.757
0.846 0.844 0.821 0.804
0.754 0.755 0.715 0.676
0.782 6.785 0.742 0.705
0.807 0.811 0.783 0.767
0.899 0.902 0.862 0.843
0.974 0.977 0.960 0.945
0.857 0.861 0.834 0.831
0.851 0.843 0.802 0.800
0.940 0.931 0.897 0.873
0.998 0.996 0.989 0.983
0.891 0.884 0.844 0.845
0.949 0.945 0.919 0.890
1.00 1.00 1.00 1.00
0.853 0.849 0.809 0.812
0.966 0.960 0.937 0.916
1.00 1.00 1.00 1.00
0.883 0.891 0.851 0.850
0.801 0.798 0.756 0.749
0.859 0.896 0.854 0.859
0.914 0.944 0.911 0.917
0.839 0.841 0.795 0.791
0.988 0.996 0.992 0.990
1.00 1.00 1.00 1.00
0.804 0.802 0.772 0.751
0.874 0.912 0.874 0.878
0.998 1.00 0.999 0.999
0.850 0.849 0.816 0.814
0.50 quantile 0.937 0.927 0.892 0.866
1.00 1.00 I.00 I.00
0.90 quantile 0.960 0.976 0.961 0.959
1.00 I.00 1.00 1.00
Panel B
Censored
quantile
regression
0. IO quantile 0.783 0.785 0.783 0.755
0.598 0.599 0.601 0.569
0.808 0.808 0.781 0.763
0.887 0.889 0.879 0.857
0.794 0.801 0.788 0.762
0.845 0.846 0.820 0.802
0.749 0.752 0.713 0.673
0.774 0.779 0.736 0.697
0.809 0.812 0.782 0.769
0.898 0.899 0.860 0.839
0.972 0.974 0.958 0.941
0.860 0.863 0.836 0.834
0.854 0.845 0.803 0.799
0.911 0.905 0.899 0.874
0.995 0.993 0.996 0.991
0.892 0.886 0.843 0.848
0.894 0.885 0.883 0.852
1.00 1.00 1.00 1.00
0.856 0.851 0.810 0.811
0.929 0.926 0.918 0.896
1.00 I.00 1.00 1.00
0.885 0.894 0.850 0.852
0.797 0.793 0.751 0.746
0.960 0.979 0.885 0.933
0.980 0.992 0.937 0.978
0.835 0.834 0.794 0.787
0.872 0.946 0.846 0.910
1.00 1.00 1.00 1.00
0.801 0.803 0.769 0.750
0.598 0.737 0.605 0.736
0.984 0.997 0.969 0.986
0.847 0.846 0.815 0.811
0.50 quantile 0.890 0.884 0.883 0.854
I.00 1.00 1.00 1.00
0.90 quantile 0.979 0.990 0.946 0.978
1.00 1.00 1.00 1.00
Note: See note to Table 2. The formula ’ Bandwidth
is selected automatically,
for the asymptotic
covariance
using the leastsquares
matrix is given in Section 3.
crossvalidation
method.
M.
BuchinskylJournal
Table 8 Empirical levels for homoskedastic 5000 repetitions Uniform
kernel
Oneside
(c,)
0.20
1.5
kernel, independent
Twoside a”
Panel A  Quantile
0.20
of Econometrics
(c.)
68 (1995)
CPS data,
Normal
kernel
Oneside
(c.)
303338
Monte
327
Carlo
sample
Twoside
size 500,
(c.)
1.5
aa
0.20
1.5
aa
0.20
1.5
aa
regression
0.10 quantile 0.807 0.814 0.815 0.808
0.633 0.640 0.638 0.645
0.856 0.856 0.826 0.803
0.904 0.902 0.902 0.900
0.842 0.842 0.847 0.842
0.894 0.886 0.867 0.848
0.775 0.774 0.780 0.782
0.802 0.805 0.810 0.809
0.858 0.860 0.830 0.811
0.908 0.909 0.913 0.912
0.978 0.978 0.981 0.980
0.909 0.905 0.880 0.876
0.900 0.895 0.852 0.844
0.948 0.949 0.939 0.935
0.999 0.998 0.997 0.998
0.947 0.939 0.893 0.894
0.954 0.957 0.953 0.952
1.00 1.00 1.00 1.00
0.902 0.898 0.859 0.858
0.969 0.970 0.968 0.968
1.00 1.00 1.00 1.00
0.939 0.946 0.901 0.899
0.849 0.852 0.807 0.799
0.925 0.932 0.927 0.928
0.967 0.969 0.961 0.959
0.885 0.888 0.843 0.835
0.998 0.999 0.999 0.999
1.00 1.00 1.00 1.00
0.852 0.853 0.823 0.801
0.941 0.941 0.948 0.947
1.00 1.00 1.00 1.00
0.896 0.901 0.866 0.860
0.50 quantile 0.946 0.939 0.937 0.933
1.00 1.00 1.00 1.00
0.90 quantile 0.991 0.993 0.991 0.991
1.00 1.00 1.00 1.00
Panel B
Censored
quantile
regression
0. IO quantile 0.818 0.828 0.822 0.822
0.616 0.616 0.612 0.621
0.856 0.856 0.826 0.803
0.912 0.917 0.908 0.915
0.816 0.831 0.818 0.823
0.890 0.886 0.864 0.843
0.755 0.748 0.769 0.768
0.783 0.779 0.797 0.795
0.859 0.860 0.830 0.810
0.902 0.897 0.910 0.907
0.974 0.973 0.978 0.974
0.909 0.908 0.883 0.877
0.900 0.895 0.852 0.844
0.917 0.914 0.914 0.911
0.996 0.998 0.997 0.998
0.946 0.938 0.893 0.895
0.918 0.923 0.917 0.918
1.00 1.00 1.00 1.00
0.902 0.898 0.859 0.858
0.952 0.956 0.952 0.951
I.00 1.00 1.00 1.00
0.939 0.950 0.898 0.899
0.848 0.846 0.805 0.797
0.851 0.854 0.868 0.879
0.963 0.967 0.960 0.968
0.884 0.884 0.843 0.834
0.990 0.991 0.995 0.997
1.00 1.00 1.00 1.00
0.852 0.853 0.823 0.801
0.924 0.922 0.939 0.940
1.00 1.00 1.00 1.00
0.896 0.899 0.866 0.855
0.50 quantile 0.896 0.891 0.894 0.889
1.00 1.00 1.00 1.00
0.90 quantile 0.960 0.964 0.968 0.974
1.00 1.00 1.00 1.00
Note: See note to Table 7. a Bandwidth
is selected automatically,
using the leastsquares
crossvalidation
method
328
M. Buchinsky/Journal
of Econometrics 68 (1995) 303338
original CPS data, and in Table 8 for the independent CPS data. The results for the oneside uniform kernel are reported in the first three columns of each table for bandwidth values of 0.20, 1.5, and also an automatically chosen bandwidth.” The next three columns report similar results for the twoside uniform kernel. The results for the normal kernel are organized similarly in the last six columns. Below a bandwidth of 0.20 the empirical levels are very low, and above 1.5 bandwidth they are always around 1.0. Table 7 shows considerable fluctuation in the empirical levels. These very between a very low level of 0.60 to the highest attainable level of 1.0, for two quantiles using the same bandwidth. Less variation can be detected for the automatic bandwidth, but its empirical levels are well below the 0.95 mark. The twoside kernels yield, in general, empirical levels closer to the 0.95 level than the oneside kernel, with the normal kernel performing better than the uniform kernel. In summary, this estimator performs quite poorly for the original CPS data. Table 8 presents the results for the HK estimator using the independent CPS data with sample size of 500 observations, for the same kernels and bandwidths as in Table 7. For samples smaller than 500 the HK estimator is very sensitive and inaccurate. The resulting empirical levels for the independent data are much better than for the original data, especially for the twoside kernels. The empirical levels are much closer to 0.95, for both the quantile and censored quantile regression models. Some deficiencies do remain though, the most important being that the empirical levels are not uniform across quantiles for any given value of the bandwidth. Databased choices of c, yield better empirical levels, but they are still consistently lower than the 0.95 level. Furthermore, they are somewhat variable both across quantiles and across coefficients. Note that the performance of the HK estimator for the quantile and censored quantile models are comparable, even at the 0.90 quantile which is affected enormously by censoring. As for the quantile regression model, the twoside kernels perform better than the oneside kernels, with the twoside normal kernel performing the best. At a relatively large bandwidth the estimated standard errors are consistently too large. 6.6. General kernel (GK) estimator
The results for the GK estimator using one and twoside uniform and normal kernels for the original CPS data are reported in Tables 9 and 10. The results for the independent CPS data are reported in Table 11. The order of the columns is the same as in Table 7.
‘s The automatic man, 1986).
bandwidth
is chosen using the leastsquares
crossvalidation
method
(e.g., Silver
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
329
The GK estimator generally performs better than the HK estimator even for the independent data. This is a major advantage for the GK since it is computationally inexpensive, while at the same time guarding against heteroskedasticity. One may note, however, that the DMB estimator dominates the GK estimator. The results in Table 9 for the quantile regression model are not completely satisfactory. For the oneside kernel estimator, all bandwidths, including the automatic bandwidth, yield empirical levels that are too high (relative to the 0.95 nominal level) at the higher quantiles and too low at the lower quantiles. Much better results are obtained for the twoside kernel estimator. The results for the larger sample are better (see panel B) for all estimators, especially for the twoside kernel with the automatic bandwidth. The results also indicate that distinct bandwidths are required at different quantiles. The results in Table 10 for the censored quantile regression model are consistent with, but less precise than, those in Table 9. The results are similar for the 0.10 and 0.50 quantiles. At the 0.90 quantile, however, the empirical levels drop when using fixed bandwidths, but not for the automatic bandwidth. Also, some variation in the empirical level is noticeable across the parameters at any given quantile. Table 11 reports the results for the censored quantile regression models using the independent data. (The results for the quantile regression are very similar and are therefore omitted.) Clearly, the empirical levels are very sensitive to the bandwidth choices. Small bandwidths yield too low empirical levels, while large bandwidths yield too high values. The automatic bandwidth yields quite reasonable empirical levels especially for the sample of 500 observations. The overall performance of the general kernel estimator, and especially the twoside normal kernel, is reasonably good. These estimates are not as precise as the DMB estimates, but they are much less expensive to compute (taking only a few minutes).
7. On the relative magnitude of the standard errors The results presented do not explicitly indicate the magnitude of differences in the standard errors obtained for the various estimators, but they do provide enough information to extract this information. Suppose the average standard errors for a certain coefficient obtained by two competing methods are se’ and se*, and let [/$  Z, _or,2sej,fiO+ Z, _,,zsej] be the confidence interval associated with the observed empirical probability level pj (j = 1,2). If the true standard error se0 were known, a pj confidence interval would be [fl, Z (1+p,),2seo,~0 + Zc,+,j,i2se0]. Since, however, the length of the two confidence intervals must be the same, it follows that sei/seo = Z~l+pJ,,2/Z1_a,Z.
M.
330
Table 9 Empirical
levels for general
Uniform
kernel
Oneside
(c,)
0.20
Buchinsky/Journal
1.5
kernel, original
Twoside aa
of’ Econometrics
0.20
CPS data, quantile
(c,)
I.5
68 (1995)
u”
303338
regression,
Normal
kernel
Oneside
(c.)
5000 repetitions
Twoside
(c.)
0.20
1.5
,a
0.20
1.5
aa
Panel A  Monte Carlo sample size 100 0.10 quantile 0.805 0.792 0.772 0.737
0.645 0.640 0.643 0.633
0.827 0.821 0.792 0.776
0.834 0.825 0.796 0.760
0.884 0.884 0.854 0.837
0.838 0.838 0.810 0.791
0.787 0.771 0.751 0.719
0.808 0.811 0.785 0.759
0.859 0.845 0.826 0.812
0.884 0.883 0.832 0.793
0.979 0.977 0.974 0.965
0.898 0.890 0.877 0.898
0.951 0.939 0.942 0.920
0.917 0.909 0.913 0.885
0.997 0.995 0.996 0.993
0.927 0.938 0.933 0.934
0.957 0.948 0.952 0.934
1.00 1.00 1.00 1.00
0.934 0.949 0.931 0.954
0.967 0.967 0.948 0.934
1.00 1.00 1.00 1.00
0.971 0.977 0.960 0.933
0.960 0.942 0.934 0.933
0.909 0.883 0.899 0.879
0.951 0.962 0.957 0.957
0.948 0.929 0.925 0.929
0.996 0.988 0.992 0.988
1.00 1.00 1.00 1.00
0.924 0.936 0.929 0.926
0.930 0.922 0.918 0.912
1.00 1.00 1.00 1.00
0.960 0.938 0.944 0.928
0.50 quantile 0.932 0.922 0.921 0.898
1.00 1.00 1.00 1.00
0.90 quantile 0.983 0.970 0.974 0.966
1.00 1.00 1.00 1.00
Panel B
Monte Carlo
sample size 500
0.10 quantile 0.851 0.839 0.817 0.796
0.643 0.625 0.629 0.610
0.903 0.904 0.885 0.843
0.920 0.918 0.882 0.844
0.855 0.860 0.838 0.844
0.945 0.936 0.905 0.874
0.782 0.783 0.788 0.773
0.784 0.788 0.774 0.755
0.906 0.899 0.880 0.849
0.904 0.917 0.879 0.878
0.976 0.976 0.967 0.960
0.963 0.969 0.924 0.905
0.961 0.953 0.968 0.955
0.938 0.930 0.936 0.924
0.998 0.997 0.994 0.991
0.965 0.965 0.974 0.962
0.952 0.941 0.964 0.956
1.00 1.00 1.00 1.00
0.969 0.969 0.977 0.951
0.967 0.965 0.968 0.964
1.00 1.00 1.00 1.00
0.963 0.962 0.969 0.950
0.923 0.937 0.928 0.931
0.947 0.923 0.943 0.930
0.943 0.947 0.956 0.953
0.941 0.963 0.950 0.964
1.00 0.996 0.998 0.993
1.00 1.00 1.00 1.00
0.942 0.930 0.932 0.921
0.944
0.999
0.956
0.923
1.00 0.999 1.00
0.952 0.966 0.959
0.50 quantile 0.940 0.924 0.941 0.930
1.00 1.00 1.00 1.00
0.90 quantile 0.994 0.981 0.989 0.981 Note:
1.00 1.00 1.00 1.00
See note to Table
a Automatic
bandwidth
0.952 0.942
7. chosen
using the leastsquares
crossvalidation
method
M. BuchinskyiJournal
of’ Econometrics
Table IO Empirical levels for general kernel, original Uniform
kernel
Oneside
(c.)
0.20
1.5
Panel A
Twoside an
Monte Carlo
0.20
CPS data, censored
(c,)
1.5
us
331
68 (1995) 303338
quantile
Normal
kernel
Oneside
(c,)
regression,
5000 repetitions
Twoside
(c.)
0.20
1.5
a”
0.20
1.5
(la
sample size 100
0.10 quantile 0.787 0.770 0.771 0.747
0.825 0.823 0.793 0.780
0.829 0.817 0.800 0.763
0.858 0.862 0.849 0.838
0.841 0.839 0.808 0.794
0.757 0.758 0.731 0.714
0.797 0.810 0.768 0.750
0.857 0.842 0.823 0.813
0.866 0.866 0.825 0.778
0.979 0.979 0.973 0.961
0.899 0.886 0.876 0.897
0.954 0.940 0.943 0.923
0.866 0.853 0.891 0.866
0.997 0.995 0.997 0.991
0.928 0.937 0.934 0.936
0.901 0.882 0.925 0.908
1.00 1.00 1.00 1.00
0.937 0.951 0.933 0.955
0.932 0.919 0.940 0.925
1.00 1.00 1.00 1.00
0.975 0.979 0.961 0.935
0.950 0.931 0.925 0.921
0.752 0.742 0.791 0.810
0.920 0.931 0.935 0.949
0.938 0.924 0.919 0.923
0.931 0.911 0.966 0.960
1.00 1.00 1.00 0.999
0.914 0.928 0.922 0.917
0.760 0.779 0.841 0.849
0.998 0.997 1.00 0.999
0.952 0.928 0.935 0.922
Monte Carlo
sample
0.631 0.619 0.645 0.648
0.50 quantile 0.857 0.838 0.897 0.880
1.00 1.00 1.00 1.00
0.90 quantile 0.855 0.846 0.900 0.912
1.00 1.00 1.00 1.00
Panel B
size 500
0. IO quantile 0.847 0.849 0.830 0.813
0.604 0.605 0.635 0.624
0.901 0.906 0.886 0.847
0.921 0.920 0.882 0.854
0.798 0.821 0.838 0.850
0.948 0.937 0.903 0.877
0.772 0.777 0.764 0.768
0.783 0.783 0.755 0.744
0.904 0.896 0.877 0.850
0.899 0.907 0.868 0.867
0.969 0.972 0.963 0.947
0.964 0.965 0.923 0.904
0.964 0.954 0.969 0.958
0.881 0.874 0.929 0.915
0.994 .0.966 0.992 0.964 0.997 0.975 0.994 0.964
0.900 0.871 0.948 0.933
1.00 1.00 1.00 1.00
0.972 0.971 0.979 0.952
0.928 0.921 0.958 0.950
1.00 1.00 1.00 1.00
0.967 0.964 0.970 0.952
0.913 0.926 0.919 0.919
0.793 0.809 0.758 0.781
0.887 0.926 0.789 0.858
0.897 0.914 0.939 0.953
1.00 1.00 1.00 1.00
0.932 0.922 0.925 0.912
0.616 0.7 13 0.697 0.796
0.982 0.996 0.972 0.985
0.948 0.942 0.957 0.953
0.50 quantile 0.867 0.845 0.927 0.913
1.00 1.00 1.00 1.00
0.90 quantile 0.902 0.896 0.881 0.895 Note:
1.00 1.00 1.00 1.00
See note to Table
a Automatic
bandwidth
0.931 0.958 0.944 0.958
7. chosen
using the leastsquares
crossvalidation
method.
M. BuchinskylJournal
332
Table 11 Empirical repetitions
levels for general
Uniform
kernel
Oneside
(c,)
0.20
1.5
kernel,
Twoside aa
Panel A  Monte Carlo
0.20
of Econometrics
independent
(c,)
I.5
aa
CPS
68 (1995) 303338
data,
censored
Normal
kernel
Oneside
(c.)
quantile
regression,
Twoside
5000
(c.)
0.20
1.5
aa
0.20
1.5
aa
sample size 100
0.10 quantile 0.812 0.808 0.839 0.828
0.640 0.648 0.655 0.672
0.829 0.828 0.794 0.782
0.827 0.816 0.850 0.829
0.860 0.861 0.890 0.888
0.844 0.840 0.813 0.796
0.748 0.733 0.781 0.773
0.787 0.783 0.802 0.799
0.862 0.847 0.824 0.813
0.858 0.844 0.893 0.879
0.976 0.979 0.978 0.985
0.901 0.888 0.877 0.897
0.958 0.942 0.947 0.926
0.870 0.859 0.903 0.900
0.998 0.998 0.997 0.998
0.931 0.942 0.934 0.939
0.908 0.897 0.939 0.930
1.00 1.00 1.00 1.00
0.938 0.954 0.934 0.958
0.940 0.939 0.956 0.954
1.00 1.00 1.00 1.00
0.977 0.981 0.962 0.936
0.953 0.935 0.925 0.922
0.790 0.770 0.868 0.854
0.952 0.944 0.982 0.982
0.942 0.926 0.919 0.925
0.946 0.932 0.985 0.981
1.00 1.00 I.00 1.00
0.915 0.932 0.924 0.919
0.855 0.838 0.926 0.923
0.999 0.999 1.00 1.00
0.953 0.932 0.938 0.925
0.50 quantile 0.886 0.876 0.918 0.914
1.00 1.00 1.00 1.00
0.90 quantile 0.927 0.907 0.973 0.973
1.00 0.999 1.00 1.00
Panel B  Monte Carlo
sample size 500
0.10 quantile 0.866 0.861 0.862 0.853
0.625 0.617 0.619 0.628
0.902 0.909 0.889 0.850
0.918 0.918 0.923 0.908
0.802 0.803 0.847 0.856
0.951 0.941 0.903 0.879
0.789 0.769 0.802 0.796
0.789 0.785 0.795 0.793
0.906 0.900 0.880 0.854
0.904 0.904 0.911 0.910
0.976 0.975 0.972 0.972
0.968 0.966 0.927 0.905
0.966 0.955 0.970 0.961
0.895 0.877 0.922 0.920
0.997 0.998 0.997 0.999
0.971 0.966 0.977 0.966
0.901 0.896 0.936 0.934
1.00 1.00 1.00 1.00
0.976 0.975 0.982 0.954
0.939 0.938 0.954 0.960
1.00 1.00 1.00 1.00
0.969 0.964 0.972 0.953
0.914 0.930 0.920 0.919
0.883 0.873 0.890 0.888
0.968 0.970 0.966 0.976
0.935 0.959 0.945 0.960
0.989 0.983 0.993 0.995
1.00 1.00 1.00 1.00
0.937 0.925 0.927 0.915
0.934 0.927 0.935 0.937
1.00 1.00 1.00 1.00
0.952 0.944 0.957 0.954
0.50 quantile 0.877 0.866 0.918 0.920
1.00 1.00 1.00 1.00
0.90 q&tile 0.973 0.964 0.975 0.976
1.00 1.00 1.00 1.00
Note: See note to Table 7. a Automatic
bandwidth
chosen
using the leastsquares
crossvalidation
method.
M.
Table 12 Representative
BuchinskylJournal
standard
errors
Empirical
level
of Econometrics
for observed
empirical
68 (1995)
333
303338
levels and Monte
Carlo
sample
repetitions
Repetitions
0.600
0.650
0.700
0.750
0.800
0.850
0.875
0.900
0.925
0.950
0.975
1000 2000 3000 4000 5000
0.015 0.011 0.009 0.008 0.007
0.015 0.011 0.009 0.008 0.007
0.014 0.010 0.008 0.007 0.006
0.014 0.010 0.008 0.007 0.006
0.013 0.009 0.007 0.006 0.006
0.011 0.008 0.007 0.006 0.005
0.010 0.007 0.006 0.005 0.005
0.009 0.007 0.005 0.005 0.004
0.008 0.006 0.005 0.004 0.004
0.007 0.005 0.004 0.003 0.003
0.005 0.003 0.003 0.002 0.002
Nore: The standard errors are computed number of Monte Carlo repetitions.
Table 13 Implied standard
error ratios Z ,I
Z ,I +P’)iz
P2
0.842 1.037 1.282 1.645
0.80 0.85 0.90 0.95
Note:
for observed
+p’)iz
=
p’ =
according
empirical
to: G(p)
= ,,/E,
where Nut is the
levels
1.037 0.85
1.282 0.90
1.645 0.95
1.960 0.975
23.2% 0.0%
44.6% 17.4% 0.0%
95.4% 58.6% 28.3% 0.0%
132.8% 89.0% 52.9% 19.1%
The numbers in the table are the ratio between the columns and rows.
se1 to se’ for the empirical
levels reported
in
Hence the ratio of the average standard errors for any two methods is se’ 2= se
Z(r +pijj2 % +pz)i2
(231
As Table 13 illustrates the low empirical levels for the SB and EB methods imply that their standard errors are four to five times smaller than those for the DMB and GK methods. This disparity suggests that one should be cautious in using any of the above estimators in empirical applications. In particular, comparing alternative standard error estimates seem to be desirable. In order to evaluate whether the differences reflect actual disparities among the ‘population’ matrices or are the consequence of distinct small sample properties, the population covariance matrices at the 0.10, 0.50, and 0.90 quantiles have been computed. For the independent CPS data estimates using all estimators are computed, while for the original CPS data only the estimates for the DMB and GK estimators are computed. The results for the two data sets
334
Table 14 Population
M. BuchinskyfJournal
standard
Panel A  Uniform
errors
of Econometrics 68 (1995) 303338
for the independent
kernel
CPS data Panel B
Homoskedastic
General
1
2
1
2.302 0.154 0.113 0.003
normal
kernel
Homoskedastic
General
2
1
2
1
2
1.958 0.130 0.107 0.003
2.447 0.165 0.119 0.003
1.818 0.122 0.089 0.002
2.229 0.149 0.110 0.002
1.827 0.123 0.101 0.003
2.331 0.157 0.119 0.003
1.139 0.076 0.056 0.001
1.072 0.072 0.063 0.002
1.239 0.084 0.064 0.002
1.113 0.075 0.055 0.001
1.160 0.078 0.057 0.001
1.098 0.072 0.064 0.002
1.159 0.078 0.065 0.002
1.248 0.084 0.06 1 0.001
2.366 0.129 0.101 0.002
1.760 0.095 0.083 0.002
1.716 0.115 0.084 0.002
1.234 0.083 0.061 0.001
2.813 0.152 0.115 0.002
1.603 0.090 0.08 1 0.002
0.10 quantile 1.982 0.133 0.097 0.002 0.50 quantile 1.105 0.074 0.054 0.001 0.90 quantile 1.508 0.101 0.074 0.002
Panel C  Other estimators
SB (4 0.10
0s (do
DMB
EB
2.0
0.10
0.01
E
A
E
A
2.383 0.160 0.117 0.003
2.392 0.160 0.118 0.003
2.366 0.159 0.116 0.003
2.883 0.185 0.146 0.004
2.397 0.169 0.119 0.003
2.337 0.165 0.117 0.003
2.308 0.165 0.116 0.003
1.143 0.077 0.056 0.001
1.054 0.071 0.052 0.001
1.060 0.07 1 0.052 0.001
1.205 0.095 0.067 0.002
1.105 0.080 0.067 0.002
1.034 0.075 0.055 0.001
1.033 0.075 0.055 0.001
1.065 0.071 0.052 0.001
1.052 0.07 1 0.052 0.001
1.151 0.077 0.057 0.001
1.568 0.099 0.065 0.002
1.460 0.087 0.065 0.001
1.182 0.086 0.063 0.001
1.180 0.086 0.063 0.001
0.10 quantile 2.23 I 0.150 0.110 0.002 0.50 quantile 1.064 0.07 1 0.052 0.001 0.90 quantile 1.314 0.088 0.065 0.001
Note: The population is described in Section 3. The numbers 1 and 2 for the kernel estimators refer to one and twoside kernels with databased choice for the bandwidth. m denotes the ratio between the bootstrap and sample sizes. The DMB, EB, and SB use bootstrap sample sizes of 10,000 observations and 100 repetitions. For OS estimator, CI= 0.10, 0.01 correspond to Z, _o,z = 1.645, 2.576.
M. BuchinskylJournal
Table 15 Population General
standard uniform
Oneside
errors
335
of Econometrics 68 (1995) 303338
for the original
CPS data
kernel
General
normal
kernel
DMB
Twoside
Oneside
Twoside
E
A
2.360 0.148 0.131 0.003
2.249 0.157 0.125 0.003
2.43 1 0.162 0.124 0.003
2.535 0.171 0.136 0.003
2.151 0.146 0.131 0.003
1.112 0.078 0.062 0.002
1.215 0.081 0.067 0.002
1.201 0.082 0.062 0.002
1.202 0.090 0.065 0.002
1.202 0.090 0.065 0.002
1.761 0.092 0.088 0.002
2.012 0.107 0.092 0.002
1.852 0.100 0.085 0.002
2.352 0.132 0.093 0.002
1.638 0.094 0.078 0.002
0.10 quantile 2.503 0.166 0.150 0.004 0.50 quantile 1.203 0.079 0.069 0.002 0.90 quantile 2.029 0.103 0.098 0.002
Note: See note to Table
14.
are reported in Tables 14 and 15, respectively.” The bootstrap DMB, EB, and SB estimates utilize 100 bootstrap repetitions. For the first two estimates the bootstrap sample size is 10,000, while for the SB several bootstrap sample sizes are considered. Table 14 shows that the standard errors for the various methods are quite similar. In particular, the twoside normal kernel estimates (see panel B) are close to the DMB estimates (see panel C). The OS, and EB estimators yield estimates which are similar to the DMB estimate at the middle quantile, but less at the extreme quantiles. Moreover, the SB estimator is not sensitive to the bootstrap sample size, and the OS is little affected by the choice of CL Table 15 for the original CPS shows that the standard errors obtained by all three methods are similar, although the DMB estimates tend to be larger at the extreme quantiles. As expected, the estimates are in general larger than for the independent CPS data.
r9A similar exercise for the censored quantile regression reported here for the quantile regression model.
model
yielded
results
similar
to those
336
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
Comparison of the standard errors in Tables 14 and 15 indicates that differences in the performance of the various estimators can be mostly attributed to distinct small sample properties.
8.
M. BuchinskylJournal
of Econometrics 68 (1995) 303338
337
activity working or a job; they are selfemployed; and their usual earnings (usual earnings divided usual weekly must be than $1 less than Four variables used in simulations: usual earnings, education as: last attended minus minus another if last has not completed), and experience (defined min(age age  6)). programs are in MATLAB. random number is that by MATLAB MATLAB User’s pp. 3158). computations all tables run consecutively. seed number the first in the table was to zero.
References Barrodale, 1. and F. Roberts, 1973, An improved algorithm for discrete I, linear approximation, SIAM Journal of Numerical Analysis IO, 8399848. Bickel, P., 1973, On some analogues to linear combinations of order statistics in the linear model, Annals of Statistics 4, 597616. Bickel, P. and D. Freedman, 1981. Some asymptotic theory of the bootstrap, Annals of Statistics 9, 119661217. Bloch. D.A. and J.L. Gastwirth, 1968, On a simple estimate of the reciprocal of the density function. Annals of Mathematical Statistics 39, 108331085. Bofinger, E., 1975, Estimation of density function using order statistics, Australian Journal of Statistics 17. 17. Buchinsky, M., 1991, Methodological issues in quantile regression, Ch. 1 in: The theory and practice of quantile regression, Ph.D. dissertation (Harvard University, Cambridge, MA). Buchinsky, M., 1992, Bootstrapping quantile regression model, Unpublished manuscript (Yale University, New Haven, CT). Buchinsky, M., 1994, Changes in the U.S. wage structure 196331987: Application of quantile regression, Econometrica 62, 405458. Carroll, R. and D. Ruppert, 1984, Power transformations when fitting theoretical models to data. Journal of the American Statistical Association 79, 321328. Chamberlain, G., 1994, Quantile regression, censoring and the structure of wage, in: C. Sims and J.J. Laffont, eds., Proceedings of the sixth world congress of the Econometric Society. Barcelona, Spain (Cambridge University Press, New York, NY). Dielman, T. and R. Pfaffenberger, 1986, Bootstrapping in least absolute value regression: An application to hypothesis testing, American Statistical Association, Proceedings to the Business and Statistics Section, 6288630. Dielman, T. and R. Pfaffenberger, 1988, Bootstrapping in least absolute value regression: An application to hypothesis testing, Communications in Statistics B, Simulation and Computation 17,843856. Efron, B., 1979, Bootstrap methods: Another look at the jackknife, Annals of Statistics 7, l26. Efron, B., 1982, The jackknife, the bootstrap and other resampling plans (Society for Industrial and Applied Mathematics, Philadelphia, PA). Freedman, D., 1981, Bootstrapping regression models, Annals of Statistics 9, 121881228. Hahn, J., 1992, Bootstrapping quantile regression models, Unpublished manuscript (Harvard University, Cambridge. MA).
338
M. Buchinsky/Journal
of Econometrics 68 (1995) 303338
Hall, P. and S.J. Sheather, 1988, On the distribution of a studentized quantile, Journal of the Royal Statistical Society B 50, 381391. Huber, P., 1967, The behavior of maximum likelihood estimates under nonstandard conditions, Proceedings of the Fifth Berkeley Symposium 4,221223. Huber, P., 1981, Robust statistics (Wiley, New York, NY). Kaplan, E. and P. Meier, 1958, Nonparametric estimation from incomplete observations, Journal of the American Statistical Association 53, 457481. Koenker, R. and G. Bassett, 1978, Regression quantiles, Econometrica 46, 3350. Koenker, R. and G. Bassett, 1982a, Robust tests for heteroskedasticity based on regression quantiles, Econometrica 50, 4361. Koenker, R. and G. Bassett, 1982b, An empirical quantile function for linear models with iid errors, Journal of the American Statistical Association 77, 4077415. Koenker, R. and V. D’orey, 1987, Computing regression quantiles, Journal of the Royal Statistical Society, Applied Statistics 36, 383393. Koenker, R. and B.J. Park, 1993, An interior point algorithm for nonlinear quantile regression, Unpublished manuscript (University of Illinois, UrbanaChampaign, IL). Lo, S.H., 1989, On some representation of the bootstrap, Probability Theory and Related Fields 82, 411418. Newey, W. and J. Powell, 1990, Efficient estimation of linear and type I censored regression models under conditional quantile restrictions, Econometric Theory 6, 295317. Osborne, M.R. and G.A. Watson, 1971, On an algorithm for discrete nonlinear I, approximation, Computer Journal 14, 1844188. Portnoy, S. and R. Koenker, 1989, Adaptive Lestimation for linear models, Annals of Statistics 17, 362381. Powell, J., 1984, Least absolute deviation estimation for the censored regression model, Journal of Econometrics 25, 3033325. Powell, J., 1986, Censored regression quantiles, Journal of Econometrics 32, 1433155. Sheather, S.J. and J.S. Maritz, 1983, An estimate of the asymptotic standard error of the sample median, Australian Journal of Statistics 25, 109122. Siddiqui, M.M., 1960, Distribution of quantiles in samples from a bivariate population, Journal of Research of the National Bureau of Standards 64 B, 1455150. Silverman B., 1986, Density estimation for statistics and data analysis (Chapman and Hall, New York, NY). Singh, K., 1981, On the asymptotic accuracy of Efron’s bootstrap, Annals of Statistics 9, 118771195. Stangenhaus, G., 1987, Bootstrap and inference procedure for I, regression, in: Y. Dodge, ed., Statistical data analysis based on /,norm and related methods (NorthHolland, New York, NY). Yang, S.S., 1985, On bootstrapping a class of differentiable statistical functionals with applications to I and Mestimates, Statistica Neerlandica 39, 375385. Womersley, R.S., 1986, Censored discrete linear I, approximation, SIAM Journal of Scientific and Statistical Computing 7, 1055122.