- Email: [email protected]

Contents lists available at SciVerse ScienceDirect

Journal of Statistical Planning and Inference journal homepage: www.elsevier.com/locate/jspi

Testing for a change in covariance operator Daniela Jarušková n Czech Technical University, Prague, Czech Republic

a r t i c l e in f o

abstract

Article history: Received 26 November 2012 Received in revised form 15 March 2013 Accepted 30 April 2013 Available online 9 May 2013

The paper considers a problem of equality of two covariance operators. Using functional principal component analysis, a method for testing equality of K largest eigenvalues and the corresponding eigenfunctions, together with its generalization to a corresponding change point problem is suggested. Asymptotic distributions of the test statistics are presented. & 2013 Elsevier B.V. All rights reserved.

Keywords: Functional data Covariance operator Principal components Two-sample problem Change point problem

1. Introduction The statistical inference where the objects of interest are random functions has recently become popular, see e.g. Bosq (2000), Ferraty (2011) and Horváth and Kokoszka (2012), because objects of interest are often continuous functions. Even if the measurements are taken discretely in many points it is assumed that the observed values are some noisy measurements of a quantity that is a continuous function in reality. The problem that motivated our study came from civil engineering where the behavior of a tunnel primary lining thickness along the tunnel was studied. In any of N profiles in distances fjΔ; j ¼ 1; …; Ng from the tunnel entrance the thickness was measured in pðp bNÞ equidistant points from left to right. There are two ways we can proceed. As the distance Δ was relatively large we can suppose that our observations fX j ¼ ðX j ð1Þ; …; X j ðpÞÞT ; j ¼ 1; …; Ng form a sequence of independent random vectors. The other possibility is to approximate the multidimensional vectors by continuous smooth functions. Then, the observations fX j ðtÞ; 0≤t≤1g; j ¼ 1; …; N form a sequence of independent random functions. The basic question may be whether the basic stochastic characteristics, i.e., the mean functions mj ðtÞ ¼ EX j ðtÞ as well as covariance operators Aj defined by covariance functions Aj ðs; tÞ ¼ EX j ðtÞX j ðsÞ remain the same for all j ¼ 1; …; N or whether there exists a point j0 (a change point) such that the characteristics before and after the change point differ. The problem belongs to the change point analysis. The methods for detecting a change in mean function were studied by Aue, Gabrys et al. (2009). Here, we consider a problem of detecting a change in a covariance function supposing that m1 ðtÞ ¼ ⋯ ¼ mN ðtÞ for t∈½0; 1. Besides application in civil engineering, the problem of detecting change(s) in the stochastic behavior of functional data or random vectors with many components may be encountered in many other fields, e.g. in climatology and hydrology when stationarity of annual cycles is studied. Here, changes may be caused by all types of human activity. Heat islands of big cities not only increase mean winter temperature but also decrease its variability. Deforestation of certain areas may be a

n

Tel.: +420608532915. E-mail address: [email protected]

0378-3758/$ - see front matter & 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jspi.2013.04.011

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1501

cause of more frequent summer floods. Many authors, e.g. Aue, Hörmann et al. (2009) or Wied et al. (2011, 2012) present applications of change point detection of covariance structures in an analysis of time series of stock indices. Horváth et al. (2010) present an application to detect breaks in a time series of credit card transactions and Galeano and Peña (2009) in price indices. We start with a two-sample decision problem on the equality of two covariance operators A and B. The problem was introduced in Benko et al. (2009) and studied by Panaretos et al. (2010). Panaretos et al. (2010) mentioned that the extension of finite dimensional procedures can lead to complications, as the infinite-dimension version of the problem constitutes an illposed inverse problem. This is also true when the operators A and B are finite-dimensional but the numbers of observations are smaller than their dimension. Panaretos et al. (2010) suggested to choose a set of functions ϕ1 ; …; ϕp and check whether 〈ϕi ; ðA−BÞϕi′ 〉 ¼ 0 for 1≤i≤p, 1≤i′≤p. It is clear that using a finite set of test functions, one cannot generally find all departures from A ¼ B if A and B are infinite dimensional. Even if their dimension is finite but very large, one would have to use a correspondingly large set of fϕi g. On the other hand, the test functions fϕi g may be chosen to detect departures from A ¼ B that are of special interest to us. Clearly, in test procedures the functions fϕi g may be chosen to be functions of observations. In our paper we decide to check whether 〈uk ; ðA−BÞuk 〉 ¼ 0 as well as 〈vk ; ðA−BÞvk 〉 ¼ 0 for k ¼ 1; …; K, where fuk g are eigenfunctions of A that correspond to K largest eigenvalues fλk g, k ¼ 1; …; K of A, and fvk g are eigenfunctions of B that correspond to K largest eigenvalues fμk g; k ¼ 1; …; K of B. This is true if and only if AK ¼ BK , where the operator AK corresponds to the function AK ¼ ∑Kk ¼ 1 λk uk ðtÞuk ðsÞ and the operator BK corresponds to the function BK ¼ ∑Kk ¼ 1 μk vk ðtÞvk ðsÞ and this is true if and only if λ1 ¼ μ1 ; …; λK ¼ μK and u1 ¼ v1 ; …; uK ¼ vK . It may happen that we are interested in detection of AK ≠BK , i.e., we consider to test the null hypothesis AK ¼ BK against the alternative AK ≠BK . If the hypothesis AK ¼ BK is rejected, we may be interested in the question which sources caused the rejection. If the hypothesis AK ≠BK is not rejected and if ∥A−AK ∥ as well as ∥B−BK ∥ are relatively small, we may conclude that if A and B differ from each other, then they differ only slightly. From many examples of principal component analysis we know that for some data a small number K of sources exist that are able to express a large proportion of total variability. The favorable situation occurs when K is known apriori from a similar type of data. For instance, analyzing covariance matrices of 365-dimensional vectors that correspond to smooth annual cycles of 16 small Czech rivers in the period 1935–1996, we have seen that the four largest principle components explained 76–83% and the five largest principle components 82–88% of the total variance. (The vectors were obtained by smoothing daily values by a kernel smoothing technique using the Epanechnikov window with a bandwidth of h¼15.) These principle components explained the most important sources of variance, i.e., the time and length of the spring high discharge period caused by snow melting, and variability of mean winter discharges. If, for example, we would like to test the stability of covariance matrices of smoothed annual cycles for a small Czech river, we would use K ¼5. If we do not have such prior information we may try to choose K based on proportions of K principle components in the estimated total variability. In change point analysis we usually test a null hypothesis H 0cp claiming that all observations have the same distribution against an alternative claiming that at some unknown time point a specific characteristic of distribution has changed. Derivation of a test statistic in change point detection usually has two steps. First, an appropriate test statistic for twosample problems for a fixed and known change point is suggested. If the change point is unknown, one can calculate such a test statistic for any possible change point 1≤j≤N so that a sequence of test statistics is obtained. Then, the test statistic for an unknown change point is a certain functional of that sequence, usually a sum or a maximum. In our paper we recommend applying a sum of weighted two-sample statistics with weights that decrease with N 1 N2 =N2 where N1 and N2 corresponds to a number of observations in the first part, respectively of the second part of the series. In the two-sample problem the asymptotic distribution under H0 of the suggested test statistics is a χ 2 distribution. The proof can be obtained similarly as in Panaretos et al. (2010) for Gaussian processes or in Fremdt et al. (2012) for nonGaussian processes. In the corresponding change point problem we show that under the null hypothesis the test statistic converges in distribution to a integral of a sum of squares of independent Brownian bridges. The paper is organized as follows. In Section 2 we suggest the test statistics for the two-sample problem in case of Gaussian processes as well as non-Gaussian processes and derive their asymptotic distribution. In Section 3 we suggest the test statistics for the corresponding change-point problem for both Gaussian processes and non-Gaussian processes and derive their asymptotic distribution. In Section 4 we present two applications. The first one comes from civil engineering and was motivation for our study of the corresponding change point problem. The second application comes from climatology and the goal of a statistic inference was a comparison of annual cycles of Milan and Padua temperature series. 2. Two-sample test We observe two independent sequences of i.i.d. zero mean processes X 1 ðtÞ; …; X N1 ðtÞ and Y 1 ðtÞ; …; Y N2 ðtÞ defined for R1 R1 t∈½0; 1 such that E 0 X 41 ðtÞ dt o ∞ and E 0 Y 41 ðtÞ dt o ∞. Let N ¼ N 1 þ N 2 . We suppose that the covariance functions Aðt; sÞ ¼ E X 1 ðtÞX 1 ðsÞ and Bðt; sÞ ¼ E Y 1 ðtÞY 1 ðsÞ are continuous functions on ½0; 12 . We denote the corresponding covariance operator of X1 defined by the kernel Aðt; sÞ by A and the covariance operator of Y1 defined by the kernel Bðt; sÞ by B Z 1 Z 1 ðAvÞðtÞ ¼ Aðt; sÞvðsÞ ds; ðBÞvðtÞ ¼ Bðt; sÞvðsÞ ds; v∈L2 ½0; 1: 0

0

1502

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

In what follows we denote L2 ½0; 1 a Hilbert space of the square integrable functions on ½0; 1 with a scalar product R1 〈f ; g〉 ¼ 0 f ðtÞgðtÞ dt and ∥ ∥ the corresponding L2 norm. For a symmetric operator K given by a kernel function Kðt; sÞ we use ﬃ qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ R1 R1 2 a Hilbert–Schmid norm ∥jK∥j ¼ Kðt; sÞ dt ds , in addition to the uniform norm ∥K∥. 0 0 We estimate the covariance function Aðt; sÞ and Bðt; sÞ by N1 b sÞ ¼ 1 ∑ X i ðtÞX i ðsÞ; Aðt; N1 i ¼ 1

N2 b sÞ ¼ 1 ∑ Y i ðtÞY i ðsÞ: resp: Bðt; N2 i ¼ 1

b and B b The corresponding operators are A Z Z 1 b sÞvðsÞ ds; ðBvÞðtÞ b b Aðt; ¼ ðAvÞðtÞ ¼ 0

1

b sÞvðsÞ ds; v∈L2 ½0; 1: Bðt;

0

Following the synopsis of Bosq (2000), Chapter 1, Lemma 1.3 there exist expansions: ∞

Aðt; sÞ ¼ ∑ λk uk ðtÞuk ðsÞ; k¼1

∞

Bðt; sÞ ¼ ∑ μk vk ðtÞvk ðsÞ; t∈½0; 1; s∈½0; 1; k¼1

where fλk g; fuk ðtÞ; t∈½0; 1g are eigenelements of A and fμk g; fvk ðtÞ; t∈½0; 1g are eigenelements of B and the processes fXðtÞ; t∈½0; 1 and fYðtÞ; t∈½0; 1 have corresponding Karhunen–Loève expansions, see Bosq (2000), Chapter 1, Theorem 1.5. Clearly Z 1Z 1 λk ¼ uk ðtÞAðt; sÞuk ðsÞ dt ds ¼ 〈uk ; Auk 〉; 0 0 Z 1Z 1 μk ¼ vk ðtÞBðt; sÞvk ðsÞ dt ds ¼ 〈vk ; Bvk 〉: 0

0

For k ¼ 1; 2; … we also introduce νk ¼ 〈vk ; Avk 〉 and κk ¼ 〈uk ; Buk 〉. For a K∈N we define K

AK ðt; sÞ ¼ ∑ λi ui ðtÞui ðsÞ; i¼1

K

BK ðt; sÞ ¼ ∑ μi vi ðtÞvi ðsÞ; t∈½0; 1; s∈½0; 1 i¼1

and denote the corresponding operators AK and BK . We suppose that λ1 4 λ2 4 ⋯ 4 λK 4 λKþ1 ≥λKþ2 ≥⋯ and μ1 4 μ2 4 ⋯ 4 μK 4 μKþ1 ≥μKþ2 ≥⋯. We would like to stress that the assumption that all K þ 1 first eigenvalues fλi gKþ1 i ¼ 1 and fμi gKþ1 are distinct is crucial. i¼1 We consider a problem of testing the null hypothesis H0 claiming AK ¼ BK against the alternative A claiming AK ≠BK . As we explained in the Introduction the idea behind the decision problem is the following. It may happen that we know a priori that for some ϵ 4 0 there exists a K∈N such that ∥A−AK ∥ ¼ sup∥x∥≤1 ∥ðA−AK ÞðxÞ∥≤ϵ and ∥B−BK ∥ ¼ sup∥x∥≤1 ∥ðB− BK ÞðxÞ∥≤ϵ then supposing that for a function v0 ∈L2 ½0; 1 such that ∥v0 ∥≤1 it holds j〈v0 ; Av0 〉−〈v0 ; Bv0 〉j 4δ then j〈v0 ; Ak v0 〉−〈v0 ; Bk v0 〉j4 δ−2ϵ. General case Before we suggest the test statistic for testing the null hypothesis H0 against the alternative hypothesis A we introduce some vectors and matrices. For k ¼ 1; …; K we introduce ηuX i ðkÞ ¼ 〈uk ; X i 〉2 ; ηvX i ðkÞ ¼ 〈vk ; X i 〉2 , i ¼ 1; …; N 1 and ηuY i ðkÞ ¼ 〈uk ; Y i 〉2 , ηvY i ðkÞ ¼ 〈vk ; Y i 〉2 , i ¼ 1; …; N2 . The vectors ηuX i ¼ ðηuX i ð1Þ; …; ηuX i ðKÞÞT , i ¼ 1; …; N1 , are independent with a mean vector λ ¼ ðλ1 ; …; λK ÞT and a covariance matrix LuX . The vectors ηvX i ¼ ðηvX i ð1Þ; …; ηvX i ðKÞÞT ; i ¼ 1; …; N1 , are independent with a mean vector ν ¼ ðν1 ; …; νK ÞT and a covariance matrix LvX . The vectors ηuY i ¼ ðηuY i ð1Þ; …; ηuY i ðKÞÞT ; i ¼ 1; …; N 2 , are independent with a mean vector κ ¼ ðκ 1 ; …; κK ÞT and a covariance matrix LuY . The vectors ηvY i ¼ ðηvY i ð1Þ; …; ηvY i ðKÞÞT ; i ¼ 1; …; N2 , are independent with a mean vector μ ¼ ðμ1 ; …; μK ÞT and a covariance matrix LvY . Here we have to add one more assumption that we shall need later, i.e., we assume that the matrices LuX and LvY are positively definite. b while μ b1 ; u b 2 ; … eigenfunctions of A b1 ; v b2 ; … We denote b λ 1 ≥b λ 2 ≥⋯ eigenvalues and u b1 ≥b μ 2 ≥⋯ eigenvalues and v b For k ¼ 1; 2; … we introduce u~ k ¼ sgnð〈u b k ; uk 〉Þuk , v~ k ¼ sgnð〈v bk ; vk 〉Þvk . Now, we state basic well-known eigenfunctions of B. inequalities. For k ¼ 1; 2; … it holds b b b k −u~ k ∥≤cuk ∥jA−A∥j; jλbk −λk j≤∥jA−A∥j; ∥u pﬃﬃﬃ p ﬃﬃﬃ where cu1 ¼ 2 2ðλ1 −λ2 Þ−1 , cuk ¼ 2 2 maxfðλk−1 −λk Þ−1 ; ðλk −λkþ1 Þ−1 g for k≥2,

ð1Þ

b b bk −v~ k ∥≤cvk ∥jB−B∥j; ∥v ð2Þ jμbk −μk j≤∥jB−B∥j; pﬃﬃﬃ p ﬃﬃﬃ −1 −1 −1 v v where c1 ¼ 2 2ðμ1 −μ2 Þ , ck ¼ 2 2maxfðμk−1 −μk Þ ; ðμk −μkþ1 Þ g for k≥2, see e.g. (2.6) and (3.9) of Hall and Hosseini-Nasab (2009) or (4.43) and (4.44) of Bosq (2000). pﬃﬃﬃﬃ pﬃﬃﬃﬃ b b ∥jA−A∥j ¼ OP ð1= NÞ, p∥j Lemma 2.1. Assume that pﬃﬃﬃﬃ N 1 =N-θ∈ð0; 1Þ as pﬃﬃﬃﬃ N-∞. It holdspﬃﬃﬃﬃ ﬃﬃﬃﬃB−B∥j ¼ OP ð1= NÞ, and for b k −u~ k ∥ ¼ OP ð1= N Þ, ∥v bk −v~ k ∥ ¼ OP ð1= N Þ, jb k ¼ 1; …∥u λ k −λk j ¼ OP ð1= N Þ and jb μ k −μk j ¼ OP ð1= NÞ. Proof. The proof follows from the inequalities (1), (2), see e.g. Dauxois et al. (1982) or Bosq (2000), p. 123.

□

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1503

b b b u, Δ b v, Δ bu, Δ b v with respective components To simplify the notation we introduce K-dimensional vectors Δ N1 N2 b BÞu b k 〉 ¼ 1 ∑ ηu ðkÞ− 1 ∑ ηu ðkÞ; b u ðkÞ ¼ 〈uk ; ðA− Δ N1 i ¼ 1 Xi N2 i ¼ 1 Y i N1 N2 b BÞv b k 〉 ¼ 1 ∑ ηv ðkÞ− 1 ∑ ηv ðkÞ; b v ðkÞ ¼ 〈vk ; ðA− Δ N1 i ¼ 1 Xi N2 i ¼ 1 Y i

b b BÞ b u b u ðkÞ ¼ 〈u b k ; ðA− bk 〉 ¼ b Δ λ k −λ~ k ; b b BÞ b v b v ðkÞ ¼ 〈v bk ; ðA− bk 〉 ¼ μ~ k −b Δ μk: b b b v ðkÞ do not depend on the sign of u b u ðkÞ and Δ b k , respectively v bk ; k ¼ 1; …; K. We follow the Notice that the values of Δ R R b k such that 01 u bk such that 01 v b k uk ≥0, respectively v bk vk ≥0. approach of Hall and Hosseini-Nasab (2009) choosing here u u u u b is ð1=N 1 ÞL þ ð1=N 2 ÞL , while the variance–covariance matrix of Δ b v is The variance–covariance matrix of Δ X Y u v v b ð1=N 1 ÞLX þ ð1=N 2 ÞLY . Finally, we introduce a K K dimensional matrix L X with the components ! ! N1 1 N1 1 N1 2 b 2 2 2 bu ðk; k′Þ ¼ 1 ∑ 〈u b b b L ; X 〉 〈 u ; X 〉 − ∑ 〈 u ; X 〉 ∑ 〈 u ; X 〉 ; i i i i k′ X N1 i ¼ 1 k N1 i ¼ 1 k N 1 i ¼ 1 k′ bu , L bv . bv , L and analogously K K dimensional matrices L X Y Y Now, we come to the basic definition. For testing H0 against A we suggest to apply the test statistic u v T 1 ¼ ðTb 1 þ Tb 1 Þ=2;

ð3Þ

where −1 b b b bu þ N 1 L bu b u ÞT N 2 L b u ¼ N N1 N2 ðΔ bu; Δ Δ X Y 2 N N N −1 −1 b b b b v bv þ 1 L bv bv þ N1 L bv b v ÞT 1 L b v ÞT N 2 L b v ¼ N N 1 N 2 ðΔ bv: Δ Δ Tb 1 ¼ ðΔ X Y X Y 2 N1 N2 N N N

b u b u ÞT Tb 1 ¼ ðΔ

1 bu 1 bu L þ L N1 X N2 Y

−1

Theorem 2.1. Assume that N1 =N-θ∈ð0; 1Þ as N-∞. Further, assume that H0 holds true, i.e., AK ¼ BK . Assume that K þ 1 first eigenvalues of A and B are distinct and moreover assume that the matrices LuX and LvY are positively definite. Then, the variable T 1 is asymptotically distributed according to the χ 2 distribution with K degrees of freedom. Proof. Suppose H0 is true, i.e., AK ¼ BK so that ðA−BÞuk ¼ 0 and ðA−BÞvk ¼ 0 for k ¼ 1; …; K. We introduce the variables −1 −1 b u ÞT 1 Lu þ 1 Lu b u ÞT N2 Lu þ N1 Lu b u ¼ N 1 N 2 NðΔ b u; T u1 ¼ ðΔ Δ Δ X Y X Y 2 N1 N2 N N N b v ÞT T v1 ¼ ðΔ

1 v 1 v L þ L N1 X N2 Y

−1

−1 b v ÞT N2 Lv þ N 1 Lv b v ¼ N1 N 2 NðΔ b v: Δ Δ X Y 2 N N N

Under H0 it holds that u1 ¼ v1 ; …; uk ¼ vk and λ1 ¼ μ1 ; …; λK ¼ μK . It means that for i ¼ 1; …; N 1 the vectors ηuX i are the same as the vectors ηvX i and for i ¼ 1; …; N2 the vectors ηuY i are the same as the vectors ηvY i . The vectors fηuX i g and fηuY i g have the same mean value vector ðλ1 ; …; λK ÞT but may have different covariance matrices LuX and LuY . b u ðkÞ ¼ Δ b v ðkÞ as follows: We express Δ 1 N1 u 1 N2 u b b b v ðkÞ ¼ 〈uk ; ðA−AÞu b u ðkÞ ¼ Δ ∑ ðηX i ðkÞ−λk Þ− ∑ ðη ðkÞ−λk Þ: Δ k 〉−〈uk ; ðB−BÞuk 〉 ¼ N1 i ¼ 1 N2 i ¼ 1 Y i As N-∞, N1 =N-θ∈ð0; 1Þ and N 2 =N-ð1−θÞ, the variable T u1 ¼ T v1 is asymptotically distributed according to a χ 2 distribution with K degrees of freedom. bu Þ−1 ¼ ððN2 =NÞL bu þ ðN 1 =NÞL bu Þ−1 and the matrix ðL bv Þ−1 ¼ ððN 2 =NÞL bv þ ðN1 =NÞL bv Þ−1 converge both in The matrix ðL X;Y X Y X;Y X Y u u −1 probability to the matrix L−1 ¼ ðð1−θÞLvX þ θLvY Þ−1 and hence the limit distribution of the statistic X;Y ¼ ðð1−θÞLX þ θLY Þ N1 N2 2N 2

u

u

u

u

v

u

b Þ−1 Δ b Þ−1 Δ b ÞT ðL b ÞT ðL b þ ðΔ b Þ NððΔ X;Y X;Y

is again the χ 2 distribution with K degrees of freedom. Now, we need to show that b bu −1 b u −1 b u b u ÞT ðL bu bu T b ¼ oP ð1Þ; N ðΔ X;Y Þ Δ −ðΔ Þ ðL X;Y Þ Δ

ð4Þ

1504

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

b b bv Þ−1 Δ bv Þ−1 Δ b v ÞT ðL b v ÞT ðL b v −ðΔ b v ¼ oP ð1Þ: N ðΔ X;Y X;Y

ð5Þ

pﬃﬃﬃﬃﬃﬃ pﬃﬃﬃﬃﬃﬃ b b The limit behavior of (4) and (5) follow from the fact that ∥jA−A∥j ∥jﬃﬃﬃﬃB−B∥j ¼ OP ð1= N 2 Þ as pﬃﬃﬃﬃ ¼ OP ð1= N1 Þ as N 1 -∞ andp b b N 2 -∞. As N1 =N-θ∈ð0; 1Þ for N-∞ it holds ∥jA−A∥j ¼ OP ð1= N Þ as well as ∥jB−B∥j ¼ OP ð1= N Þ. We shall prove (4) because the proof of (5) is similar. First, we realize that b BÞu b b k 〉 ¼ 〈uk ; ðA−AÞu b b u ðkÞ ¼ 〈uk ; ðA− Δ k 〉 þ 〈uk ; ðB−BÞuk 〉; so that u

b b b ðkÞj≤∥jA−A∥j jΔ þ ∥jB−B∥j pﬃﬃﬃﬃ u b and therefore Δ ðkÞ ¼ OP ð1= N Þ. b b u ðkÞ−Δ b u ðkÞj. We may write Further, we would like to find an upper bound for jΔ

ð6Þ

b b BÞ b BÞu b u b k〉 b u ðkÞ ¼ 〈u b u ðkÞ−Δ b k ; ðA− b k 〉−〈uk ; ðA− Δ b b BÞu b b k〉 b k −uk Þ〉 þ 2〈ðu b k −uk Þ; ðA− b k −uk Þ; ðA−BÞðu ¼ 〈ðu b b u b k −uk Þ; ðA−AÞð b k −uk Þ〉 þ 〈ðu b k −uk Þ; ðB−BÞð b k −uk Þ〉 ¼ 〈ðu u b b þ〈ðu k −uk Þ; ðA−BÞðu k −uk Þ〉 b b k〉 b k −uk Þ; ðA−AÞu b k −uk Þ; ðB−BÞu þ2〈ðu k 〉 þ 2〈ðu b k −uk Þ; ðA−BÞuk 〉: þ2〈ðu Notice that the last term is equal to zero. Denote C u ¼ maxðcu1 ; …; cuK Þ, where cu1 ; …cuk were introduced in (1). We may write b b u ðkÞj b u ðkÞ−Δ jΔ b b b k −uk ∥2 ∥jB−B∥j b k −uk ∥2 ∥jA−B∥j b k −uk ∥2 ∥jA−A∥j þ ∥u þ ∥u ≤∥u b b b k −uk ∥ ∥jA−A∥j þ 2∥u b k −uk ∥ ∥jB−B∥j þ 2∥u 3 2 b 2 b b b ≤ðC u Þ2 ð∥jA−A∥j þ ∥jA−A∥j ∥jB−B∥j þ ∥jA−A∥j ∥jA−B∥jÞ u 2 b b b þ 2C ð∥jA−A∥j þ ∥jA−A∥j ∥jB−B∥jÞ:

ð7Þ

2 b ∥jA−B∥j depends on the ∥jA−B∥j that may be non-zero. However, for A and B fixed the Notice that the term ∥jA−A∥j b b b u ðkÞ−Δ b u ðkÞ ¼ OP ð1=NÞ. This also means that ðΔ b u ðkÞ−Δ b u ðkÞÞ2 ¼ OP ð1=N2 Þ. difference Δ Finally, we will find an upper bound for

b b b b b bu Þ−1 Δ bu Þ−1 Δ bu Þ−1 ðΔ bu Þ−1 ðΔ b u ÞT ðL b u ÞT ðL b u −Δ b u ÞT ðL b u −Δ b u Þ þ 2ðΔ b u ÞT ðL b u −Δ b u Þ: b u −ðΔ b u ¼ ðΔ ðΔ X;Y X;Y X;Y X;Y It holds b b b bu Þ−1 ðΔ bu Þ−1 ðΔ b u −Δ b u ÞT ðL b u −Δ b u Þ þ 2ðΔ b u ÞT ðL b u −Δ b u Þj≤ jðΔ X;Y X;Y !2 !1=2 K K b b bu Þ−1 ∥ ∑ ðΔ b u ðkÞ−Δ b u ðkÞ þ2 ∑ ðΔ b u ðkÞ−Δ b u ðkÞÞ2 ∥ðL X;Y

k¼1

k¼1

K

!1=2 u

b ðkÞÞ2 ∑ ðΔ

k¼1

:

bu þ ðN1 =NÞL bu Þ−1 converges in probability to the matrix ðð1−θÞLu þ θLu Þ−1 and therefore its bu Þ−1 ¼ ððN2 =NÞL The matrix ðL X;Y X Y X Y norm is bounded in probability. Therefore, it holds pﬃﬃﬃﬃ b b bu Þ−1 Δ bu Þ−1 Δ b u ÞT ðL b u ÞT ðL b u −ðΔ b u Þ ¼ OP ð1= NÞ NððΔ X;Y X;Y and similarly also pﬃﬃﬃﬃ b b bv Þ−1 Δ bv Þ−1 Δ b v ÞT ðL b v ÞT ðL b v −ðΔ b v Þ ¼ OP ð1= N Þ: NððΔ X;Y X;Y

□

For studying the behavior of T 1 under the alternative A we need the following lemma. Lemma 2.2. Let AK ≠BK then there exists k≤K such that 〈uk ; ðA−BÞuk 〉≠0 or 〈vk ; ðA−BÞvk 〉≠0. Proof. The assertion will be proved by contradiction. Suppose that 〈uk ; ðA−BÞuk 〉 ¼ 0 and 〈vk ; ðA−BÞvk 〉 ¼ 0 for k ¼ 1; …; K. Denote Δ1 ¼ λ1 −λ2 , Δ2 ¼ λ2 −λ3 , … and δ1 ¼ μ1 −μ2 , δ2 ¼ μ2 −μ3 ; …. Clearly Δk 4 0, k ¼ 1; …; K, Δk ≥0; k ¼ K þ 1; … and 2 2 ∞ δk 40; k ¼ 1; …; K, δk ≥0; k ¼ K þ 1; …. For any x∈L2 ½0; 1 it holds ∥x∥2 ¼ ∑∞ i ¼ 1 ð〈x; u〉Þ ¼ ∑i ¼ 1 ð〈x; v〉Þ . 2 ∞ From the equality 〈u1 ; Au1 〉 ¼ 〈u1 ; Bu1 〉 it holds λ1 ¼ ∑i ¼ 1 μi ð〈u1 ; vi 〉Þ ≤μ1 and from symmetry λ1 ≥μ1 , so that λ1 ¼ μ1 . Further, λ1 ¼ μ1 ð〈u1 ; v1 〉Þ2 þ ðμ1 −Δ1 Þð〈u1 ; v2 〉Þ2 þ ðμ1 −Δ1 −Δ2 Þð〈u1 ; v3 〉Þ2 þ ⋯: Then 〈u1 ; v1 〉 ¼ 1 and 〈u1 ; vi 〉 ¼ 0 for i 4 1, it follows u1 ¼ v1 .

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1505

2 From the equality 〈u2 ; Au2 ðsÞ〉 ¼ 〈u2 ; Bu2 〉 it holds (as 〈u2 ; u1 〉 ¼ 〈u2 ; v1 〉 ¼ 0) λ2 ¼ λ1 ð〈u2 ; u1 〉Þ2 þ ∑∞ i ¼ 2 μi ð〈u2 ; vi 〉Þ ¼ and from the symmetry λ2 ¼ μ2 and u2 ¼ v2 . Sequentially we get λk ¼ μk and uk ¼vk for k ¼ 1; 2; …; K. □

2 ∑∞ i ¼ 2 μi ð〈u2 ; vi 〉Þ ≤μ2

Theorem 2.2. Assume that N1 =N-θ∈ð0; 1Þ as N-∞. Assume that K þ 1 first eigenvalues of A and B are distinct and moreover assume that the matrices LuX and LvY are positively definite. We consider the alternative A claiming AK ≠BK . Then, the test based on the test statistic T 1 is consistent. Proof. The proof is based on Lemma 2.2 claiming that at least one of the vectors Δu ¼ ð〈u1 ; ðA−BÞu1 〉; …; 〈uK ; ðA−BÞuK 〉ÞT , Δv ¼ ð〈v1 ; ðA−BÞv1 〉; …; 〈vK ; ðA−BÞvK 〉ÞT is a non-zero vector. Moreover, according to our assumption the matrices LuX and LvY are both positively definite. Using again Lemma 2.1 we obtain in the same way as in Fremdt et al. (2012), Theorem 3, the following convergences: 1 bu P T ⟶θð1−θÞðΔu ÞT ðð1−θÞLuX þ θLuY Þ−1 Δu ; N 1 1 bv P T ⟶θð1−θÞðΔv ÞT ðð1−θÞLvX þ θLvY Þ−1 Δv : □ N 1

Gaussian case If the processes fX i ðtÞg and fY i ðtÞg are Gaussian, then the situation becomes simple because var ηuX 1 ðkÞ ¼ 2λ2k , var ηvY 1 ðkÞ ¼ 2 μ2k , k ¼ 1; …; K and covðηuX 1 ðkÞ; ηuX 1 ðk′ÞÞ ¼ 0, covðηvY 1 ðkÞ; ηvY 1 ðk′ÞÞ ¼ 0, k≠k′. It means that the matrix LuX and LvY are diagonal with diagonals ð2λ21 ; …; 2λ2K Þ, resp. ð2 μ21 ; …; 2 μ2K Þ. Therefore, they are positively definite. Moreover, under H0 the matrices LuX and LvY are the same, i.e., LuX ¼ LvY : It follows that for Gaussian processes we may use for testing H0 against A the test statistic K

u

v

T 2 ¼ ∑ ððTb 2 ðkÞÞ2 þ ðTb 2 ðkÞÞ2 Þ=2;

ð8Þ

k¼1

where u Tb 2 ðkÞ ¼

pﬃﬃﬃﬃ N

sﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ N1 N2 2N 2

b b u ðkÞ Δ N1 b λ k þ N2 λ~ k N

v Tb 2 ðkÞ ¼

and

N

sﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ b b v ðkÞ pﬃﬃﬃﬃ N 1 N 2 Δ N ; bk 2 N 2 NN1 μ~ k þ NN2 μ

bv bu bk ; B b k 〉, μ~ k ¼ 〈v bk ; A bk 〉. where λ~ k ¼ 〈u Theorem 2.3. Assume that N1 =N-θ∈ð0; 1Þ as N-∞. Further, assume that H0 holds true, i.e., AK ¼ BK . Assume that K þ 1 first eigenvalues of A and B are distinct. Then, the variable T 2 is asymptotically distributed according to the χ 2 distribution with K degrees of freedom. Theorem 2.4. Assume that N1 =N-θ∈ð0; 1Þ as N-∞. Assume that K þ 1 first eigenvalues of A and B are distinct. We consider the alternative A claiming that AK ≠BK . Then, the test based on the test statistic T2 is consistent. More specifically, it holds ! pﬃﬃﬃﬃpﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 〈uk ; ðA−BÞuk 〉 u b ¼ OP ð1Þ; T 2 ðkÞ− N θð1−θÞ pﬃﬃﬃ 2ðθλk þ ð1−θÞκk Þ ! pﬃﬃﬃﬃpﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ 〈vk ; ðA−BÞvk 〉 v b T 2 ðkÞ− N θð1−θÞ pﬃﬃﬃ ¼ OP ð1Þ: 2ðθνk þ ð1−θÞμk Þ Remark 2.1. Clearly an analogous test statistic to (3) or (8) may be applied in the situation when our observations are two sequences of i.i.d. p-dimensional random vectors. The considered Hilbert space Rp is the space of p-dimensional vectors with a scalar product 〈x; y〉 ¼ xT y. Any symmetric operator A on this space is represented by a nonnegative symmetric matrix A by 〈y; ðAxÞ〉 ¼ yT Ax. Panaretos et al. (2010) considered the problem of testing A ¼ B against A≠B in the Gaussian case. They suggest to apply the test statistic K

K

K

K

i−1

∑ ∑ ðJ N ði; kÞÞ2 ¼ ∑ ðJ N ðk; kÞÞ2 þ 2 ∑ ∑ ðJ N ði; kÞÞ2 ;

k¼1i¼1

k¼1

i¼1k¼1

where J N ði; kÞ ¼

rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ b BÞ b ϕ b i ; ðA− bk〉 N1 N2 〈ϕ qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ ﬃ; 2N bϕ bϕ b i 〉〈ϕ bk; C bk〉 bi; C 〈ϕ

ð9Þ

1506

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

b with the kernel b k ðtÞ; t∈½0; 1g are eigenfunctions of the operator C where the functions fϕ ! N1 N2 b ðt; sÞ ¼ 1 ∑ X i ðtÞX i ðsÞ þ ∑ Y i ðtÞY i ðsÞ C N i¼1 i¼1 pﬃﬃﬃ It is clear that under H0 the statistics fJ N ðk; kÞ; k ¼ 1; …; Kg; f 2J N ði; kÞ; 1≤i o k≤Kg are asymptotically uncorrelated and have Nð0; 1Þ distribution so that (9) has asymptotically χ 2 distribution with K ðK þ 1Þ=2 degrees of freedom. The idea of using K eigenfunctions corresponding to K largest eigenvalues of the pooled sample covariance operators as a set of test functions was also applied by Horváth et al. (2010) or by Fremdt et al. (2012). One may ask how to choose K. Fremdt et al. (2012) suggested to choose K in a way that ∑Ki¼ 1 b λ i =∑all i b λ i as well as ∑Ki¼ 1 μ bi =∑all i μ bi are larger than 0.85–0.90. The following example shows that the described procedure do not reject A ¼ B even if AK ≠BK and ∑Ki¼ 1 λi =∑all i λi as well as ∑Ki¼ 1 μi =∑all i μi are both larger than 0.9. Let the operators A and B be represented by the matrix A, resp. B: 10 0 10 0 A¼ ; B¼ : 0 1 0 90 The first principal component explains 90% of total variability of A and the same is true for B. Therefore, according to the described rule p¼1. Supposing N1 ¼ 0:9N and N2 ¼ 0:1 N, the covariance matrix of the pooled sample 10 0 C ¼ 0:9A þ 0:1B ¼ 0 9:9 has the largest eigenvalue 10 and the corresponding eigenvector ð1; 0ÞT . As ð1; 0ÞðA−BÞð1; 0ÞT ¼ 0, the procedure with p ¼1 does not detect that A1 ≠B1 (and A≠B) even if N is very large. Clearly, the test statistics (3) and (8) are aimed to detect AK ≠BK while they may not be able to detect other departures from A ¼ B. The test of Panaretos et al. (2010) is not always able to detect AK ≠BK (and indeed also A≠B) even if AK explains a large proportion of variability of A, and similarly BK of B. u Remark 2.2. Simulations show that in spite of the fact that under H0 the variables Tb 2 ðkÞ have asymptotically zero mean u v normal distribution, they may have a bias even for N1 and N2 relatively large. The reason why for instance Tb 2 ð1Þ (and Tb 2 ð1Þ) b 1 is a eigenfunction may have a bias under H0 can be seen from the following equalities. We suppose that A ¼ B and u b It holds corresponding to the largest eigenvalues of A.

b BÞ bu bu b 1 〉Þ þ Eð〈u1 ; Bu1 〉−〈u b u bu b 1 〉 ¼ E〈u b1 ; A b 1 〉−〈u1 ; Au1 〉 þ 〈u1 ; Bu1 〉−E 〈u b1 ; B b 1 〉 ¼ Eð〈u b1 ; A b 1 〉−〈u1 ; Au b 1 ; Bu b 1 〉Þ: b 1 ; ðA− E〈u Clearly, the variables in both parentheses are non-negative. u We proved that the asymptotic bias of Tb 2 ðkÞ; k ¼ 1; 2; … depends strongly on distances between eigenvalues as it holds rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ pﬃﬃﬃﬃ u λj 1 ð1−θÞ þ oð1= NÞ as N-∞: ETb 2 ðkÞ ¼ pﬃﬃﬃﬃ 2∑ 2θ λ −λ N j j:j≠k k The proof that may be based on the results presented by Hall and Hosseini-Nasab (2006, 2009) can be received from the author on request. 3. Detection of a change General case At time points i ¼ 1; …; N we observe a sequence of independent zero mean processes fX 1 ðtÞg; …; fX N ðtÞg defined on ½0; 1 R1 such that E 0 X i ðtÞ4 o ∞. We denote the covariance operator of fX i ðtÞg defined by a kernel Ai ðt; sÞ ¼ E X i ðtÞX i ðsÞ by Ai . The operators Ai;K are defined analogously to the definition of AK in the previous section. The null hypothesis H 0cp claims that the distributions of all fX i ðtÞg; i ¼ 1; …; N are equal. In such a case we denote by A its covariance operator and fuk g its eigenfunctions. (Notice that A ¼ A1 ¼ ⋯ ¼ AN .) We test the hypothesis H 0cp against the alternative hypothesis Acp claiming that there exists a time point j such that Ai;K ¼ AK for all i≤j and Ai;K ¼ BK for all i4 j, where AK ≠BK . cj and A b n are given by the respective kernels The operators A j j b j ðt; sÞ ¼ 1 ∑ X i ðtÞX i ðsÞ; A j i¼1

N b n ðt; sÞ ¼ 1 ∑ X i ðtÞX i ðsÞ A j N−j i ¼ jþ1

b j and the functions b kj ðtÞ; k ¼ 1; …; Kg are eigenfunctions corresponding to K largest eigenvalues of A The functions fu n b b fv kj ðtÞ; k ¼ 1; …; Kg are eigenfunctions corresponding to K largest eigenvalues of A j . b u b bvΔ b v ; j ¼ 1; …; N−1, with the respective components bu, Δ b , Δ Similarly as before we introduce the vectors Δ j j j j b b n n n u v u b j −A b Þu b j −A b Þv b j −A b Þuk 〉, Δ b j −A b n Þvk 〉; k ¼ 1; …; K. Notice that b ðkÞ ¼ 〈u b ðkÞ ¼ 〈v b ðkÞ ¼ 〈uk ; ðA b v ðkÞ ¼ 〈vk ; ðA b kj ; ðA b kj 〉, Δ bkj ; ðA bkj 〉, Δ Δ j j j j j j j j

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1507

b be a matrix with the components bu ¼ Δ b v , j ¼ 1; …; N−1. Further, let L under H 0cp it holds Δ j j ! ! 1 N 1 N 1 N 2 2 2 2 b b ; X 〉 〈u b k′N ; X i 〉 − b ;X 〉 b ;X 〉 ; ∑ 〈u ∑ 〈u ∑ 〈u Lðk; k′Þ ¼ N i ¼ 1 kN i N i ¼ 1 kN i N i ¼ 1 k′N i k; k′ ¼ 1; …; K. For testing H 0cp against Acp we suggest to apply 0 u 1 v 1 N j N−j @Tb 1j þ Tb 1j A ∑ ; Nj¼1N N 2

ð10Þ

where u v Tb 1j þ Tb 1j

2

¼

b −1 b b u j N−j N b b b−1 Δ b Δ b v ÞT L b u þ ðΔ bvÞ ððΔ j ÞT L j j j N N 2

Theorem 3.1. Suppose H0cp holds true, i.e., distributions of fX i ðtÞg; i ¼ 1; …; N, are equal, with the covariance operator A and its eigenfunctions fuk g. Moreover, suppose that K+1 first eigenvalues of A are distinct and the matrix L with the components Lðk; k′Þ ¼ E〈uk ; X 1 〉2 〈uk′ ; X 1 〉2 −E〈uk ; X 1 〉2 E〈uk′ ; X 1 〉2 , k; k′ ¼ 1; …; K, is positively definite. Then, it holds 0 u 1 v Z 1 K 1 N j N−j @Tb 1j þ Tb 1j A D ∑ ∑ B2k ðtÞ dt; Nj¼1N N 2 0 k¼1 where fðB1 ðtÞ; …; BK ðtÞÞg is a K-dimensional Gaussian process whose components are independent Brownian bridges. Proof. Suppose that H 0cp holds true, i.e., all covariance operators Ai ; i ¼ 1; …; N are equal to the same operator A. Then, it holds 2 Z 1 K 1 N j N−j 2 b u T b−1 b u D ∑ NðΔ j Þ L Δ j ∑ B2k ðtÞ dt; Nj¼1 N N 0 k¼1 see e.g. Horváth et al. (1999). Therefore, it is sufficient to prove that under H 0cp it holds that 2 b −1 b j N−j 2 b Δ b−1 Δ b u ÞT L b u ÞT L b u −ðΔ b u Þ ¼ oP ð1Þ; max NððΔ j j j j N 1≤j≤N−1 N 2 b −1 b j N−j 2 b Δ b−1 Δ b v ÞT L b u ÞT L b v −ðΔ b u Þ ¼ oP ð1Þ: NððΔ j j j j N 1≤j≤N−1 N

ð11Þ

ð12Þ

max

We shall prove (11). For j ¼ 1; …; N it holds n

u

b j −A∥j þ ∥jA b −A∥j; b ðkÞj≤∥jA max jΔ j j

1≤k≤K

see (6). Moreover, see (7), it holds that b b j −A∥j3 þ ∥jA b j −A∥j2 ∥jA b n −A∥jÞ b u ðkÞ−Δ b u ðkÞj≤ðC u Þ2 ð∥jA max jΔ j j j

1≤k≤K

n

b j −A∥j2 þ ∥jA b j −A∥j ∥jA b −A∥jÞ: þ2C u ð∥jA j Further, we use the following upper bound valid for all j ¼ 1; …; N−1: !2 2 2 K b u T −1 b u u T −1 u u u j N−j 2 b b j N−j 2 b−1 b b b b b b b NðΔ j Þ L Δ j −ðΔ j Þ L Δ j ≤ N∥L ∥ ∑ ðΔ j ðkÞ−Δ j ðkÞ N N N N k¼1 þ2

b b u ðkÞ−Δ b u ðkÞÞ2 ∑ ðΔ j j K

k¼1

!1=2

K

b u ðkÞÞ2 Þ ∑ ðΔ j

k¼1

1=2

!

:

The bounded law of iterated logarithm, see e.g. Dehling and Philipp (1982), yields that for any β 40 it holds b j −A∥j max1≤j≤N−1 j1=2 ∥jA Nβ

-0;

b n −A∥j max1≤j≤N−1 ðN−jÞ1=2 ∥jA j Nβ

-0

almost surely and it follows that (11) holds true. In the same way (12) may be proved.

□

Clearly, if for some K the alternative hypothesis Acp, with j ¼ ½θ n; θ∈ð0; 1Þ, is true then the test based on the test statistic (10) is consistent.

1508

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

Gaussian case Suppose that at time points i ¼ 1; …; N we observe a sequence of independent zero mean Gaussian processes fX 1 ðtÞg; …; fX N ðtÞg defined on ½0; 1 with the respective covariance operators Ai defined by a kernels Ai ðt; sÞ ¼ E X i ðtÞX i ðsÞ; i ¼ 1; …; N. The null hypothesis H 0cp claiming that distributions of all fX i ðtÞg are equal coincides with the hypothesis that all covariance operators Ai ; i ¼ 1; …; N, are equal. The alternative Acp claims that there exists a time point j such that Ai;K ¼ AK for all i≤j and Ai;K ¼ BK for all i 4 j, where AK ≠BK . For solving this testing problem we may use the test statistic K 1 N j N−j ∑ ðTb uj ðkÞÞ2 þ ðTb vj ðkÞÞ2 =2; ð13Þ ∑ N N N j¼1 k¼1 where Tb uj ðkÞ ¼

rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ b j −A b n Þu b kj ; ðA b kj 〉 jðN−jÞ 〈u j ; b Nu 2N b kN 〉 b kN ; A 〈u

Tb vj ðkÞ ¼

rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ b j −A b n Þv bkj ; ðA bkj 〉 jðN−jÞ 〈v j : b Nu 2N b kN 〉 b kN ; A 〈u

Theorem 3.2. Assume that H 0cp holds true, i.e., all processes fX i ðtÞ; t∈½0; 1g, i ¼ 1; …; N, are zero mean Gaussian processes with the same covariance operator A. Assume that K þ 1 first eigenvalues of A are distinct. Then, it holds b uj ðkÞÞ2 þ ðTb vj ðkÞÞ2 D Z 1 K K 1 N j N−j ðT ∑ ∑ B2k ðtÞ dt: ð14Þ ∑ 2 0 k¼1 k¼1Nj¼1N N

Clearly, if for some K the alternative hypothesis Acp with j ¼ ½θ n; θ∈ð0; 1Þ is true then the test based on the test statistic (13) is consistent. 4. Applications 4.1. Application 1 A tunnel primary lining thickness was measured at 48 profiles in 53 equidistant points. In this way we obtained 48 realizations of 53 dimensional vectors. The data set is available on www.mat.fsv.cvut.cz/Jaruskova/. The question is whether the distribution of these vectors has changed, or more specifically, whether their variance–covariance matrix has changed. Supposing that we observe normally distributed random vectors the test statistic (13) (for K ¼12) attains the value 3.5. Supposing that they may be not normal the test statistic (10) attains the value 5.5. The 5% asymptotic critical value obtained again by Aue, Gabrys et al. (2009) is 2.95. As the values of the test statistic for K ¼12 are larger than 5% critical values we may conclude that the distribution of the studied random vectors is not the same. The example shows that the values of the test statistics (10) and (13) may be quite different. Moreover, we are afraid that due to the limited amount of data the use of the asymptotic results is here not justified. 4.2. Application 2 Our original data were daily mean temperatures measured in two stations, namely in Milan in years 1763–1998 ðN1 ¼ 236Þ and in Padua in years 1766–1982 ðN2 ¼ 217Þ. The data were organized into vectors of 365 components representing annual cycles that were smoothed by a kernel smoothing technique using an Epanechnikov window with a bandwidth of h¼25. In this way the analyzed data are two samples of 236 and 217 random functions. (We ignore the fact that two samples are mutually dependent because they were obtained in the same time period.) The original data set comes from Camuffo and Jones (2002). The smoothed versions of the series are available on www.mat.fsv.cvut.cz/Jaruskova/. Table 1 The largest eigenvalues and corresponding proportions of total variability. k

b λk

b λ i (%) λ k =∑b

∑ki ¼ 1 b λ i (%) λ i =∑b

μ bk

μ bk =∑b μ i (%)

bi =∑b μ i (%) ∑ki ¼ 1 μ

1 2 3 4 5

153.4 103.4 57.4 35.0 25.7

37 25 14 8 6

37 62 76 84 91

210.2 131.1 70.8 37.1 25.5

41 26 14 7 5

41 67 81 87 93

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1509

Table 2 Values of test statistic (8) and corresponding p-values for K ¼ 1; …; 5. K

1

2

3

4

5

Test stat. p-values

6.32 0.012

9.46 0.009

13.98 0.003

14.19 0.007

14.28 0.014

3.5 3 2.5 2 1.5 1 0.5

0

50

100

150

200

250

300

350

400

Fig. 1. Estimates of the variance function for Milan (solid line) and for Padua (dashed line). 1 0.9999 0.9998 0.9997 0.9996 0.9995 0.9994

0

50

100

150

200

250

300

350

400

Fig. 2. Estimates of the correlation function with the lag 1 for Milan (solid line) and for Padua (dashed line).

The goal of our study is to compare the covariance operators of these two functional samples. Table 1 presents variances of five principal components together with their proportions of total variability and the sum of proportions in total variability represented by the first k components. Table 2 shows values of the test statistic (8) for K ¼ 1; …; 5 together with estimated p-values obtained from the limit χ 2 distribution. We supposed that the data are normally distributed so that we applied the test statistic (8). Choosing K¼ 1 the null hypothesis A1 ¼ B1 is rejected at the significance level α ¼ 0:05. The same is true if we choose any K between 2 and 5. We would recommend setting K ¼3 or K ¼4, as three principal components of the first as well as three principal components of the second sample explain more than 75% and 80%, respectively, of total variability. Fig. 1 represents estimates of the variance function varðXðtÞÞ for Milan, and varðYðtÞÞ for Padua ðt∈½0; 365Þ. Fig. 2 represents estimates of the correlation function with lag 1, i.e., covðXðtÞ; Xðt þ 1ÞÞ for Milan, and covðYðtÞ; Yðt þ 1ÞÞ for Padua. If we compare Milan and Padua data, we see that the estimated variance function for Padua data attains larger values than the variance function for Milan data over the whole year. Therefore, the total variability of the Padua data is larger than the total variability of the Milan data. More specifically, ∑b λ i ¼ 416 and ∑b μ i ¼ 514. It is also interesting to see that the difference between ∑b λ i ¼ 416 and ∑b μ i ¼ 514 is contained in the first three eigenvalues ∑3i ¼ 1 b λ i ¼ 314 (76% of total variability) and ∑3i ¼ 1 μ bi ¼ 414 (80% of total variability). We observe that years with cold winters are followed by years with warm winters. This is true for Milan as well as for Padua. In spite of the fact that in winter the estimated correlation functions with the lag 1 for Milan and Padua data do not differ substantially, the correlation function with the lag 1 for Padua data attains larger values everywhere except in winter. In the other words the Padua temperature oscillates around its mean annual cycle more slowly but with slightly larger amplitudes. The sample variance of Milan annual averages is 0.36, while the sample variance of Padua annual averages is 0.56. b 1 and v b1 , i.e., the first eigenfunctions of the Milan, respectively the Padua data. Similarly Fig. 4 Fig. 3 represents u b 2 and v b2 , etc. Among all pairs fu bi ; v bi g; i ¼ 1; 2; 3; 4 the difference between u b 1 and v b1 is represents the second eigenfunctions u the most striking. Both functions attain its maximum in the beginning of the year that corresponds to the fact that the main b1 source of variability comes from the large year-to-year differences in winter temperatures. However, the eigenfunction v

1510

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

0.12 0.1 0.08 0.06 0.04 0.02 0

0

50

100

150

200

250

300

350

400

b 1 (solid line) and v b1 (dashed line). Fig. 3. Eigenfunctions u

0.1 0.08 0.06 0.04 0.02 0 −0.02 −0.04 −0.06 −0.08 −0.1

0

50

100

150

200

250

300

350

400

b 2 (solid line) and v b2 (dashed line). Fig. 4. Eigenfunctions u

0.12 0.1 0.08 0.06 0.04 0.02 0 −0.02 −0.04 −0.06 −0.08

0

50

100

150

200

250

300

350

400

b 3 (solid line) and v b3 (dashed line). Fig. 5. Eigenfunctions u

0.1 0.08 0.06 0.04 0.02 0 −0.02 −0.04 −0.06 −0.08 −0.1

0

50

100

150

200

250

300

350

400

b 4 (solid line) and v b4 (dashed line). Fig. 6. Eigenfunctions u

decreases relatively slow and is more similar to a constant function. That means that “weights” assigned to “daily” values in b1 ; X i 〉 and 〈v b1 ; Y i 〉 will attain values close to the corresponding one calendar year are more equal. It means that the values of 〈v b1 has a large ability to detect difference in covariance structure of Milan annual averages. This is a reason why the function v v b k and v bk do not differ substantially. and Padua series (Tb ð1Þ ¼ −3:24.) Figs. 4–6 show that for k ¼ 2; 3; 4 the eigenfunctions u The same is true for k 4 4.

D. Jarušková / Journal of Statistical Planning and Inference 143 (2013) 1500–1511

1511

Acknowledgments The work of the author was supported by Grant GAČR 201/09/0755. I would like to thank Ivana Pultarová from Czech Technical University and Pascal Sarda from Université Paul Sabatier for help. References Aue, A., Hörmann, S., Horváth, L., Reimherr, M., 2009. Break detection in the covariance structure of multivariate time series models. Annals of Statistics 37, 4046–4087. Aue, A., Gabrys, R., Horváth, L., Kokoszka, P., 2009. Estimation of change-point in the mean function of functional data. Journal of Multivariate Analysis 100, 2254–2269. Benko, M., Härdle, W., Kneip, A., 2009. Common functional principal components. Annals of Statistics 37, 1–34. Bosq, D., 2000. Linear Processes in Functional Spaces. Springer, New York. Camuffo, D., Jones, P., 2002. Improved understanding of past climatic variability from early daily European instrumental sources. Climatic Change 53, 1–3. Dauxois, J., Pousse, A., Romain, Y., 1982. Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference. Journal of Multivariate Analysis 12, 136–154. Dehling, H., Philipp, W., 1982. Almost sure invariance principles for weakly dependent vector-values random variables. Annals of Probability 10, 689–701. Ferraty, F. (Ed.), 2011. Recent Advances in Functional Data Analysis and Related Topics. Physica-Verlag, Berlin-Heidelberg. Fremdt, S., Steinebach, J., Horváth, L., Kokoszka, P., 2013. Testing the equality of covariance operators in functional samples. Scandinavian Journal of Statistics 40, 138–152. Galeano, P., Peña, D., 2009. Covariance changes detection in multivariate time series. Journal of Statistical Planning and Inference 137, 194–211. Hall, P., Hosseini-Nasab, M., 2006. On properties of functional principal components analysis. Journal of the Royal Statistical Society: Series B 68, 109–126. Hall, P., Hosseini-Nasab, M., 2009. Theory for high-order bounds in functional principal components analysis. Mathematical Proceedings of the Cambridge Philosophical Society 146, 225–256. Horváth, L., Kokoszka, P., 2012. Inference for Functional Data with Applications. Springer, Heidelberg. Horváth, L., Kokoszka, P., Steinebach, J., 1999. Testing for changes in multivariate dependent observations with an application to temperature changes. Journal of Multivariate Analysis 68, 96–119. Horváth, L., Hušková, M., Kokoszka, P., 2010. Testing the stability of the functional autoregressive process. Journal of Multivariate Analysis 101, 352–367. Panaretos, V.M., Kraus, D., Maddocks, J.H., 2010. Second-order comparison of Gaussian random functions and the geometry of DNA minicircles. Journal of the Acoustical Society of America 105, 670–682. Wied, D., Arnold, M., Bissantz, N., Ziggel, D., 2012. A new fluctuation test for constant variances with applications to finance. Metrika 75, 1111–1127. Wied, D., Krämer, W., Dehling, H., 2012. Testing for a change in correlation at unknown point in time using an extended functional delta method. Econometric Theory 28, 570–589.