Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Xianyang Zhang
Joint work with Guang Cheng
University of Missouri-Columbia
LDHD: Transition Workshop, 2014

Xianyang Zhang (Mizzou), Bootstrapping high dimensional vector, LDHD 2014

Overview

Let $x_1, x_2, \dots, x_n$ be a sequence of mean-zero dependent random vectors in $\mathbb{R}^p$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})'$ with $1 \le i \le n$.

We provide a general (non-asymptotic) theory for quantifying

$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right|,$$

where $T_X = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_{ij}$ and $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_{ij}$, with $y_i = (y_{i1}, y_{i2}, \dots, y_{ip})'$ being a Gaussian vector.

Key techniques: Slepian interpolation and the leave-one-block-out argument (a modification of Stein's leave-one-out method).

Two examples on inference for high dimensional time series.

Outline

1. Inference for high dimensional time series
   Uniform confidence band for the mean
   Specification testing on the covariance structure
2. Gaussian approximation for maxima of non-Gaussian sum
   M-dependent time series
   Weakly dependent time series
3. Bootstrap
   Blockwise multiplier bootstrap
   Non-overlapping block bootstrap

Example I: Uniform confidence band

Consider a $p$-dimensional weakly dependent time series $\{x_i\}$.

Goal: construct a uniform confidence band for $\mu_0 = E x_i \in \mathbb{R}^p$ based on the observations $\{x_i\}_{i=1}^{n}$ with $n \ll p$.

Consider the $(1 - \alpha)$ confidence band:

$$\left\{ \mu = (\mu_1, \dots, \mu_p)' \in \mathbb{R}^p : \sqrt{n} \max_{1 \le j \le p} |\mu_j - \bar{x}_j| \le c(\alpha) \right\},$$

where $\bar{x} = (\bar{x}_1, \dots, \bar{x}_p)' = \sum_{i=1}^{n} x_i / n$ is the sample mean.

Question: how to obtain the critical value $c(\alpha)$?

Blockwise Multiplier Bootstrap

Capture the dependence within and between the data vectors.

Suppose $n = b_n l_n$ with $b_n, l_n \in \mathbb{Z}$.
Define the block sum

$$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (x_{lj} - \bar{x}_j), \quad i = 1, 2, \dots, l_n.$$

When $p = O(\exp(n^b))$, $b_n = O(n^{b'})$ with $4b' + 7b < 1$ and $b' > 2b$.

Define the bootstrap statistic

$$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} A_{ij} e_i,$$

where $\{e_i\}$ is a sequence of i.i.d. $N(0, 1)$ random variables that are independent of $\{x_i\}$.

Compute $c(\alpha) := \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}$.

Some numerical results

Consider a $p$-dimensional VAR(1) process,

$$x_t = \rho x_{t-1} + \sqrt{1 - \rho^2}\, \epsilon_t.$$

1. $\epsilon_{tj} = (\varepsilon_{tj} + \varepsilon_{t0}) / \sqrt{2}$, where $(\varepsilon_{t0}, \varepsilon_{t1}, \dots, \varepsilon_{tp}) \sim$ i.i.d. $N(0, I_{p+1})$;
2. $\epsilon_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^{p}$ are generated independently from $U(2, 3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0, 1)$ random variables;
3. $\epsilon_{tj}$ is generated from the moving average model above with $\{\zeta_{tj}\}$ being i.i.d. centralized Gamma(4, 1) random variables.

Some numerical results (Con't)

Table: Coverage probabilities of the uniform confidence band, where $n = 120$. Columns are grouped by the error models 1 to 3 above, each with $p = 500$ and nominal levels 95% and 99%.

| $\rho$ | $b_n$ | p = 500, 1: 95% | p = 500, 1: 99% | p = 500, 2: 95% | p = 500, 2: 99% | p = 500, 3: 95% | p = 500, 3: 99% |
|---|---|---|---|---|---|---|---|
| 0.3 | 4 | 89.7 | 97.2 | 90.5 | 97.5 | 90.1 | 97.1 |
| 0.3 | 6 | 92.5 | 98.3 | 91.6 | 97.8 | 91.6 | 97.7 |
| 0.3 | 8 | 94.6 | 99.0 | 91.5 | 97.6 | 92.4 | 97.9 |
| 0.3 | 10 | 95.0 | 99.2 | 91.8 | 97.8 | 91.6 | 97.7 |
| 0.3 | 12 | 94.8 | 99.3 | 91.3 | 97.9 | 92.0 | 97.5 |
| 0.5 | 4 | 76.9 | 92.9 | 83.5 | 94.0 | 83.3 | 93.7 |
| 0.5 | 6 | 87.1 | 96.3 | 87.3 | 96.2 | 87.4 | 95.9 |
| 0.5 | 8 | 91.6 | 98.3 | 88.8 | 96.6 | 89.4 | 96.9 |
| 0.5 | 10 | 92.5 | 98.6 | 89.8 | 97.1 | 89.3 | 97.0 |
| 0.5 | 12 | 93.0 | 99.0 | 90.0 | 97.2 | 90.5 | 97.0 |

Example II: Specification testing on the covariance structure

For a mean-zero $p$-dimensional time series $\{x_i\}$, define $\Gamma(h) = E x_{i+h} x_i' \in \mathbb{R}^{p \times p}$.

Consider

$$H_0 : \Gamma(h) = \tilde{\Gamma}(h) \quad \text{versus} \quad H_a : \Gamma(h) \ne \tilde{\Gamma}(h),$$

for some $h \in \Lambda \subseteq \{0, 1, 2, \dots\}$.

Special cases:

1. $\Lambda = \{0\}$: testing the covariance structure. See Cai and Jiang (2011), Chen et al. (2010), Li and Chen (2012) and Qiu and Chen (2012) for some developments when $\{x_i\}$ are i.i.d.
2. $\Lambda = \{1, 2, \dots, H\}$ and $\tilde{\Gamma}(h) = 0$ for $h \in \Lambda$: white noise testing.

Testing for white noise

Consider the white noise testing problem. Our test is given by

$$T = \sqrt{n} \max_{1 \le h \le H} \max_{1 \le j, k \le p} |\hat{\gamma}_{jk}(h)|,$$

where $\hat{\Gamma}(h) = \sum_{i=1}^{n-h} x_{i+h} x_i' / n = (\hat{\gamma}_{jk}(h))_{j,k=1}^{p}$.

Let $z_i = (z_{i,1}, \dots, z_{i,p^2 H}) = (\mathrm{vec}(x_{i+1} x_i')', \dots, \mathrm{vec}(x_{i+H} x_i')')' \in \mathbb{R}^{p^2 H}$ for $i = 1, \dots, N := n - H$.

Suppose $N = b_n l_n$ for $b_n, l_n \in \mathbb{Z}$. Define

$$T_A = \max_{1 \le j \le p^2 H} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} A_{ij} e_i, \qquad A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} (z_{l,j} - \bar{z}_j),$$

where $\{e_i\}$ is a sequence of i.i.d. $N(0, 1)$ random variables that are independent of $\{x_i\}$, and $\bar{z}_j = \sum_{i=1}^{N} z_{i,j} / n$.

Compute $c(\alpha) := \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}$, and reject the white noise null hypothesis if $T > c(\alpha)$.

Some numerical results

We are interested in testing

$$H_0 : \Gamma(h) = 0 \text{ for } 1 \le h \le L \quad \text{versus} \quad H_a : \Gamma(h) \ne 0 \text{ for some } 1 \le h \le L.$$

Consider the following data generating processes:

1. multivariate normal: $x_{tj} = \rho_1 \zeta_{tj} + \rho_2 \zeta_{t(j+1)} + \cdots + \rho_p \zeta_{t(j+p-1)}$, where $\{\rho_j\}_{j=1}^{p}$ are generated independently from $U(2, 3)$, and $\{\zeta_{tj}\}$ are i.i.d. $N(0, 1)$ random variables;
2. multivariate ARCH model: $x_t = \Sigma_t^{1/2} \epsilon_t$ with $\epsilon_t \sim N(0, I_p)$ and $\Sigma_t = 0.1 I_p + 0.9\, x_{t-1} x_{t-1}'$, where $\Sigma_t^{1/2}$ is a lower triangular matrix based on the Cholesky decomposition of $\Sigma_t$;
3. VAR(1) model: $x_t = \rho x_{t-1} + \sqrt{1 - \rho^2}\, \epsilon_t$, where $\rho = 0.2$ and the errors $\{\epsilon_t\}$ are generated according to 1.

Some numerical results (Con't)

Table: Rejection percentages for testing the uncorrelatedness, where $n = 240$ and the actual number of parameters is $p^2 \times L$.
Columns are grouped by the data generating processes 1 to 3 above, each with $p = 20$ and nominal levels 5% and 1%.

| $L$ | $b_n$ | p = 20, 1: 5% | p = 20, 1: 1% | p = 20, 2: 5% | p = 20, 2: 1% | p = 20, 3: 5% | p = 20, 3: 1% |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 4.3 | 0.8 | 2.8 | 0.3 | 90.3 | 71.9 |
| 1 | 4 | 5.0 | 1.0 | 1.0 | 0.3 | 86.3 | 63.3 |
| 1 | 8 | 5.3 | 1.2 | 1.6 | 0.9 | 86.0 | 59.2 |
| 1 | 12 | 5.1 | 1.0 | 2.3 | 1.4 | 86.5 | 59.2 |
| 3 | 1 | 4.7 | 1.0 | 2.3 | 0.3 | 79.4 | 57.7 |
| 3 | 4 | 3.6 | 0.7 | 0.6 | 0.3 | 74.0 | 46.2 |
| 3 | 8 | 3.7 | 0.4 | 1.3 | 0.8 | 71.4 | 41.0 |
| 3 | 12 | 4.0 | 0.6 | 2.2 | 1.3 | 72.1 | 40.6 |

Maxima of non-Gaussian sum

The above applications hinge on a general theoretical result.

Let $x_1, x_2, \dots, x_n$ be a sequence of mean-zero dependent random vectors in $\mathbb{R}^p$, where $x_i = (x_{i1}, x_{i2}, \dots, x_{ip})'$ with $1 \le i \le n$.

Target: approximate the distribution of

$$T_X = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_{ij}.$$

Gaussian approximation

Let $y_1, y_2, \dots, y_n$ be a sequence of mean-zero Gaussian random vectors in $\mathbb{R}^p$, where $y_i = (y_{i1}, y_{i2}, \dots, y_{ip})'$ with $1 \le i \le n$.

Suppose that $\{y_i\}$ preserves the autocovariance structure of $\{x_i\}$, i.e., $\mathrm{cov}(y_i, y_j) = \mathrm{cov}(x_i, x_j)$.

Goal: quantify the Kolmogorov distance

$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right|,$$

where $T_Y = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_{ij}$.

Existing results in the independent case

Question: how large can $p$ be in relation to $n$ so that $\rho_n \to 0$?

Bentkus (2003): $\rho_n \to 0$ provided that $p^{7/2} = o(n)$.

Chernozhukov et al. (2013): $\rho_n \to 0$ if $p = O(\exp(n^b))$ with $b < 1/7$ (an astounding improvement).

Motivation: study the interplay between the dependence structure and the growth rate of $p$ so that $\rho_n \to 0$.

Dependence Structure I: M-dependent time series

A time series $\{x_i\}$ is called $M$-dependent if for $|i - j| > M$, $x_i$ and $x_j$ are independent.

Under suitable restrictions on the tail of $x_i$ and weak dependence assumptions uniformly across the components of $x_i$, we show that

$$\rho_n \lesssim \frac{M^{1/2} (\log(pn/\gamma) \vee 1)^{7/8}}{n^{1/8}} + \gamma,$$

for some $\gamma \in (0, 1)$.

When $p = O(\exp(n^b))$ for $b < 1/11$, and $M = O(n^{b'})$ with $4b' + 7b < 1$, we have $\rho_n \le C n^{-c}$, $c, C > 0$.

If $b' = 0$ (i.e., $M = O(1)$), our result allows $b < 1/7$ [Chernozhukov et al. (2013)].

Dependence Structure II: Physical dependence measure [Wu (2005)]

The sequence $\{x_i\}$ has the following causal representation,

$$x_i = G(\dots, \epsilon_{i-1}, \epsilon_i),$$

where $G$ is a measurable function and $\{\epsilon_i\}$ is a sequence of i.i.d. random variables.

Let $\{\epsilon_i'\}$ be an i.i.d. copy of $\{\epsilon_i\}$ and define

$$x_i^* = G(\dots, \epsilon_{-1}, \epsilon_0', \epsilon_1, \dots, \epsilon_i).$$

The strength of the dependence can be quantified via

$$\theta_{i,j,q}(x) = (E|x_{ij} - x_{ij}^*|^q)^{1/q}, \qquad \Theta_{i,j,q}(x) = \sum_{l=i}^{+\infty} \theta_{l,j,q}(x).$$

Bound on the Kolmogorov distance

Theorem. Under suitable conditions on the tail of $\{x_i\}$ and certain weak dependence assumptions, we have

$$\rho_n \lesssim n^{-1/8} M^{1/2} l_n^{7/8} + \left( n^{1/8} M^{-1/2} l_n^{-3/8} \right)^{\frac{q}{1+q}} \left( \sum_{j=1}^{p} \Theta_{M,j,q}^{q} \right)^{\frac{1}{1+q}} + \gamma,$$

where $\Theta_{i,j,q} = \Theta_{i,j,q}(x) \vee \Theta_{i,j,q}(y)$. (The slide does not define $l_n$ here; matching the first term with the $M$-dependent bound suggests it presumably stands for $\log(pn/\gamma) \vee 1$ rather than the number of blocks.)

The tradeoff between the first two terms reflects the interaction between the dimensionality and dependence.

Key step in the proof: $M$-dependent approximation.

Bound on the Kolmogorov distance (Con't)

Corollary. Suppose that

1. $\max_{1 \le j \le p} \Theta_{M,j,q} = O(\rho^M)$ for $\rho < 1$ and $q \ge 2$;
2. $p = O(\exp(n^b))$ for $0 < b < 1/11$.

Then we have $\rho_n \le C n^{-c}$, $c, C > 0$.

Dimension free dependence structure

Question: is there any so-called "dimension free dependence structure"? What kind of dependence assumption will not affect the increase rate of $p$?

For a permutation $\pi(\cdot)$, $(x_{i\pi(1)}, \dots, x_{i\pi(p)}) = (z_{i1}, z_{i2})$.

Suppose $\{z_{i1}\}$ is an $s$-dimensional time series and $\{z_{i2}\}$ is a $(p - s)$-dimensional sequence of independent variables.
Assume that $\{z_{i1}\}$ and $\{z_{i2}\}$ are independent, and $s/p \to 0$.

Under suitable assumptions, it can be shown that for $p = O(\exp(n^b))$ with $b < 1/7$, $\rho_n \le C n^{-c}$, $c, C > 0$.

Resampling

Summary: for $M$-dependent or more generally weakly dependent time series, we have shown that

$$\rho_n := \sup_{t \in \mathbb{R}} \left| P(T_X \le t) - P(T_Y \le t) \right| \le C n^{-c}, \quad c, C > 0.$$

Question: in practice the autocovariance structure of $\{x_i\}$ is typically unknown. How can we approximate the distribution of $T_X$ or $T_Y$?

Solution: resampling method.

Blockwise multiplier bootstrap

1. Suppose $n = b_n l_n$. Compute the block sum

   $$A_{ij} = \sum_{l=(i-1)b_n+1}^{i b_n} x_{lj}, \quad i = 1, 2, \dots, l_n.$$

2. Generate a sequence of i.i.d. $N(0, 1)$ random variables $\{e_i\}$ and compute

   $$T_A = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} A_{ij} e_i.$$

3. Repeat step 2 several times and compute the $\alpha$-quantile of $T_A$,

   $$c_{T_A}(\alpha) = \inf\{t \in \mathbb{R} : P(T_A \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}.$$

Validity of the blockwise multiplier bootstrap

Theorem. Under suitable assumptions, we have for $p = O(\exp(n^b))$ with $0 < b < 1/15$,

$$\sup_{\alpha \in (0,1)} \left| P(T_X \le c_{T_A}(\alpha)) - \alpha \right| \lesssim n^{-c}, \quad c > 0.$$

Non-overlapping block bootstrap

1. Let $A_{1j}^*, \dots, A_{l_n j}^*$ be an i.i.d. draw from the empirical distribution of $\{A_{ij}\}_{i=1}^{l_n}$ and compute

   $$T_{A^*} = \max_{1 \le j \le p} \frac{1}{\sqrt{n}} \sum_{i=1}^{l_n} (A_{ij}^* - \bar{A}_j), \qquad \bar{A}_j = \sum_{i=1}^{l_n} A_{ij} / l_n.$$

2. Repeat the above step several times to obtain the $\alpha$-quantile of $T_{A^*}$,

   $$c_{T_{A^*}}(\alpha) = \inf\{t \in \mathbb{R} : P(T_{A^*} \le t \mid \{x_i\}_{i=1}^{n}) \ge \alpha\}.$$

Theorem. Under suitable assumptions, we have with probability $1 - o(1)$,

$$\sup_{\alpha \in (0,1)} \left| P(T_X \le c_{T_{A^*}}(\alpha) \mid c_{T_{A^*}}(\alpha)) - \alpha \right| = o(1).$$

Future works

1. Choice of the block size in the blockwise multiplier bootstrap and non-overlapping block bootstrap;
2. Maximum eigenvalue of a sum of random matrices: a natural step going from vectors to matrices.

Thank you!
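
Supplementary sketch: blockwise multiplier bootstrap

The three steps of the blockwise multiplier bootstrap described in the talk can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `block_sums` and `multiplier_bootstrap_quantile`, the `center` flag, the number of bootstrap draws `n_boot`, and the seed are all assumptions made for the example, and the conditional quantile is approximated by Monte Carlo. Only the Python standard library is used.

```python
import math
import random


def block_sums(x, block_size, center=True):
    """Return the l_n x p list of block sums A_ij.

    x is a list of n rows, each a list of p floats, and n must be a
    multiple of block_size. With center=True every column is centered by
    its sample mean first (the confidence band version); with
    center=False the raw block sums are used (the mean-zero version).
    """
    n, p = len(x), len(x[0])
    if n % block_size != 0:
        raise ValueError("n must be a multiple of block_size")
    means = [sum(row[j] for row in x) / n if center else 0.0 for j in range(p)]
    sums = []
    for start in range(0, n, block_size):
        block = x[start:start + block_size]
        sums.append([sum(row[j] - means[j] for row in block) for j in range(p)])
    return sums


def multiplier_bootstrap_quantile(x, block_size, alpha, n_boot=500,
                                  center=True, seed=0):
    """Monte Carlo approximation of inf{t : P(T_A <= t | x) >= alpha}."""
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    n, p = len(x), len(x[0])
    a = block_sums(x, block_size, center=center)
    draws = []
    for _ in range(n_boot):
        # One N(0, 1) multiplier per block, shared by all p coordinates.
        e = [rng.gauss(0.0, 1.0) for _ in a]
        t_a = max(
            sum(a_i[j] * e_i for a_i, e_i in zip(a, e)) / math.sqrt(n)
            for j in range(p)
        )
        draws.append(t_a)
    draws.sort()
    # Smallest simulated value whose empirical CDF reaches alpha.
    k = max(math.ceil(alpha * n_boot), 1)
    return draws[k - 1]
```

Sharing a single multiplier across all coordinates of a block is what lets the statistic retain the dependence between components, while summing within a block retains the serial dependence up to the block length. How to choose `block_size` is left open here, as it is in the talk.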

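Supplementary sketch: non-overlapping block bootstrap

The non-overlapping block bootstrap from the talk can be sketched the same way. Again this is an illustration under stated assumptions, not the authors' code: the name `block_bootstrap_quantile`, `n_boot`, and the seed are hypothetical, the quantile is a Monte Carlo approximation, and whole block-sum vectors are resampled jointly across the $p$ coordinates, which is one reading of the componentwise notation on the slide.

```python
import math
import random


def block_bootstrap_quantile(x, block_size, alpha, n_boot=500, seed=0):
    """Monte Carlo approximation of inf{t : P(T_A* <= t | x) >= alpha}.

    x is a list of n rows, each a list of p floats, and n must be a
    multiple of block_size.
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    n, p = len(x), len(x[0])
    if n % block_size != 0:
        raise ValueError("n must be a multiple of block_size")
    # Uncentered block sums A_i = (A_i1, ..., A_ip), one vector per block.
    blocks = [
        [sum(row[j] for row in x[start:start + block_size]) for j in range(p)]
        for start in range(0, n, block_size)
    ]
    l_n = len(blocks)
    a_bar = [sum(b[j] for b in blocks) / l_n for j in range(p)]
    draws = []
    for _ in range(n_boot):
        # Assumption: draw l_n whole blocks with replacement, so the
        # cross-sectional dependence inside each block vector is kept.
        resampled = [rng.choice(blocks) for _ in range(l_n)]
        t_star = max(
            sum(b[j] - a_bar[j] for b in resampled) / math.sqrt(n)
            for j in range(p)
        )
        draws.append(t_star)
    draws.sort()
    # Smallest simulated value whose empirical CDF reaches alpha.
    k = max(math.ceil(alpha * n_boot), 1)
    return draws[k - 1]
```

Compared with the multiplier version, the randomness here comes from which blocks are drawn rather than from Gaussian weights, so no distributional choice for the multipliers is needed.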