arXiv:1801.08120 [stat.ME]

Optimal Estimation of Simultaneous Signals Using Absolute Inner Product with Applications to Integrative Genomics

T. Tony Cai, Hongzhe Li, Mark G. Low, Rong Ma

Published 2018-01-24, Version 1

Integrating summary statistics from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) data provides a powerful way of identifying genes whose expression levels are causally associated with complex diseases. A parameter that quantifies the genetic sharing (colocalisation) between a disease phenotype and the expression of a given gene is first introduced; it is computed from summary statistics and defined through the mean values of two Gaussian sequences. Specifically, given two independent samples $X\sim N(\theta, I_n)$ and $Y\sim N(\mu, I_n)$, the parameter of interest is $T(\theta, \mu)=n^{-1}\sum_{i=1}^n |\theta_i|\cdot |\mu_i|$, a non-smooth functional that characterizes the degree of shared signals between the two absolute normal mean vectors $|\theta|$ and $|\mu|$. Using approximation theory and Hermite polynomials, a sparse absolute colocalisation estimator (SpACE) is constructed and shown to be minimax rate optimal over sparse parameter spaces. Our simulations demonstrate that the proposed estimates outperform naive methods, resulting in smaller estimation errors. In addition, the methods are robust to the presence of block-wise correlated observations due to linkage disequilibrium. The method is applied to an integrative analysis of heart failure genomics data sets and identifies several genes and biological pathways that are possibly causal to human heart failure.
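
To make the functional $T(\theta, \mu)$ concrete, the following minimal Python sketch evaluates it on a sparse pair of mean vectors and compares it with a naive plug-in estimate that substitutes the observations $X$ and $Y$ for $\theta$ and $\mu$. This is not the SpACE estimator described in the abstract; the function names, the sample size, the sparsity level, and the signal strengths are all assumptions chosen purely for illustration.

```python
import random


def colocalisation(theta, mu):
    """T(theta, mu) = n^{-1} * sum_i |theta_i| * |mu_i|."""
    if len(theta) != len(mu):
        raise ValueError("theta and mu must have the same length")
    return sum(abs(t) * abs(m) for t, m in zip(theta, mu)) / len(theta)


def naive_plugin(x, y):
    """Naive plug-in estimate: apply the functional directly to the data.

    Illustrative baseline only; it is NOT the SpACE estimator.
    """
    return colocalisation(x, y)


def simulate(theta, mu, rng):
    """Draw X ~ N(theta, I_n) and Y ~ N(mu, I_n) independently."""
    x = [rng.gauss(t, 1.0) for t in theta]
    y = [rng.gauss(m, 1.0) for m in mu]
    return x, y


# Hypothetical sparse setting: k shared signals among n coordinates.
rng = random.Random(0)
n, k = 2000, 20
theta = [3.0] * k + [0.0] * (n - k)
mu = [2.0] * k + [0.0] * (n - k)

true_T = colocalisation(theta, mu)   # exact value of the functional
x, y = simulate(theta, mu, rng)
naive_T = naive_plugin(x, y)         # inflated by pure-noise coordinates

print(f"true T = {true_T:.4f}, naive plug-in = {naive_T:.4f}")
```

In this sketch every null coordinate contributes roughly $E|Z_1|\cdot E|Z_2| = 2/\pi$ to the plug-in average (with $Z_1, Z_2$ independent standard normals), so the naive estimate sits far above the true value when most coordinates carry no signal. That bias is one way to see why the non-smooth absolute value calls for a more careful construction under sparsity.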

Related articles:
arXiv:1601.07496 [stat.ME] (Published 2016-01-27)
Optimal Estimation for the Functional Cox Model
arXiv:1605.07244 [stat.ME] (Published 2016-05-24)
Optimal Estimation of Co-heritability in High-dimensional Linear Models
arXiv:2309.09103 [stat.ME] (Published 2023-09-16)
Optimal Estimation under a Semiparametric Density Ratio Model