{
"id": "2102.11253",
"version": "v1",
"published": "2021-02-22T18:36:37.000Z",
"updated": "2021-02-22T18:36:37.000Z",
"title": "Large-scale simultaneous inference under dependence",
"authors": [
"Jinjin Tian",
"Xu Chen",
"Eugene Katsevich",
"Jelle Goeman",
"Aaditya Ramdas"
],
"comment": "40 pages",
"categories": [
"math.ST",
"stat.ME",
"stat.TH"
],
"abstract": "Simultaneous, post-hoc inference is desirable in large-scale hypothesis testing as it allows for exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for the number of true discoveries must be based on closed testing. In this paper we investigate tractable and efficient closed testing with local tests of different properties, such as monotonicity, symmetry and separability, meaning that the test thresholds a monotonic or symmetric function or a function of sums of test scores for the individual hypotheses. This class includes well-known global null tests by Fisher, Stouffer and Ruschendorf, as well as newly proposed ones based on harmonic means and Cauchy combinations. Under monotonicity, we propose a new linear time statistic (\"coma\") that quantifies the cost of multiplicity adjustments. If the tests are also symmetric and separable, we develop several fast (mostly linear-time) algorithms for post-hoc inference, making closed testing tractable. Paired with recent advances in global null tests based on generalized means, our work immediately instantiates a series of simultaneous inference methods that can handle many complex dependence structures and signal compositions. We provide guidance on choosing from these methods via theoretical investigation of the conservativeness and sensitivity for different local tests, as well as simulations that find analogous behavior for local tests and full closed testing. One result of independent interest is the following: if $P_1,\\dots,P_d$ are $p$-values from a multivariate Gaussian with arbitrary covariance, then their arithmetic average $P$ satisfies $Pr(P \\leq t) \\leq t$ for $t \\leq \\frac{1}{2d}$.",
"revisions": [
{
"version": "v1",
"updated": "2021-02-22T18:36:37.000Z"
}
],
"analyses": {
"keywords": [
"large-scale simultaneous inference",
"local tests",
"closed testing",
"well-known global null tests",
"linear time statistic"
],
"note": {
"typesetting": "TeX",
"pages": 40,
"language": "en",
"license": "arXiv",
"status": "editable"
}
}
}