arXiv:2010.04327 [cs.LG]

Bias and Variance of Post-processing in Differential Privacy

Keyu Zhu, Pascal Van Hentenryck, Ferdinando Fioretto

Published 2020-10-09, Version 1

Post-processing immunity is a fundamental property of differential privacy: it enables the application of arbitrary data-independent transformations to differentially private outputs without affecting their privacy guarantees. When query outputs must satisfy domain constraints, post-processing can be used to project the privacy-preserving outputs onto the feasible region. Moreover, when the feasible region is convex, a widely adopted class of post-processing steps is also guaranteed to improve accuracy. Post-processing has been applied successfully in many applications, including census data release, energy systems, and mobility. However, its effects on the noise distribution are poorly understood: it is often argued that post-processing may introduce bias and increase variance. This paper takes a first step towards understanding the properties of post-processing. It considers the release of census data and examines, both theoretically and empirically, the behavior of a widely adopted class of post-processing functions.
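As a rough illustration of the trade-off described in the abstract, the following minimal sketch releases a single count with the Laplace mechanism and then projects the noisy value onto the convex feasible region of non-negative numbers. It is not the paper's method: the function names, the choice of a single counting query with sensitivity 1, and the parameter values are all assumptions made for illustration. The sketch estimates the bias and mean absolute error before and after projection.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_count(true_count, epsilon, rng):
    # Laplace mechanism for a counting query (assumed sensitivity 1).
    return true_count + laplace_noise(1.0 / epsilon, rng)


def project_nonnegative(noisy_value):
    # Euclidean projection onto the convex feasible region {x >= 0}.
    # It depends only on the noisy output, so it is data-independent.
    return max(0.0, noisy_value)


def simulate(true_count, epsilon, trials, seed=0):
    # Hypothetical helper: compares raw and post-processed releases.
    rng = random.Random(seed)
    raw = [release_count(true_count, epsilon, rng) for _ in range(trials)]
    post = [project_nonnegative(v) for v in raw]

    def mean(values):
        return sum(values) / len(values)

    return {
        "raw_bias": mean(raw) - true_count,
        "post_bias": mean(post) - true_count,
        "raw_mae": mean([abs(v - true_count) for v in raw]),
        "post_mae": mean([abs(v - true_count) for v in post]),
    }


stats = simulate(true_count=0, epsilon=1.0, trials=20000)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the projection never increases the absolute error, because the true count lies in the feasible region, but it shifts the mean of the released value upward when the true count is at the boundary. For Laplace noise with scale b and a true count of 0, the expected projected value is b/2, so the post-processed release is biased even though the raw release is not.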

Categories: cs.LG, cs.AI, cs.CR

Keywords: differential privacy, post-processing, arbitrary data-independent transformations, satisfy domain constraints, adopted class

Related articles:

arXiv:2007.00914 [cs.LG] (Published 2020-07-02)

Federated Learning and Differential Privacy: Software tools analysis, the Sherpa.ai FL framework and methodological guidelines for preserving data privacy

Nuria Rodríguez-Barroso et al.

arXiv:2206.07737 [cs.LG] (Published 2022-06-15)

Disparate Impact in Differential Privacy from Gradient Misalignment

Maria S. Esipova, Atiyeh Ashari Ghomi, Yaqiao Luo, Jesse C. Cresswell

arXiv:2109.11429 [cs.LG] (Published 2021-09-23)

Robin Hood and Matthew Effects -- Differential Privacy Has Disparate Impact on Synthetic Data

Georgi Ganev, Bristena Oprisanu, Emiliano De Cristofaro