{ "id": "1911.03030", "version": "v1", "published": "2019-11-08T03:57:41.000Z", "updated": "2019-11-08T03:57:41.000Z", "title": "Certified Data Removal from Machine Learning Models", "authors": [ "Chuan Guo", "Tom Goldstein", "Awni Hannun", "Laurens van der Maaten" ], "comment": "Submitted to AISTATS 2020", "categories": [ "cs.LG", "stat.ML" ], "abstract": "Good data stewardship requires removal of data at the request of the data's owner. This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to \"remove\" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.", "revisions": [ { "version": "v1", "updated": "2019-11-08T03:57:41.000Z" } ], "analyses": { "keywords": [ "machine learning models", "certified data removal", "machine-learning model", "data's owner", "implicitly stores information" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }