PPML: Machine Learning on Data you cannot see
Repository for the tutorial on Privacy-Preserving Machine Learning (PPML) presented at PyConDE 2022
Abstract
Privacy guarantees are among the most crucial requirements when analysing sensitive information. However, data anonymisation techniques alone do not always provide complete privacy protection; moreover, Machine Learning (ML) models can themselves be exploited to leak sensitive data when attacked if no countermeasure is in place.
Privacy-preserving machine learning (PPML) methods promise to overcome these issues, making it possible to train machine learning models with full privacy guarantees.
This workshop is organised in two main parts. In the first part, we will explore one example of ML model exploitation (i.e. an inference attack) to reconstruct original data from a trained model, and we will then see how differential privacy can help us protect the privacy of our model, with minimal disruption to the original pipeline. In the second part of the workshop, we will examine a more complicated ML scenario: training deep learning networks on encrypted data with specialised distributed federated learning strategies.
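To give a flavour of how differential privacy can slot into an existing training pipeline, here is a minimal, dependency-free sketch of the gradient step used by DP-SGD-style training: each per-sample gradient is clipped to a maximum norm, the clipped gradients are summed, and Gaussian noise scaled to that norm is added before averaging. The function names, the toy gradients, and the values of `max_norm` and `noise_multiplier` are illustrative assumptions, not code taken from the workshop notebooks.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale `grad` down so that its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, max_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping threshold.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Toy per-sample gradients (made-up numbers, two model parameters).
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
rng = random.Random(0)
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
print(update)
```

Clipping bounds how much any single training example can influence the update, and the noise masks what remains; the rest of the training loop stays unchanged, which is why the disruption to the original pipeline is small.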
Outline
- Introduction: Brief Intro to PPML and to the workshop (slides)
- Part 1: Strengthening Deep Neural Networks
  - Model vulnerabilities:
    - Adversarial Examples and FGSM (Fast Gradient Sign Method) notebook
    - Model Inference attack notebooks: training | reconstruction
  - Deep Learning with Differential Privacy
    - Model Inference attack with OPACUS notebooks: training | reconstruction
- Part 2: Primer on Privacy-Preserving Machine Learning
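As a quick illustration of the Fast Gradient Sign Method (FGSM) listed in the outline, the sketch below perturbs an input by a small step `epsilon` in the direction of the sign of the loss gradient with respect to that input. It uses a tiny logistic-regression model in plain Python rather than a deep network; the weights, the input, and `epsilon` are hypothetical values chosen only for illustration, and the notebooks in this repository may approach the attack differently.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_gradient(x, y, w, b):
    """Gradient of the loss with respect to the input x, i.e. (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, epsilon):
    """Shift each feature by epsilon in the direction that increases the loss."""
    grad = input_gradient(x, y, w, b)
    return [xi + epsilon * sign(gi) for xi, gi in zip(x, grad)]


w, b = [2.0, -1.0], 0.5  # hypothetical model parameters
x, y = [1.0, 0.5], 1     # hypothetical input with true label 1
x_adv = fgsm(x, y, w, b, epsilon=0.25)
print("loss on original input:   ", loss(x, y, w, b))
print("loss on adversarial input:", loss(x_adv, y, w, b))
```

Each feature moves by at most `epsilon`, yet the loss on the perturbed input is higher than on the original one, which is the effect adversarial examples rely on.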
Note: the material has been updated after the conference to match the flow of the presentation as delivered, and to incorporate feedback received afterwards.
Video recording of the session presented at PyCon DE
Get the material
Clone the current repository to get the course materials. To do so, once connected to your remote machine (via SSH), execute the following instructions:
cd $HOME # This will make sure you'll be in your HOME folder
git clone https://github.com/leriomaggio/ppml-pyconde.git
Note: This will create a new folder named ppml-pyconde. Move into this folder by typing:
cd ppml-pyconde
Well done! You should now be in the right location. Bear with me for a few more seconds and follow the instructions below.
Set up your Environment
To execute the notebooks in this repository, it is necessary to set up the environment.
Please refer to the Get-Ready.ipynb notebook for a step-by-step guide on how to set up the environment and check that everything is working and ready to go.
Note: You can run this notebook directly in VSCode, or in your existing Jupyter notebook/lab environment:
jupyter notebook Get-Ready.ipynb
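If you would like a quick sanity check before opening the notebook, a sketch along the following lines reports which packages cannot be found in the active environment. The package list here is an assumption made for illustration (Opacus is mentioned in the outline and is built on PyTorch); the authoritative requirements and checks are the ones in Get-Ready.ipynb.

```python
import importlib.util


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Assumed package list; see Get-Ready.ipynb for the actual requirements.
required = ["torch", "opacus"]
missing = find_missing(required)
print("All set!" if not missing else f"Missing packages: {', '.join(missing)}")
```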
Colophon
Author: Valerio Maggio (@leriomaggio), Senior Research Associate, University of Bristol.
All the code material is distributed under the terms of the Apache License. See the LICENSE file for additional details.
All the instructional materials in this repository are free to use, and made available under the Creative Commons Attribution license (https://creativecommons.org/licenses/by/4.0/). The following is a human-readable summary of (and not a substitute for) the full legal text of the CC BY 4.0 license.
You are free:
- to Share: copy and redistribute the material in any medium or format
- to Adapt: remix, transform, and build upon the material
for any purpose, even commercially.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution: You must give appropriate credit (mentioning that your work is derived from work that is Copyright © Software Carpentry and, where practical, linking to http://software-carpentry.org/), provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Acknowledgment and funding
The material developed in this tutorial has been supported by the University of Bristol, and by the Software Sustainability Institute (SSI), as part of my SSI fellowship on PETs (Privacy Enhancing Technologies).
Please see this deck to learn more about my fellowship plans.
I would also like to thank all the people at OpenMined for their encouragement and support with the preparation of this tutorial. I hope the material in this repository can help raise awareness of all the amazing work on PETs that is being made available to the Open Source and Python communities.
Contacts
For any questions or doubts, feel free to open an issue in the repository, or drop me an email at valerio.maggio_at_gmail_dot_com