A Security Overview of ML Systems

Maxence Prevost, Oct 2, 2018

(Header image: Optimism, from mn3m.info)

Within a few scrolls we'll go through adversarial examples, model theft, dataset poisoning and dataset protection.

Adversarial Examples

Adversarial examples are inputs "designed to cause the model to make a mistake" (OpenAI, Attacking Machine Learning with Adversarial Examples). We can see two big categories of defences:

Reactive: the objective is to detect an adversarial example before it is passed to our model for inference.
Proactive: the objective is to make the models themselves more resilient to this kind of attack.

Model Theft

The dataset and/or the model might be confidential because of their sensitive or commercial value. The tension between model confidentiality and public access motivates the investigation of model extraction attacks.

A common approach (described in Practical Black-Box Attacks against Machine Learning, Papernot et al.) is to train a local substitute model on inputs labelled by querying the target model (the oracle). Then, using gradient-based techniques, adversarial examples can be generated against the substitute, and they often transfer back to the original model. There is no need for a labelled dataset, which can be expensive to produce.

Here is pseudo-code describing the Jacobian-based dataset augmentation (full code available on github):

def jacobian_augmentation(dataset):
    """
    - get_label: API call on the remote oracle, returns the oracle's label
    - alpha: step size of the perturbation
    - jacobian: returns the Jacobian of the substitute model, evaluated at
      `sample`, for the class `label`
    """
    jacobian_dataset = []
    for sample in dataset:
        # Label the sample by querying the remote oracle.
        label = get_label(sample)
        # Push the sample a small step in the direction along which the
        # substitute's output for that label changes the most.
        jacobian_sample = sample + alpha * sign(jacobian(substitute_model, sample, label))
        jacobian_dataset.append(jacobian_sample)
    return jacobian_dataset

Basically, each example is augmented by adding a small perturbation in the direction of the substitute model's gradient. The authors emphasize that:

[…] this technique is not designed to maximize the substitute DNN's accuracy but rather ensure that it approximates the oracle's decision boundaries with few label queries.

The choice of architecture for the substitute isn't very important, since we can assume some details about the task beforehand.

Dataset Poisoning

Dataset poisoning attacks aim at manipulating a model's behaviour at test time by tampering with its training data. Poisoning 3% of a training set was enough to drop the test accuracy by 11% (Certified Defenses for Data Poisoning Attacks, Steinhardt et al., 2017, Figure 1-b).

The Poison Frogs! paper (Targeted Clean-Label Poisoning Attacks on Neural Networks) describes a targeted, clean-label attack: an attacker first chooses a target instance from the test set; a successful poisoning attack causes this target example to be misclassified during test time. Next, the attacker samples a base instance from the base class and makes imperceptible changes to it to craft a poison instance; this poison is injected into the training data with the intent of fooling the model into labeling the target instance with the base label at test time. If during test time the model mistakes the target instance as being in the base class, then the poisoning attack is considered successful.

Dataset Protection

One way to protect your data is to hand it to a remote machine only in (fully) homomorphically encrypted form, so the machine can compute on it without ever seeing the plaintext. If this machine is malicious, it can just give you wrong results, but it can't exploit your data… Unless… If we're talking about an FHE-encrypted machine learning model being trained to predict something, nothing guarantees you that the model is empty at first, and your opponent can still run inferences on the young model (by observing decision boundaries and such). You should check out CryptoDL.

Dataset Theft

It is also possible to recover information about the data used at training simply by looking at the model's output. From Membership Inference Attacks Against Machine Learning Models:

given a data record and black-box access to a model, determine if the record was in the model's training dataset. To perform membership inference against a target model, we make adversarial use of machine learning and train our own inference model to recognize differences in the target model's predictions on the inputs that it trained on versus the inputs that it did not train on.
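To make this shadow-model idea concrete, here is a minimal sketch in Python with scikit-learn. It is not the paper's code: the paper trains many shadow models and one attack model per class, whereas this collapses everything into a single shadow model and a single attack model, and all names (shadow_model, attack_model and so on) are illustrative.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Data drawn from (roughly) the same distribution as the target's training data.
X, y = make_classification(n_samples=6000, n_features=20, n_classes=2, random_state=0)

# Pretend the first half was used to train the (black-box) target model.
X_target_in, X_rest, y_target_in, y_rest = train_test_split(X, y, train_size=0.5, random_state=0)
target_model = RandomForestClassifier(random_state=0).fit(X_target_in, y_target_in)

# 1. Train a shadow model on data we control, so we know exactly which
#    records are "in" (members) and which are "out" (non-members).
X_shadow_in, X_shadow_out, y_shadow_in, y_shadow_out = train_test_split(
    X_rest, y_rest, train_size=0.5, random_state=1)
shadow_model = RandomForestClassifier(random_state=1).fit(X_shadow_in, y_shadow_in)

# 2. Build the attack training set: the shadow model's prediction vectors,
#    labelled 1 for members and 0 for non-members.
attack_X = np.vstack([shadow_model.predict_proba(X_shadow_in),
                      shadow_model.predict_proba(X_shadow_out)])
attack_y = np.concatenate([np.ones(len(X_shadow_in)), np.zeros(len(X_shadow_out))])

# 3. The attack model learns to tell member predictions from non-member ones
#    (members tend to get more confident, lower-entropy outputs).
attack_model = LogisticRegression(max_iter=1000).fit(attack_X, attack_y)

# 4. Membership inference against the target: feed the target's prediction
#    vector for a candidate record to the attack model.
candidate = X_target_in[:1]  # a record that *was* in the target's training set
pred_vector = target_model.predict_proba(candidate)
print("P(member) =", attack_model.predict_proba(pred_vector)[0, 1])

The attack works because overfitted models behave measurably differently on their training members than on unseen records.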
An implementation can be found here.

Originally published at data-soup.gitlab.io on October 2, 2018.
