

secure_ml: secure machine learning

This repository implements defenses for machine learning models and attacks against them.

Defense

1. DPSGD

DPSGD (differentially private SGD) is an optimization algorithm that guarantees differential privacy: each example's gradient is clipped and calibrated Gaussian noise is added before the parameter update.

M. Abadi, A. Chu, I. Goodfellow, B. McMahan, I. Mironov, K. Talwar, and L. Zhang. Deep learning with differential privacy. In 23rd ACM Conference on Computer and Communications Security (ACM CCS), pp. 308–318, 2016. https://arxiv.org/abs/1607.00133

    example
            # sigma is the noise scale and c the gradient clipping bound (as in Abadi et al.)
            optimizer = DPSGD(torch_model.parameters(), lr=0.01, sigma=0.3, c=0.15)
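To make the mechanism concrete, here is a minimal sketch of a single DP-SGD step. It illustrates the algorithm from the paper rather than the DPSGD class above; the per-example gradient layout is a simplifying assumption.

        import torch

        def dpsgd_step(params, per_example_grads, lr=0.01, sigma=0.3, c=0.15):
            # per_example_grads: one list of gradient tensors per example,
            # matching params (hypothetical layout, for illustration only)
            batch_size = len(per_example_grads)
            for i, p in enumerate(params):
                accum = torch.zeros_like(p)
                for grads in per_example_grads:
                    g = grads[i]
                    # clip each example's gradient to L2 norm at most c
                    accum += g / max(1.0, (g.norm() / c).item())
                # add Gaussian noise scaled by sigma * c to the clipped sum
                accum += torch.normal(0.0, sigma * c, size=p.shape)
                # average over the batch and take an SGD step
                p.data -= lr * accum / batch_size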

2. Purification

Purification is a defense algorithm against membership inference and model inversion attacks, and it may also work against other types of attacks. The intuition behind purification is that passing the predictions of a machine learning model through an autoencoder reduces the dispersion of the predicted confidence values, which is one of the main causes of these vulnerabilities.
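As a rough illustration of this idea, the following sketch (assumed names, not the library API) passes the target model's confidence vectors through a small autoencoder with a narrow bottleneck:

        import torch
        import torch.nn as nn

        class Purifier(nn.Module):
            # the narrow bottleneck pushes confidence vectors toward a
            # lower-variance manifold, which weakens the membership signal
            def __init__(self, num_classes=10, hidden=4):
                super().__init__()
                self.encoder = nn.Sequential(nn.Linear(num_classes, hidden), nn.ReLU())
                self.decoder = nn.Sequential(nn.Linear(hidden, num_classes), nn.Softmax(dim=1))

            def forward(self, confidences):
                return self.decoder(self.encoder(confidences))

        # usage: purified = purifier(torch.softmax(target_model(x), dim=1))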

results for label 4

                          with purifier    without purifier
    auc                   0.64785575       0.881917625
    std (member)          1.9158936        4.728585
    std (non member)      1.7840259        4.033065

Z. Yang, B. Shao, B. Xuan, E. Chang, F. Zhang, Defending Model Inversion and Membership Inference Attacks via Prediction Purification, 2020, https://arxiv.org/abs/2005.03915

Offense

1. Membership Inference

I implemented membership inference for PyTorch and scikit-learn models. My implementation is mainly based on this paper.

R. Shokri, M. Stronati, C. Song, V. Shmatikov. Membership inference attacks against machine learning models. 2017 IEEE Symposium on Security and Privacy (SP). IEEE, 2017. https://arxiv.org/abs/1610.05820

    example

            shadow_models = [Net().to(device),
                             Net().to(device),
                             Net().to(device),
                             Net().to(device),
                             Net().to(device)]
            shadow_data_size = 2000
            shadow_transform = transform

            num_label = 10
            attack_models = [SVC(probability=True) for i in range(num_label)]

            mi = Membership_Inference(shadow_models, attack_models,
                                      shadow_data_size, shadow_transform)
            # train shadow models
            mi.shadow(X_test, y_test, num_itr=20)
            # train attack model
            mi.attack()

            # preds is the prediction from target (victim) model
            # label is the true label of the data
            pred_in_or_not = mi.predict_proba(preds, label)

I tested my implementation on CIFAR10 and obtained results similar to those reported in the paper. I sampled 2,000 images as training data for the target model, 2,000 images as validation data for the target model, and 4,000 images for the shadow models. The overall AUC of the attack model, which predicts whether a specific example was used to train the target model, is 0.850. The figure shows the relationship between the performance of the target model and that of the attack model: the x axis is the training accuracy of the target model minus its test accuracy, and the y axis is the AUC of the attack model. As the paper observes, overfitting appears to be the main factor behind the success of membership inference.
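For reference, the attack model's training data is built from the shadow models roughly as in the sketch below (a hypothetical helper, not the Membership_Inference API above): each shadow model labels its own training data as members and held-out data as non-members, and one attack model is then trained per true class.

        import numpy as np

        def build_attack_dataset(shadow_models, in_splits, out_splits, predict_proba):
            # in_splits[i] / out_splits[i] are (X, y) used / not used to train shadow i;
            # predict_proba(model, X) should return the model's confidence vectors
            records, membership, true_labels = [], [], []
            for model, (X_in, y_in), (X_out, y_out) in zip(shadow_models, in_splits, out_splits):
                for X, y, member in ((X_in, y_in, 1), (X_out, y_out, 0)):
                    records.append(predict_proba(model, X))
                    membership.append(np.full(len(y), member))
                    true_labels.append(y)
            # the per-class attack models are then fit on (confidence vector -> member or not)
            return np.concatenate(records), np.concatenate(membership), np.concatenate(true_labels)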

2. Model Inversion

Model inversion is a method for extracting training data from the outputs of a model. I implemented this method for PyTorch models and tested it on the AT&T Database of Faces, the dataset used in the paper. The usage of my implementation is as follows.

    example
            mi = Model_inversion(torch_model, input_shape)
            # reconstruct an input that the model assigns to target_label
            x_result, log = mi.attack(target_label, step_size, number_of_iterations)

You can see some results on the AT&T dataset.
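The core of the attack is gradient descent on the input itself, roughly as in this sketch (an illustration of the idea from the paper, not the Model_inversion class):

        import torch

        def invert(model, input_shape, target_label, step_size=0.01, num_iter=1000):
            # start from a blank input and push it toward high confidence
            # for the target label, as in Fredrikson et al.
            x = torch.zeros((1, *input_shape), requires_grad=True)
            for _ in range(num_iter):
                cost = 1 - torch.softmax(model(x), dim=1)[0, target_label]
                cost.backward()
                with torch.no_grad():
                    x -= step_size * x.grad   # follow the gradient on the input
                    x.clamp_(0, 1)            # keep pixel values in a valid range
                x.grad.zero_()
            return x.detach()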

M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015. https://dl.acm.org/doi/pdf/10.1145/2810103.2813677

3. Evasion Attack

The target model is an SVM with an RBF kernel trained for binary classification between the digits "3" and "7" of MNIST. The performance of the model is as follows; a sketch of this setup appears after the report.

              precision    recall  f1-score   support

           3       0.99      0.99      0.99       229
           7       0.99      0.99      0.99       271

    accuracy                           0.99       500
   macro avg       0.99      0.99      0.99       500
weighted avg       0.99      0.99      0.99       500
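For context, the target model can be set up roughly as follows. This is only a sketch: fetching MNIST through scikit-learn and the exact preprocessing are assumptions, not the original training script.

        from sklearn.datasets import fetch_openml
        from sklearn.model_selection import train_test_split
        from sklearn.svm import SVC
        from sklearn.metrics import classification_report

        # keep only the digits "3" and "7" for a binary task
        X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
        mask = (y == "3") | (y == "7")
        X_train, X_test, y_train, y_test = train_test_split(
            X[mask], y[mask], test_size=500, random_state=0)

        clf = SVC(kernel="rbf", gamma="scale", probability=True)
        clf.fit(X_train, y_train)
        print(classification_report(y_test, clf.predict(X_test)))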

I created an adversarial example that looks like a "7" to a human but that the SVM fails to classify correctly. The result is shown in the following picture.

The usage of my code is really simple.

    example on MNIST

        # dataset that contains only "3"
        X_minus_1 = X_train[np.where(y_train == "3")]

        # Attack_sklearn automatically detects the type of the classifier
        attacker = Attack_sklearn(clf=clf, X_minus_1=X_minus_1,
                                  dmax=(5000 / 255) * 2.5,
                                  max_iter=300,
                                  gamma=1 / (X_train.shape[1] * np.var(X_train)),
                                  lam=10, t=0.5, h=10)

        # x0 is the initial point (a "7")
        xm, log = attacker.attack(x0)

Biggio, B., Corona, I., Maiorca, D., Nelson, B., Šrndić, N., Laskov, P., Giacinto, G., Roli, F., 2013. Evasion Attacks against Machine Learning at Test Time. In ECML-PKDD 2013. https://arxiv.org/abs/1708.06131

4. Poisoning Attack

I implemented a poisoning attack against an SVM with a linear kernel. The dataset is the same as in the evasion attack section, and I referred to the following paper. You can see that adding only one poisoned image dramatically decreases the accuracy of the SVM (a sketch of this check follows the usage example below).

The usage of my code is as follows.

    example

        attacker = Poison_attack_sklearn(clf,
                                         X_train_, y_train_,
                                         t=0.5)

        # xc is the initial attack point, which the attack gradually modifies
        # to degrade the target model's performance on the validation set
        xc_attacked, log = attacker.attack(xc, 1,
                                           X_valid, y_valid_,
                                           num_iterations=200)
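To check the effect of the single poisoned point, one can retrain the SVM with it appended to the training set and compare accuracies, as in the sketch below; the label assigned to the poison point here is an assumption.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.metrics import accuracy_score

        clean_clf = SVC(kernel="linear").fit(X_train_, y_train_)
        print("clean accuracy   :", accuracy_score(y_valid_, clean_clf.predict(X_valid)))

        # append the optimized poison point with its (assumed) target label
        X_poisoned = np.vstack([X_train_, xc_attacked.reshape(1, -1)])
        y_poisoned = np.append(y_train_, 1)
        poisoned_clf = SVC(kernel="linear").fit(X_poisoned, y_poisoned)
        print("poisoned accuracy:", accuracy_score(y_valid_, poisoned_clf.predict(X_valid)))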

Biggio, B., Nelson, B. and Laskov, P., 2012. Poisoning attacks against support vector machines. In ICML 2012. https://arxiv.org/abs/1206.6389
