
membership-inference's People

Contributors

csong27

membership-inference's Issues

about Trainset

I want to know whether train_feat_file and train_label_file are the training set of the target model. Should I build a target classification model myself? Thank you.

Attack model for already trained model

Hi,
I am not an AI expert and I need to train your attack model. I am a little confused here.
Why do you train a target model? Isn't the target model the one provided by the MLaaS platform, or at least one the attacker knows nothing about?
In other words, I have a trained model with inputs and outputs. I just need to train the shadow models and the attack model. Do I still need to train the target model again?
If not, what do I have to change in the code?
Thanks
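
A note for readers with the same question: in the black-box setting described in the paper, the attacker only queries the target model for prediction vectors, so an existing trained model can in principle play that role. The sketch below shows one way to assemble the attack model's training set from shadow-model outputs. It is an illustration under assumptions, not the repository's code: the function name build_attack_dataset and the tuple layout of shadow_outputs are hypothetical.

```python
import numpy as np


def build_attack_dataset(shadow_outputs):
    """Stack shadow-model predictions into an attack training set.

    shadow_outputs (hypothetical layout): one tuple per shadow model,
    (in_probs, in_labels, out_probs, out_labels). *_probs has shape
    (n, num_classes), *_labels has shape (n,). "in" records were used to
    train that shadow model, "out" records were held out from it.
    """
    feats, classes, members = [], [], []
    for in_probs, in_labels, out_probs, out_labels in shadow_outputs:
        feats += [np.asarray(in_probs), np.asarray(out_probs)]
        classes += [np.asarray(in_labels), np.asarray(out_labels)]
        # Membership label: 1 for training records, 0 for held-out records.
        members += [np.ones(len(in_probs), dtype=int),
                    np.zeros(len(out_probs), dtype=int)]
    return np.vstack(feats), np.concatenate(classes), np.concatenate(members)
```

The attack model is then trained on (prediction vector, class label) pairs to predict the membership label, and evaluated on prediction vectors obtained by querying the existing target model. Ground-truth membership for the target is still needed if the attack accuracy is to be measured.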

about some questions

Hello, Dr. Song. I read your paper "Membership Inference Attacks Against Machine Learning Models" the other day. I am very interested in it, but I have two questions. First, your attack requires the confidence values output by the target model. What if the model does not output confidence values? Second, in the example you use, a patient's clinical record was used to train a model associated with a disease, for instance to determine the appropriate medicine dosage. If I input this person's information and the model outputs a dosage, then this person must suffer from the disease, so there is no need for a membership inference attack. What, then, is the significance of this paper? I would appreciate your reply.

About the data

Could anybody explain what 'train_feat' and 'train_label' mean? Do they serve as the training data of the target model? If so, how can I run the experiment on other datasets, e.g., CIFAR?
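
For anyone preparing an image dataset such as CIFAR, here is a minimal sketch of turning image arrays into a flat feature matrix plus a label vector, and splitting them into disjoint target and shadow sets. The on-disk format the repository expects is not stated in this thread, so the function name split_target_shadow and its arguments are assumptions for illustration only.

```python
import numpy as np


def split_target_shadow(features, labels, target_size, seed=0):
    """Shuffle once, then give the target and shadow models disjoint records."""
    features = np.asarray(features, dtype=np.float32)
    labels = np.asarray(labels)
    # Flatten image-shaped inputs, e.g. (n, H, W, C), into (n, H*W*C) rows.
    features = features.reshape(len(features), -1)
    order = np.random.default_rng(seed).permutation(len(features))
    t_idx, s_idx = order[:target_size], order[target_size:]
    return (features[t_idx], labels[t_idx]), (features[s_idx], labels[s_idx])
```

Each returned pair is a (features, labels) tuple. Whether images should stay flattened depends on the model: a convolutional network would need them reshaped back to image form.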

Testing the attack model

Hello Dr. Song,
I was checking your paper and the code. I found that two files are saved after the target model is trained: target_model.npz and attack_test_data.npz. Could you please tell me which of these files the attack model uses to evaluate the attack?

Thanks in advance.
Shuvo

About the Sample Algorithm introduced on purchase dataset

Hi, Dr. Song. I am working on privacy in federated learning, and I have read the paper Membership Inference Attacks Against Machine Learning Models. Thank you for kindly sharing the source code.

However, my reproduction of the attack on the purchase dataset does not work well. Could you share the details of the sampling algorithm used to build the simplified purchase dataset? I found that each commodity is represented by dept, category, company, and brand. I have the following questions:

  • Is this code the one used for the attack on the purchase dataset?
  • The primary key that identifies a commodity is composed of four parts. Do you remember which commodity features you used as the simplified primary key of each commodity?
  • I found that you used 600 features from the commodity list. Is my understanding correct: after randomly sampling the purchase dataset, we randomly choose 600 columns of the matrix that records which commodities each user has bought, then run k-means clustering and give each record its cluster as a label?

Thank you for all your assistance.
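
The procedure described in the last question can be sketched as follows. This follows the poster's description (random column choice, then k-means labels) and is not confirmed as the authors' exact preprocessing. The names simplify_purchases and kmeans_labels are hypothetical, and the k-means here is a plain NumPy version written for illustration.

```python
import numpy as np


def kmeans_labels(X, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of X."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=np.float64)
    # Start from n_clusters distinct rows chosen at random.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every row to every center, shape (n, k).
        # This builds an (n, k, d) array, so it only suits small inputs.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def simplify_purchases(purchases, n_features=600, n_classes=10, seed=0):
    """Keep n_features randomly chosen product columns, label rows by cluster."""
    purchases = np.asarray(purchases)
    rng = np.random.default_rng(seed)
    cols = rng.choice(purchases.shape[1], size=n_features, replace=False)
    X = purchases[:, cols]
    return X, kmeans_labels(X, n_classes, seed=seed)
```

Here purchases is assumed to be a binary user-by-product matrix. The number of classes is a free parameter in this sketch.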

How to optimize the parameter

Hello, Song.
I tried your code on the UCI Adult salary dataset and CIFAR-10. For the Adult dataset I get results similar to yours; for CIFAR-10, however, I get very low accuracy. I guess some parameters need to be tuned, such as batch_size or epochs.
So, I wonder whether you would be willing to publish your parameters. Also, for images, should I replace the target and shadow models' NN with a CNN or another architecture?
Thank you, I really appreciate your work.

What about ML-Leaks?

Hello Song,
You may already know of a newer paper named ML-Leaks (arXiv:1806.01246), which builds on your work. What is your opinion of it? Do you think ML-Leaks is a more effective approach?
Thank you.

About Algorithm 1 Data Synthesis Using the Target Model

Hi Dr. Song,

Thank you for providing the source code of the paper. I have been reading it and reproducing the experiments described in the paper. However, I found that the shadow models' training data in this code, and in other implementations such as ml-leaks, cyphercat, and mia, either consists of records disjoint from the target's training set for a specific dataset (such as CIFAR-10) or is produced by replacing k features. This may differ somewhat from the original algorithm in the paper.

I implemented Algorithm 1 (Data synthesis using the target model) myself in PyTorch. I generate a random tensor of size (1, 3, 32, 32) for the CIFAR-10 dataset and use the two phases, search and sample, as in the paper's algorithm. The code is below:

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# gen_class_tensor, nn_predict_proba, nn_predict and rand_tensor are my own
# helpers, defined elsewhere and not shown here.


def data_synthesize(net, trainset_size, fix_class, initial_record, k_max,
                    in_channels, img_size, batch_size, num_workers, device):
    """
    Synthesize a record that the target model assigns to fix_class.

    Returns (X_tensor, y_c): the candidate record and the target model's
    probability for fix_class on that record.
    """
    # Initialize X_tensor with an initial_record, with size of (1, in_channels, img_size, img_size)
    X_tensor = initial_record
    # Last accepted record; defined up front so it always exists in the loop
    X_new_tensor = X_tensor
    # Generate y_tensor with the size equivalent to X_tensor's
    y_tensor = gen_class_tensor(trainset_size, fix_class)

    y_c = 0                 # target model's probability for the current candidate
    y_c_current = 0         # target model's probability of fixed class (best so far)
    j = 0                   # consecutive rejections counter
    k = k_max               # search radius
    max_iter = 100          # max iter number
    conf_min = 0.1          # min probability cutoff to consider a record member of the class
    rej_max = 5             # max number of consecutive rejections
    k_min = 1               # min radius of feature perturbation

    for _ in range(max_iter):

        dataset = TensorDataset(X_tensor, y_tensor)
        dataloader = DataLoader(dataset=dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)

        y_c = nn_predict_proba(net, dataloader, device, fix_class)

        # Phase 1: Search
        if y_c >= y_c_current:
            # Phase 2: Sample
            if y_c > conf_min and fix_class == torch.argmax(nn_predict(net, dataloader, device), dim=1):
                # Same return shape as the fall-through case below
                return X_tensor, y_c

            X_new_tensor = X_tensor  # accept the candidate
            y_c_current = y_c  # renew variables
            j = 0
        else:
            j += 1
            if j > rej_max:  # many consecutive rejects
                k = max(k_min, int(np.ceil(k / 2)))
                j = 0
        # Propose a new candidate by perturbing k features of the last accepted record
        X_tensor = rand_tensor(X_new_tensor, k, in_channels, img_size, trainset_size)

    return X_tensor, y_c

However, the prediction probability it produces is very low, around 0.1. Could you please give me some guidance on the Data Synthesis Using the Target Model algorithm, or update the uploaded code? Thanks in advance for your patience!

Best wishes!

Yantong
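
For comparison, here is a small NumPy sketch of the search and sample loop, written from my reading of the paper's Algorithm 1 and not taken from the repository. It assumes a hypothetical predict_proba callable that maps one record to a probability vector, features in [0, 1], and illustrative default hyperparameters. In the posted code, conf_min = 0.1 lets the function return as soon as the class probability barely exceeds 0.1, which may be one reason the returned confidence sits near that value.

```python
import numpy as np


def rand_record(x_star, k, rng):
    """Copy x_star and re-draw k randomly chosen features uniformly in [0, 1)."""
    x = x_star.copy()
    idx = rng.choice(x.size, size=min(k, x.size), replace=False)
    x.flat[idx] = rng.random(len(idx))
    return x


def synthesize(predict_proba, c, shape, k_max, k_min=1, conf_min=0.8,
               rej_max=5, iter_max=1000, seed=0):
    """Hill-climb toward a record the target confidently assigns to class c.

    Returns the synthetic record, or None if none was sampled in iter_max steps.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.random(shape)            # random initial record
    x_star, y_star = x, 0.0          # last accepted record and its confidence
    j, k = 0, k_max                  # rejection counter, perturbation size
    for _ in range(iter_max):
        y = np.asarray(predict_proba(x))
        y_c = float(y[c])
        if y_c >= y_star:            # search phase: accept the candidate
            if y_c > conf_min and int(np.argmax(y)) == c:
                if rng.random() < y_c:   # sample phase: keep with probability y_c
                    return x
            x_star, y_star, j = x, y_c, 0
        else:
            j += 1
            if j > rej_max:          # too many rejections: shrink the search radius
                k = max(k_min, int(np.ceil(k / 2)))
                j = 0
        x = rand_record(x_star, k, rng)
    return None
```

A call such as synthesize(model_fn, c=3, shape=(3, 32, 32), k_max=128) would then either return one synthetic record or None, where model_fn stands for whatever wrapper queries the target model.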

about Chiron

Hello Dr. Song. I noticed that you are the author of Chiron. Could I download Chiron and use it to evaluate my research? If so, please let me know where I can get it. Thank you.
By the way, I think your membership inference work is very good. However, the attack accuracy is not high when the number of classes c is small, for example for MNIST or CIFAR-10. I think some improvement could be made in the future. What is your opinion? Do you have such a plan?
Thank you.

About Target Data

Hello,

Excuse my lack of knowledge but I am failing to run your project.

The code requires a file target_data.npz and I am not sure if I should create that file or whether it should be provided.

Thank you.
