
unlearning-challenge / starting-kit


Starting kit for the NeurIPS 2023 unlearning challenge

Home Page: https://unlearning-challenge.github.io/

License: Apache License 2.0

Jupyter Notebook 100.00%

starting-kit's Issues

Consideration of network architecture and learning algorithms for unlearning effectiveness

The unlearning challenge could benefit from accounting for the impact of network architecture and training methods on unlearning performance.

Some neural network architectures and training methods, like recursive cortical networks and gated linear networks/supermasks, could have a huge impact on the way the competition is run and how models are evaluated.

It would be nice if future iterations of the challenge could consider:

  1. Incorporating architectures/learning algorithms like recursive cortical networks and gated linear networks/supermasks in the starter kit
  2. Evaluating submissions based on both unlearning performance and the network architecture/learning algorithm used

This could help identify approaches that balance performance and adaptability - crucial for building AI systems that can responsibly adjust to new requirements over time. Studying how architecture and learning algorithms impact unlearnability could drive progress.

Please let me know if you would like me to expand on any part of this feedback or provide more suggestions. I'm happy to discuss ways to improve future iterations of this valuable challenge.

Why not compare against the MIA of a "retrained" model?

Hello, I notice that the baseline simple MIA diagram shows a decrease in MIA from the pre-trained model to the fine-tuned model. That differs from what we usually consider the privacy baseline, where the target is (or, I think, should be) the MIA of a retrained model whose training data never included the offending data.

(Image attached to the original issue; not reproduced here.)

Am I confused, or is this a developmental area?

Thanks!
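To make the comparison raised in this issue concrete, here is a minimal sketch of a loss-threshold membership inference attack evaluated against two models: the original one and one retrained without the forget set. It is not the notebook's own attack; the function name `threshold_mia_accuracy` is hypothetical, and the loss values are made up purely for illustration. The threshold is picked on the same data it is scored on, so the number is optimistic.

```python
def threshold_mia_accuracy(member_losses, nonmember_losses):
    """Best accuracy of an attack that predicts 'member' when loss <= threshold.

    The threshold is chosen in-sample, so this is an optimistic estimate.
    """
    # Candidate thresholds: every observed loss, plus -inf ("nobody is a member").
    candidates = [float("-inf")] + sorted(set(member_losses) | set(nonmember_losses))
    total = len(member_losses) + len(nonmember_losses)
    best = 0.0
    for t in candidates:
        correct = sum(loss <= t for loss in member_losses)
        correct += sum(loss > t for loss in nonmember_losses)
        best = max(best, correct / total)
    return best


# Illustrative (made-up) per-sample losses, not results from the starting kit.
test_losses = [1.2, 0.9, 1.5, 1.1]              # samples never used for training
forget_losses_original = [0.1, 0.2, 0.05, 0.3]  # forget set under the original model
forget_losses_retrained = [1.0, 1.3, 0.8, 1.4]  # forget set under a retrained model

acc_original = threshold_mia_accuracy(forget_losses_original, test_losses)
acc_retrained = threshold_mia_accuracy(forget_losses_retrained, test_losses)
print("MIA accuracy, original model: ", acc_original)
print("MIA accuracy, retrained model:", acc_retrained)
```

Under this reading, the retrained model's MIA accuracy is the reference point for an unlearned model, rather than the drop relative to the original model.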

Suggestion: Avoid Repeated Download of Weights

I suggest changing cell In [41] to:

# download pre-trained weights only if they are not already cached locally
import os
import requests
import torch

path = "weights_resnet18_cifar10.pth"
if not os.path.exists(path):
    response = requests.get(
        "https://unlearning-challenge.s3.eu-west-1.amazonaws.com/weights_resnet18_cifar10.pth"
    )
    response.raise_for_status()  # fail early rather than caching an error page
    with open(path, "wb") as f:
        f.write(response.content)

weights_pretrained = torch.load(path, map_location=DEVICE)

# load model with pre-trained weights
model = resnet18(weights=None, num_classes=10)
model.load_state_dict(weights_pretrained)
model.to(DEVICE)
model.eval();

Reasoning: Researchers will likely run this cell repeatedly, and the version above checks whether the weights file already exists locally before downloading it again.

Question Regarding Optimal MIA and Overall Desired Objective

What is considered optimal for the MIA score?

Obviously, it should be lower than for the initial model. To clarify: are we aiming for the last chart to show as much overlap as possible between the test and forget sets, together with a high overall score on the test set, or would an MIA score below 0.5 be ideal? (And yes, I can get this significantly lower than 0.5.)

I am just trying to clarify which metrics will count as an "improved" result.
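One way to frame the question, assuming the forget and test sets are the same size so that 0.5 is chance level for the attack: an accuracy well below 0.5 is not more private than 0.5, because an attacker can flip every prediction and obtain one minus that accuracy. The sketch below only illustrates this arithmetic; the helper names are hypothetical and this is not the challenge's official metric.

```python
def flipped_accuracy(mia_accuracy):
    """Accuracy an attacker gets by inverting every prediction (balanced sets assumed)."""
    return 1.0 - mia_accuracy


def distance_from_chance(mia_accuracy, chance=0.5):
    """How far the attack is from random guessing, in either direction."""
    return abs(mia_accuracy - chance)


# An attack scoring 0.3 is as informative as one scoring 0.7.
for acc in (0.3, 0.5, 0.7):
    print(acc, flipped_accuracy(acc), distance_from_chance(acc))
```

Seen this way, a score close to 0.5 (heavy overlap between the forget and test loss distributions) is the hard case for the attacker, while scores far from 0.5 in either direction indicate that the two sets remain distinguishable.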

Face Synthetics labels

The challenge mentions that we will be using the Face Synthetics dataset for an age prediction task. However, Face Synthetics does not have any age labels. I am aware that we will get a notebook with this dataset and a trained model sometime later this month, but I want to understand how the input model is trained.
Side note: any updates on the next notebook?
Thanks

Suggested fix to the comment on the losses histogram of the pre-trained model

The comment after the histogram of losses on the train vs. test set of the pre-trained model (first plot) says:
"As per the above plot, the distributions of losses are quite different between the forget and retain set. This suggests that the simple MIA that we're considering should be reasonably effective."

I believe this should be changed to:
"As per the above plot, the distributions of losses are quite different between the train and test set. This suggests that the simple MIA that we're considering should be reasonably effective."
