unlearning-challenge / starting-kit
Starting kit for the NeurIPS 2023 unlearning challenge
Home Page: https://unlearning-challenge.github.io/
License: Apache License 2.0
The unlearning challenge could benefit from accounting for the impact of network architecture and training methods on unlearning performance.
Some neural network architectures and training methods, like recursive cortical networks and gated linear networks/supermasks, could have a huge impact on the way the competition is run and how models are evaluated.
It would be nice if future iterations of the challenge could take these architectural and training-method factors into account. This could help identify approaches that balance performance and adaptability, which is crucial for building AI systems that can responsibly adjust to new requirements over time. Studying how architecture and learning algorithms impact unlearnability could drive progress.
Please let me know if you would like me to expand on any part of this feedback or provide more suggestions. I'm happy to discuss ways to improve future iterations of this valuable challenge.
Hello, I notice that the baseline simple MIA diagram shows a decrease in MIA from the pre-trained model to the fine-tuned model. That differs from what I consider the privacy baseline, where the reference is, or I think should be, the MIA score of a model retrained from scratch without the offending data.
(image attached)
Am I confused, or is this a developmental area?
Thanks!
I suggest changing cell In [41] to:
# download pre-trained weights only if they are not already cached locally
import os
import requests

path = "weights_resnet18_cifar10.pth"
if not os.path.exists(path):
    response = requests.get(
        "https://unlearning-challenge.s3.eu-west-1.amazonaws.com/weights_resnet18_cifar10.pth"
    )
    open(path, "wb").write(response.content)
weights_pretrained = torch.load(path, map_location=DEVICE)

# load model with pre-trained weights
model = resnet18(weights=None, num_classes=10)
model.load_state_dict(weights_pretrained)
model.to(DEVICE)
model.eval();
Reasoning: Researchers will likely be running this code repeatedly, and the version above checks whether the weights file already exists before downloading it again.
What is considered optimal for the MIA score?
Obviously, it should be lower than for the initial model. But just to clarify: are we aiming for the last chart to show as much overlap as possible between the test and forget sets, together with a high overall accuracy on the test set? Or would an MIA score below 0.5 be ideal (and yes, I can get this significantly lower than 0.5)?
Just trying to clarify which metrics will be considered an "improved" result.
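To make the "0.5 is ideal" reading concrete, here is a minimal sketch of a loss-threshold MIA score, using synthetic loss values (not the competition's official metric, and the distributions are invented for illustration). The attack tries to separate forget-set losses from test-set losses; an accuracy near 0.5 means the two are indistinguishable. Note that a score significantly *below* 0.5 is not better than 0.5: an attacker can simply flip the predictions, so distance from 0.5 in either direction indicates leakage.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-example losses: forget-set losses skew slightly lower
# than test-set losses, as if the model still "remembers" the forget set.
forget_losses = rng.normal(loc=0.8, scale=0.5, size=1000)
test_losses = rng.normal(loc=1.0, scale=0.5, size=1000)

losses = np.concatenate([forget_losses, test_losses])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])  # 1 = forget ("member")

# Simplest possible attack: threshold on the loss. Sweep thresholds and
# keep the best accuracy; flipping (1 - acc) covers the below-0.5 case.
best_acc = 0.5
for t in np.quantile(losses, np.linspace(0.01, 0.99, 99)):
    preds = (losses < t).astype(float)  # members tend to have lower loss
    acc = (preds == labels).mean()
    best_acc = max(best_acc, acc, 1.0 - acc)

print(f"threshold-MIA accuracy: {best_acc:.3f}")  # closer to 0.5 is better
```

Under this reading, the goal is maximal overlap between the forget-set and test-set loss distributions while keeping test accuracy high.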
The challenge mentions that we will be using the Face Synthetics dataset on an age prediction task. However, Face Synthetics does not have any labels for age. I am aware that we will get a notebook with this dataset and a trained model sometime later this month but I want to understand how the input model is trained.
Side note: any updates on the next notebook?
Thanks
Is the evaluation metric public?
Please share how the evaluation metric is computed.
The comment after the first plot (the histogram of losses on the train vs. test set for the pre-trained model) says:
"As per the above plot, the distributions of losses are quite different between the forget and retain set. This suggests that the simple MIA that we're considering should be reasonably effective."
I believe this should be changed to:
"As per the above plot, the distributions of losses are quite different between the train and test set. This suggests that the simple MIA that we're considering should be reasonably effective."