google-research / simclr
SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners
Home Page: https://arxiv.org/abs/2006.10029
License: Apache License 2.0
Could you briefly describe the detailed settings for the linear evaluation? Thank you very much!
Thanks a lot!
Hi Ting,
Thanks for sharing all these, it's amazing and elegant. One quick question:
I have never used a TPU, and I'm very new to TensorFlow. I would like to test your code on GPU(s). Pretraining works, but I run into a problem with fine-tuning.
Should line 369 in run.py be
sliced_eval_mode = tf.estimator.tpu.InputPipelineConfig.PER_HOST_V1
rather than
sliced_eval_mode = tf.estimator.tpu.InputPipelineConfig.SLICED
to enable GPU usage?
If using SLICED, my terminal told me SLICED could only be used when using TPU(s).
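A minimal sketch of the switch being asked about, assuming the evaluation input mode can simply be chosen from the use_tpu flag. The helper name pick_eval_input_mode is hypothetical and is not part of run.py. It only returns the name of the tf.estimator.tpu.InputPipelineConfig attribute, so the snippet runs without TensorFlow installed.

```python
def pick_eval_input_mode(use_tpu: bool) -> str:
    """Return the name of the InputPipelineConfig attribute to use.

    Hypothetical helper: SLICED is reported to work only with use_tpu=True,
    so this falls back to PER_HOST_V1 on CPU/GPU.
    """
    return "SLICED" if use_tpu else "PER_HOST_V1"


# With TensorFlow 1.15 available, the name could be resolved like this:
#   mode = getattr(tf.estimator.tpu.InputPipelineConfig,
#                  pick_eval_input_mode(FLAGS.use_tpu))
print(pick_eval_input_mode(use_tpu=False))
```

Whether PER_HOST_V1 is the mode the authors would recommend for GPU runs is not confirmed here; the sketch only shows how the choice could be made conditional.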
Hi, I am trying to further evaluate learned representations from 'init_conv', 'block_group1', etc. I tried loading the pre-trained model using tensorflow_hub with the following code:
module = hub.Module('./checkpoints_ResNet50_1x/ResNet50_1x/hub')
print(module.get_signature_names())
This only gives me the 'default' signature, but no other signatures listed in the README.
Is this the right way to access them?
Thanks!
Hello, thank you for this great work!
I am using SimCLR for audio (speech) processing by extracting features as images from the audio clips and then using them to train/fine-tune a model. My first experiments give me 28% accuracy after 150 epochs. Here is what I did:
1- Extract MFCC spectrograms from the speech audio
2- Fine-tune SimCLR by training the linear classifier
Is this the right way to use SimCLR? Do you think that 28% is a "normal" accuracy? I think it is because the MFCC images are so similar that the model cannot classify them.
Thanks in advance
Hello, sorry to bother you again. When I was training your SimCLR model, I saw in your README that mod_dir=None when using the imagenet2012 dataset, and mod_dir=/tmp/simclr_test when training on the cifar10 dataset. Does this mod_dir refer to the model storage location? Or is the model automatically selected when training on different datasets?
I am Sayak, an ML Engineer from India. I am writing to let you know that I have started working on a minimal implementation of SimCLR.
I was able to use the utility functions (the augmentation policies and the NT-Xent loss) provided in the GitHub repository mentioned in the paper. For my implementation, I am using a combination of tf.keras and custom loops with tf.GradientTape. I am running my experiments on a small subset of ImageNet: the top 5 categories, with 250 images per class. I am unsure at this point whether my implementation is buggy, but I am not able to get the desired results when I evaluate the self-supervised representations using linear evaluation.
This is why I am reaching out to ask whether it would be possible for you to take a look and share some feedback. I know this is a lot to ask. But in case you would like to have a look, I have attached two notebooks:
I used a GCP VM pre-configured with TensorFlow 2.1 and Tesla T4 GPU.
Hi,
Had a question on the underlying paper itself. Not sure where else I could ask.
Dear Ting,
Thank you so much for providing this great codebase for the community!
When I tried to train SimCLR on ImageNet using a TPU, I got different errors when the code tried to read ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar (I have downloaded those files). Following the docs triggers different errors. May I ask how you set up the ImageNet TensorFlow dataset on Cloud TPU, given that the data is already downloaded?
If you would like me to provide the error information, I will post it :)
Thank you so much!
Thank you for great work. I'm trying to fully understand the section about loss function comparison. In table 2, what is Z(u) in the gradient of NT-Xent?
Sorry nvm, closing this.
(Thanks for sharing the code by the way! :))
Hi @chentingpc,
I was wondering if it is possible to list repositories that implement SimCLR minimally in a friendlier format with Keras or PyTorch. I think a lot of people might benefit from that. WDYT?
Thanks Ting for open-sourcing the code and checkpoints. Starting from the pre-trained model, I have been trying to reproduce the transfer learning results on the datasets mentioned in the paper. Given the challenges there, I wonder whether it would be possible to provide some additional instructions on transfer learning, in particular for low-dimensional datasets.
In my attempt, I found the details in the paper sufficient to reproduce results within 0.5-1% of the reported accuracy on most transfer learning datasets and on linear evaluation on ImageNet itself. However, on transfer learning to CIFAR-10, the best accuracy on my end was 75% (the paper reports ~90%). I am using ResNet50-1x as the feature extractor. Unfortunately, both the repo and the paper are quite thin on details when it comes to transfer learning to low-dimensional datasets. These are the design choices I used:
With these design choices, I conducted a thorough hyper-parameter search but had no success. I wonder whether you employed different design choices to report results on transfer learning to the CIFAR-10/CIFAR-100 datasets?
I have obtained checkpoints after training SimCLR v1 on CIFAR-10 data. How can I use these checkpoints to extract the features (representation) after the encoder?
Thanks Ting for open-sourcing the code and checkpoints. Which data normalization method is used in the project? I can't find it.
Hello, when I used a single GPU to train on the cifar10 dataset, the system reported that it was skipping training.
The output is as follows:
INFO:tensorflow:_TPUContext: eval_on_tpu True
I0627 17:39:39.536073 139766252832576 tpu_context.py:209] _TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
W0627 17:39:39.536876 139766252832576 tpu_context.py:211] eval_on_tpu ignored because use_tpu is False.
INFO:tensorflow:Skipping training since max_steps has already saved.
I0627 17:39:39.541100 139766252832576 estimator.py:360] Skipping training since max_steps has already saved.
INFO:tensorflow:training_loop marked as finished
I0627 17:39:39.541271 139766252832576 error_handling.py:96] training_loop marked as finished
Firstly, many thanks for sharing a very interesting piece of research.
I was just wondering if such a representation learning framework may also prove to be effective for one-class classification (or anomaly / novelty detection) tasks. One very simple setting may involve just training a one-class SVDD (Ruff et al. 2018) on top of a frozen base network (trained on, say, ImageNet). It would be great to hear your views!
Kind regards,
Se
Hi, thank you for the code release!
I encounter the following error when performing linear eval on CIFAR.
Pretraining:
python run.py --train_mode=pretrain --train_batch_size=512 --train_epochs=1000 --learning_rate=1.0 --weight_decay=1e-4 --temperature=0.5 --dataset=cifar10 --image_size=32 --eval_split=test --resnet_depth=18 --use_blur=False --color_jitter_strength=0.5 --model_dir=/mnt/research/results/simclr/simclr_test --use_tpu=False
Linear eval:
python run.py --mode=train_then_eval --train_mode=finetune --fine_tune_after_block=4 --zero_init_logits_layer=True --variable_schema='(?!global_step|(?:.*/|^)Momentum|head)' --global_bn=False --optimizer=momentum --learning_rate=0.1 --weight_decay=0.0 --train_epochs=100 --train_batch_size=512 --warmup_epochs=0 --dataset=cifar10 --image_size=32 --eval_split=test --resnet_depth=18 --checkpoint=/mnt/research/results/simclr/simclr_test --model_dir=/mnt/research/results/simclr/simclr_test_ft --use_tpu=False
I0625 13:45:52.051569 140622183225152 evaluation.py:276] Finished evaluation at 2020-06-25-13:45:52
INFO:tensorflow:Saving dict for global step 9766: contrast_loss = 0.0, contrastive_top_1_accuracy = 1.0, contrastive_top_5_accuracy = 1.0, global_step = 9766, label_top_1_accuracy = 0.8248, label_top_5_accuracy = 0.9829, loss = 0.5490037, regularization_loss = 0.0
I0625 13:45:52.051712 140622183225152 estimator.py:2053] Saving dict for global step 9766: contrast_loss = 0.0, contrastive_top_1_accuracy = 1.0, contrastive_top_5_accuracy = 1.0, global_step = 9766, label_top_1_accuracy = 0.8248, label_top_5_accuracy = 0.9829, loss = 0.5490037, regularization_loss = 0.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 9766: /mnt/research/results/simclr/simclr_test_ft/model.ckpt-9766
I0625 13:45:52.182560 140622183225152 estimator.py:2113] Saving 'checkpoint_path' summary for global step 9766: /mnt/research/results/simclr/simclr_test_ft/model.ckpt-9766
INFO:tensorflow:evaluation_loop marked as finished
I0625 13:45:52.182964 140622183225152 error_handling.py:108] evaluation_loop marked as finished
WARNING:tensorflow:From /home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_hub/saved_model_lib.py:110: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
W0625 13:45:52.510126 140622183225152 deprecation.py:323] From /home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_hub/saved_model_lib.py:110: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
Traceback (most recent call last):
File "run.py", line 435, in <module>
app.run(main)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "run.py", line 430, in main
num_classes=num_classes)
File "run.py", line 343, in perform_evaluation
checkpoint_path=checkpoint_path)
File "run.py", line 293, in build_hub_module
name_transform_fn=None)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_hub/module_spec.py", line 80, in export
export_module_spec(self, path, checkpoint_path, name_transform_fn)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_hub/module.py", line 74, in export_module_spec
tf_v1.train.init_from_checkpoint(checkpoint_path, assign_map)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/training/checkpoint_utils.py", line 291, in init_from_checkpoint
init_from_checkpoint_fn)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1949, in merge_call
return self._merge_call(merge_fn, args, kwargs)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1956, in _merge_call
return merge_fn(self._strategy, *args, **kwargs)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/training/checkpoint_utils.py", line 286, in <lambda>
ckpt_dir_or_file, assignment_map)
File "/home/mren/miniconda3/envs/tf/lib/python3.6/site-packages/tensorflow_core/python/training/checkpoint_utils.py", line 329, in _init_from_checkpoint
tensor_name_in_ckpt, str(variable_map[tensor_name_in_ckpt])
ValueError: Shape of variable module/head_supervised/linear_layer/dense/kernel:0 ((512, 10)) doesn't match with shape of tensor head_supervised/linear_layer/dense/kernel ([128, 10]) from checkpoint reader.
Hi -- I was wondering if it would be possible to include the projection head (for the SimCLR objective) in the model checkpoints? The current checkpoints only seem to include the supervised projection head ("head_supervised") but not the unsupervised head ("head_contrastive")
I have checkpoints of SimCLR v1.
I want to input an image and obtain the feature representation from the encoder (2048-D) as output.
Could you please guide me on this?
I figured out that my output tensor will be 'base_model/final_avg_pool:0',
but I couldn't find the input tensor name. It was quite complicated to work out from the graph.
Why do you only take the dot product when calculating cosine similarity, without dividing by the magnitudes of the vectors? As follows:
logits_aa = tf.matmul(hidden1, hidden1_large, transpose_b=True) / temperature
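A small NumPy sketch of why a plain dot product can already equal the cosine similarity, assuming both sets of vectors are L2-normalized before the matrix product (an assumption about a preprocessing step that the snippet above does not show). The arrays are made-up illustration data.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 8))  # 4 made-up embeddings of dimension 8
b = rng.normal(size=(4, 8))

# Cosine similarity the long way: dot product divided by both norms.
norm_a = np.linalg.norm(a, axis=1)[:, None]
norm_b = np.linalg.norm(b, axis=1)[None, :]
cos = (a @ b.T) / (norm_a * norm_b)

# L2-normalize each row first, then take a plain dot product.
a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
dot_of_normalized = a_n @ b_n.T

print(np.allclose(cos, dot_of_normalized))
```

If the inputs to the matmul are not normalized upstream, the two quantities differ and the division would be needed.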
Has anyone done this? How is the performance on NLP tasks?
Thank you!
Hi,
I am trying to replicate the numbers reported in Appendix B.9, but stuck around 91~92%.
The training recipe in README gives me around 91% accuracy as stated.
At least increasing depth (ResNet18 -> 50) did not show a meaningful performance gain (~0.5%), and I believe this is consistent with the behavior in the supervised learning on CIFAR-10, which is reported in https://github.com/kuangliu/pytorch-cifar.
If I understand correctly, I should be able to see 93~94% accuracy after 1k epochs with any batch size as reported in Figure B.4, and specifically 94.0% when the batch size is 1024.
Could you share the exact training recipe to replicate the performances?
Thanks.
Thank you for your impressive work.
I'm reading your paper and wondering about the architecture of SimCLR's classifier.
From my understanding, you used an l2-regularized multinomial logistic regression classifier for linear evaluation with frozen SimCLR. But, which classifier is used for fine-tuning? Did you use l2-regularized multinomial logistic regression classifier or something like non-linear MLP for fine-tuning?
Hope to hear from you soon.
Thanks in advance!
Hello,
I'm having some difficulty following the logic of the function call for add_contrastive_loss() defined in objectives.py. It appears that it is being passed two latent vectors concatenated into one latent vector.
(1) Are the latent vectors representing the latent vectors generated by ResNet for the N input images and the N augmented input images?
(2) If so, are both of these latent vectors aligned? i.e., in the code, do element i of hidden1 and element i of hidden2 refer to augmented versions of the same image?
And if so, (3) Does the add_contrastive_loss code take care of the negative samples? At which line are negative samples handled?
Thanks!
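A NumPy sketch of how a loss shaped like the one in the question could treat positives and negatives. It assumes `hidden` stacks the N embeddings of the first augmented view on top of the N embeddings of the second, with row i of each half coming from the same image. The function name nt_xent and the toy data are illustrative and are not the repository's code.

```python
import numpy as np


def nt_xent(hidden, temperature=0.5):
    """hidden: (2N, d); rows [0:N] are view 1, rows [N:2N] are view 2."""
    hidden = hidden / np.linalg.norm(hidden, axis=1, keepdims=True)
    hidden1, hidden2 = np.split(hidden, 2, axis=0)
    n = hidden1.shape[0]
    masks = np.eye(n)
    labels = np.arange(n)  # the positive for row i sits in column i
    large_num = 1e9

    # Same-view similarities: only negatives, so the diagonal is masked out.
    logits_aa = hidden1 @ hidden1.T / temperature - masks * large_num
    logits_bb = hidden2 @ hidden2.T / temperature - masks * large_num
    # Cross-view similarities: diagonal = positives, off-diagonal = negatives.
    logits_ab = hidden1 @ hidden2.T / temperature
    logits_ba = hidden2 @ hidden1.T / temperature

    def cross_entropy(logits, targets):
        logits = logits - logits.max(axis=1, keepdims=True)
        log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -log_prob[np.arange(len(targets)), targets]

    loss_a = cross_entropy(np.concatenate([logits_ab, logits_aa], axis=1), labels)
    loss_b = cross_entropy(np.concatenate([logits_ba, logits_bb], axis=1), labels)
    return float(np.mean(loss_a + loss_b))


rng = np.random.default_rng(0)
view1 = rng.normal(size=(8, 16))
aligned = np.concatenate([view1, view1 + 0.01 * rng.normal(size=(8, 16))])
misaligned = np.concatenate([view1, np.roll(view1, 1, axis=0)])

aligned_loss = nt_xent(aligned)
misaligned_loss = nt_xent(misaligned)
print(aligned_loss, misaligned_loss)
```

In this sketch the negatives are never sampled explicitly: for each row, every other column of the concatenated logits (2N - 2 of them) acts as a negative inside the softmax cross-entropy.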
Or do we need to label a smaller image dataset?
Thank you very much!
Lines 487 to 490 in f3ca72f
So there are two types of padding in this repo, which may cause confusion when users want to use the SimCLR model in another codebase, as the weights are compatible neither with vanilla TensorFlow nor with other frameworks like PyTorch or MXNet.
Thanks Ting for open-sourcing the code and checkpoints. How can SimCLR be used with an object detection model (mobilenetv2_ssd or another)? Are there any examples?
I have more questions about semi-supervised learning via fine-tuning (Zhai et al. (2019)).
When fine-tuning, the network is updated using the following loss
Here are some questions.
In semi-supervised learning, is the loss used for the unlabeled dataset the same as the contrastive loss used in SimCLR-based pretraining (augmentation -> encoder -> MLP -> contrastive loss)?
or
Is it a loss such as Loss_rot used in Zhai et al. (2019)?
When fine-tuning with semi-supervised learning, do you train after replacing g(.) (the MLP) with a head that acts as a classifier? Or is there a classification layer independent of g(.)?
In that case, is it correct to fine-tune both the encoder and the changed head (e.g., an FC layer)?
Finally, I wonder about backpropagation from the contrastive loss on unlabeled data, and how it can fine-tune not only the encoder but also the FC layer for classification.
I used CIFAR-10 to train a SimCLRv2 model with depth set to 18. After training for 1000 epochs, I fine-tuned the model and got 84% top-1, which is lower than the 91% top-1 of SimCLRv1. Is this correct?
I want to reproduce the experiment on CIFAR-10 and used the command here with the flag "--use_tpu=False".
I got the error "ValueError: Invalid TPUconfig eval_training_input_configuration value. SLICED mode only works on use_tpu=True".
The SLICED mode config is here
Is this a problem for others?
I use TensorFlow 1.15, the same as here
May I ask what's the expected "contrast_acc" (line 128 of model.py) on CIFAR10? Thanks!
Would it be possible to release pretrained models for the data augmentation ablation experiments described in the paper?
Would it be possible to also release the checkpoints for the supervised networks (especially ResNet-4x), which are used as baselines throughout the paper? Except for ResNet50, checkpoints for the wider networks are unavailable on tensorhub or any other source of pre-trained networks.
In the paper, the explanation of temperature is that l2 normalization along with temperature effectively weights different examples, and an appropriate temperature can help the model learn from hard negatives. But I can't really understand how adding a temperature achieves 'learning from hard negatives'. Also, looking at Table 5, it seems that choosing 0.1 is significantly better than choosing a temperature larger or smaller than 0.1. I'm also wondering how that happens.
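One way to build intuition for the question above, as a sketch and not the paper's own argument: in a softmax cross-entropy, the gradient with respect to each negative's logit is proportional to that negative's softmax probability. The pure-Python snippet below uses made-up similarity values to show how that share shifts toward the most similar ("hard") negative as the temperature drops.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up cosine similarities between an anchor and four negatives.
# The last one (0.9) is a "hard" negative: it looks a lot like the anchor.
neg_sims = [-0.2, 0.1, 0.3, 0.9]

for temperature in (1.0, 0.5, 0.1):
    weights = softmax([s / temperature for s in neg_sims])
    print(temperature, [round(w, 3) for w in weights])
```

With a very small temperature nearly all of the weight lands on a single negative, which may be one reason an intermediate value works best, though the sketch does not establish that.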
Hello,
I would like to know the logic behind the use of the mask on the 'add_contrastive_loss' function:
It is my understanding that you use it in order to decrease the value of the similarity for s_{i,i} when computing the loss.
What you do is the following:
LARGE_NUM = 1e9
masks = tf.one_hot(tf.range(batch_size), batch_size)
logits_aa = tf.matmul(hidden1, hidden1_large, transpose_b=True) / temperature
logits_aa = logits_aa - masks * LARGE_NUM
I know that the final value for the logits at (i,i) will be a very large negative number. But wouldn't it be more precise to just set the logits at positions (i,i) to zero? I was thinking of the following code:
logits_aa = tf.matmul(hidden1, hidden1_large, transpose_b=True) / temperature
mask_inv = tf.one_hot(tf.range(batch_size), batch_size, on_value=0.0, off_value=1.0)
logits_aa = logits_aa * mask_inv
Thanks in advance
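A pure-Python sketch comparing the two masking strategies discussed above, assuming the masked logits are later fed into a softmax. The row of logits is made up, with position 0 playing the role of the self-similarity s_{i,i}.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits; row[0] stands for s_{i,i} / temperature.
row = [2.0, 0.5, -0.3, 1.0]
LARGE_NUM = 1e9

masked_subtract = [row[0] - LARGE_NUM] + row[1:]  # subtract a huge constant
masked_zero = [0.0] + row[1:]                      # set the logit to zero

p_subtract = softmax(masked_subtract)[0]
p_zero = softmax(masked_zero)[0]
print(p_subtract)  # probability mass left on the self-pair
print(p_zero)
```

A logit of zero is not "no contribution" inside a softmax, because exp(0) = 1. Subtracting a large constant drives the exponential to zero, which removes the self-pair from the denominator.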
How can I set up my own dataset for training, such as a dataset of JPG images?
Hi.
How did you choose the 1% of ImageNet data? For few-shot/semi-supervised learning, the choice of labeled images is important, so I'd like to ask how you chose the images out of the whole ImageNet dataset.
Thanks.
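A sketch of one plausible selection scheme, class-balanced random sampling, written in pure Python. Whether this matches the split actually used by the authors is an assumption. The function name class_balanced_subset, the seed, and the toy labels are all hypothetical.

```python
import random
from collections import defaultdict


def class_balanced_subset(labels, fraction, seed=0):
    """Pick roughly `fraction` of the indices from every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = max(1, round(len(indices) * fraction))  # at least one per class
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy data: 10 classes with 100 examples each.
labels = [i % 10 for i in range(1000)]
subset = class_balanced_subset(labels, fraction=0.1)
print(len(subset))
```

Fixing the seed keeps the labeled subset reproducible across runs, which matters when comparing semi-supervised results.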
Hey there, thanks for making this available, much appreciated!
Quick question: running pip install -r requirements.txt results in the following error: Could not find a version that satisfies the requirement tensorflow==1.15. Any ideas on what I should be doing differently?
Hello,
Thanks so much for making this code available! I am trying to use the pretrained SimCLR models to extract features from my own custom dataset. I have done this successfully with other models trained using standard methods ("successfully" meaning that the extracted features are effective at classifying my custom dataset). With SimCLR however the extracted features seem to be only slightly better than random. I am wondering if this is because the input statistics to SimCLR are different from those in most ImageNet-trained models. I normalize my data to the Imagenet mean/std of mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]. Should I be using different normalization for SimCLR?
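A NumPy sketch contrasting the two preprocessing conventions raised in the question: plain scaling of pixel values to [0, 1] versus ImageNet mean/std standardization. Which convention the released SimCLR checkpoints expect is not confirmed here. The working assumption to test is that they take inputs in [0, 1] without mean/std normalization. The helper names and the tiny image are made up.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def scale_to_unit_range(image_uint8):
    """Map uint8 pixels in [0, 255] to floats in [0, 1]."""
    return image_uint8.astype(np.float32) / 255.0


def imagenet_standardize(image_uint8):
    """Scale to [0, 1], then subtract the mean and divide by the std."""
    return (scale_to_unit_range(image_uint8) - IMAGENET_MEAN) / IMAGENET_STD


# Tiny made-up image containing both a black and a white pixel.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = 255

unit = scale_to_unit_range(image)
standardized = imagenet_standardize(image)
print(unit.min(), unit.max())
print(standardized.min(), standardized.max())
```

The standardized values fall well outside [0, 1], so feeding them to a model trained on the other convention shifts the input distribution, which could explain near-random features.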
First of all, thanks a lot for releasing this! It's a huge contribution for the whole industry.
I've explored the pretrained model you released and got it working. I have some small issues that could improve its usage:
serving_default signature, which I think is the standard.
inputs are (batch_size, 224, 224, 3), the default output tensor is the embeddings, and the logits output tensor is the output when training; I don't know about the other outputs.
Hello, thank you for open sourcing this interesting work.
As asked in the title, could you share the train/validation split ratio you used when tuning the hyper-parameters for the CIFAR-10 experiments?
On evaluating the released checkpoints, I observe that the accuracy is a bit below the reported numbers (around 1% less). For example, with ResNet50_1x, I got an accuracy of 68.2, whereas the reported number is 69.1. For ResNet50_2x, the resulting accuracy is 73.4, compared to the reported 74.2. The evaluation strategy is the following:
I wonder whether you use multiple random-crop or any other evaluation strategy in your results. Or whether the released checkpoints differ from the ones used in results for Table 6 in the paper?
Thanks.
I get an error when I run this command:
python run.py --mode=train_then_eval --train_mode=finetune \
--fine_tune_after_block=4 --zero_init_logits_layer=True \
--variable_schema='(?!global_step|(?:.*/|^)Momentum|head)' \
--global_bn=False --optimizer=momentum --learning_rate=0.1 --weight_decay=0.0 \
--train_epochs=2 --train_batch_size=512 --warmup_epochs=0 \
--dataset=cifar10 --image_size=32 --eval_split=test --resnet_depth=18 \
--checkpoint=/tmp/simclr_test --model_dir=/tmp/simclr_test_ft --use_tpu=False
(Note: I changed train_epochs to 2 so that it would run quickly.)
The error is:
ValueError: Shape of variable module/head_supervised/linear_layer/dense/kernel:0 ((512, 10)) doesn't match with shape of tensor head_supervised/linear_layer/dense/kernel ([128, 10]) from checkpoint reader.
I am using TensorFlow 1.15.2. While I was trying to figure out this error, I downloaded your old code prior to the update 5 days ago and found that it runs well. Could you please look into this issue? Thank you very much!
https://github.com/google-research/simclr/blob/master/model_util.py#L114
For the ResNet 1x with 128 projection dimensionality, may I ask if you are projecting from 2048=>2048=>128 in your paper?
Thank you!
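A NumPy sketch of the layer shapes named in the question (2048 => 2048 => 128), assuming a two-layer MLP with a ReLU in between. It only illustrates the dimensions and does not confirm which head the paper or the linked code uses. The weights and inputs are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

h = rng.normal(size=(4, 2048))             # made-up encoder outputs, batch of 4
w1 = rng.normal(size=(2048, 2048)) * 0.01  # hidden layer: 2048 -> 2048
w2 = rng.normal(size=(2048, 128)) * 0.01   # output layer: 2048 -> 128

hidden = np.maximum(h @ w1, 0.0)  # ReLU nonlinearity
z = hidden @ w2                   # projection fed to the contrastive loss

print(hidden.shape, z.shape)
```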
hi @chentingpc can you please post the instructions/ guidelines when the code is used on a custom dataset. Any tips for the specific usage of code would also be helpful. Thanks.
Hi
Thanks for releasing the code and the weights. Just a quick question: are the weights released here from before or after fine-tuning?
Best,
Andrew
Dear Ting,
Thank you for sharing and documenting your code base for https://arxiv.org/pdf/2002.05709.pdf
I am interested in analyzing the data you created for the results in Figure 5 of your paper. Any chance you have the script for this experiment available? I would like to sample a few views from each augmentation composition you explored in that table.
All the best,
Luis
Hi,
I am following the instructions in the README. I used the command for fine-tuning on CIFAR-10, and an error occurred, saying
"ValueError: Shape of variable base_model/batch_normalization_1/beta:0 ((64,)) doesn't match with shape of tensor base_model/batch_normalization_1/beta ([256]) from checkpoint reader."
Does anyone have an idea about this?
Thanks a lot!