gt-ripl / l2c
Learning to Cluster. A deep clustering strategy.
License: MIT License
Hi, I have two questions about the function G:
1. Can you please tell me how you converted $[s_{ij}=0]\,P(Y_i \mid x_i, \theta)\,P(Y_j \mid x_j, \theta)$ into $(1-s_{ij})\log\left(1 - f(x_i;\theta)^{\top} f(x_j;\theta)\right)$? It looks as if $P(Y_i \mid x_i, \theta)\,P(Y_j \mid x_j, \theta) \to 1 - f(x_i;\theta)^{\top} f(x_j;\theta)$ for no stated reason.
2. It seems to conflict with the theory when the two classes are different. Your loss function takes $s_{ij}=1$ as the positive case, i.e., when two samples have the same prediction, so the higher the overlap between predictions, the lower the loss should be. In your case, the lower the overlap, the lower the loss.
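For context, here is how I read the loss term, as a minimal sketch (the names `mcl_loss`, `logits_i`, `logits_j` are mine, not from this repo): the similarity estimate is $\hat{s}_{ij} = f(x_i;\theta)^{\top} f(x_j;\theta)$, the inner product of the two softmax outputs, and the loss is a binary cross-entropy against $s_{ij}$.

```python
import torch
import torch.nn.functional as F

def mcl_loss(logits_i, logits_j, s_ij, eps=1e-7):
    # Sketch of the pairwise loss under discussion (assumed form, not the
    # repo's exact code). s_ij is 1.0 for a similar pair, 0.0 for dissimilar.
    p_i = F.softmax(logits_i, dim=1)  # P(Y_i | x_i, theta)
    p_j = F.softmax(logits_j, dim=1)  # P(Y_j | x_j, theta)
    # Similarity estimate: inner product of the class-posterior vectors.
    s_hat = (p_i * p_j).sum(dim=1).clamp(eps, 1 - eps)
    # Binary cross-entropy: s_ij = 1 pulls s_hat toward 1 (high overlap),
    # s_ij = 0 pushes s_hat toward 0 (low overlap).
    return -(s_ij * torch.log(s_hat)
             + (1 - s_ij) * torch.log(1 - s_hat)).mean()
```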
Are you going to add the implementation for your instance segmentation approach to this repo, or can I find it somewhere else?
What kind of upsampling did you use?
Thanks for your excellent work, but could you please share your similarity model for ImageNet?
In my experiments, the two-linear-layer SimilarityNet converges quickly on Omniglot but very slowly on large-scale datasets.
For example, the F1-score of "similar" for the val and test categories saturates at around 50% after 1000 epochs:
The confusion matrices for the train, val, and test sets are below. Rows are the ground-truth classes, columns are the predicted classes (dissimilar, similar); PR = precision, RR = recall.
Train matrix:
[ Dissimilarity : 525795  95205 (621000 in all.) ]
[ Similarity    :   1728  67272 ( 69000 in all.) ]
Dissimilarity : PR: 99.7%, RR: 84.7%, F1: 91.6%.
Similarity    : PR: 41.4%, RR: 97.5%, F1: 58.1%.
Val matrix:
[ Dissimilarity : 207682  26318 (234000 in all.) ]
[ Similarity    :   5590  20410 ( 26000 in all.) ]
Dissimilarity : PR: 97.4%, RR: 88.8%, F1: 92.9%.
Similarity    : PR: 43.7%, RR: 78.5%, F1: 56.1%.
Test matrix:
[ Dissimilarity : 80454   9546 ( 90000 in all.) ]
[ Similarity    :  3535   6465 ( 10000 in all.) ]
Dissimilarity : PR: 95.8%, RR: 89.4%, F1: 92.5%.
Similarity    : PR: 40.4%, RR: 64.6%, F1: 49.7%.
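For reference, the PR/RR/F1 rows follow from each matrix as in this minimal sketch (`prf` is an illustrative helper, not code from this repo):

```python
import numpy as np

def prf(conf):
    # Per-class precision, recall, F1 from a 2x2 confusion matrix, where
    # conf[i, j] counts ground-truth class i predicted as class j and the
    # class order is [dissimilar, similar].
    precision = np.diag(conf) / conf.sum(axis=0)
    recall = np.diag(conf) / conf.sum(axis=1)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Reproducing the train-matrix numbers above:
train = np.array([[525795, 95205],
                  [  1728, 67272]])
p, r, f = prf(train)
# p ~ [0.997, 0.414], r ~ [0.847, 0.975], f ~ [0.916, 0.581]
```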
Therefore, a more powerful model may be needed to fit the similarity on large-scale datasets.
It would be helpful if you could share your SimilarityNet for ImageNet, or some tricks for obtaining a good SimilarityNet on large-scale datasets.
Thank you.
Hi, I have a question about training function G. Because of the pair-enumeration layer, there are always many more dissimilar pairs per batch than similar pairs. I can see that this works on Omniglot, but on more complicated datasets like ImageNet, wouldn't this imbalance be a problem? Are different hyperparameters used to train G for ImageNet?
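To make the imbalance concrete, here is a minimal sketch of pair enumeration as I understand it (`enumerate_pairs`, `features`, and `labels` are illustrative names, not this repo's layer):

```python
import torch

def enumerate_pairs(features, labels):
    # Form all ordered pairs within a batch and derive the binary target
    # s_ij = 1 iff label_i == label_j (a sketch of the idea only).
    n = labels.size(0)
    idx = torch.arange(n)
    idx_i = idx.repeat_interleave(n)  # 0,0,...,0, 1,1,...,1, ...
    idx_j = idx.repeat(n)             # 0,1,...,n-1, 0,1,...,n-1, ...
    s = (labels[idx_i] == labels[idx_j]).float()
    return features[idx_i], features[idx_j], s

# With C roughly balanced classes in a batch, only about a 1/C fraction of
# the n*n pairs gets s_ij = 1, so dissimilar pairs dominate.
```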
Hello there
Have you also implemented BatchKLdivCriterion.lua from the original code in this repo?
In the ICLR 2019 paper, in the paragraph below Eq. 1, you say that marginalizing over Y is intractable and that the Y_i depend on each other, so the additional independence assumption is introduced.
Why is the computation intractable, and how do the Y_i depend on each other?
Eq. 2 is somewhat intuitive, but I cannot figure out why it is an approximation.
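For context, my reading of the marginalization below Eq. 1 (the notation here is mine, not necessarily the paper's):

$$P(S \mid X; \theta) = \sum_{Y_1=1}^{k} \cdots \sum_{Y_n=1}^{k} P(S \mid Y) \prod_{i=1}^{n} P(Y_i \mid x_i; \theta)$$

The sum runs over all $k^n$ joint assignments, and each pairwise label $s_{ij}$ couples $Y_i$ and $Y_j$ through $P(S \mid Y)$, so it does not factor over samples. I assume this is the intractability and dependence being referred to, but please correct me if I have misread Eq. 1.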
Hi, running `python demo.py` gives me the following error. Any idea how to fix it?
==== Epoch:0 ====
/projects/anaconda3/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:82: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
"https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
LR: 0.001
Itr |Batch time |Data Time |Loss
Traceback (most recent call last):
File "demo.py", line 295, in <module>
run(get_args(sys.argv[1:]))
File "demo.py", line 236, in run
train(epoch, train_loader, learner, args)
File "demo.py", line 76, in train
confusion.add(output, eval_target)
File "/projects/unsupervised/05_test_l2c/L2C/utils/metric.py", line 69, in add
output = output.squeeze_()
RuntimeError: set_storage_offset is not allowed on a Tensor created from .data or .detach().
If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset)
without autograd tracking the change, remove the .data / .detach() call and wrap the change in a `with torch.no_grad():` block.
For example, change:
x.data.set_(y)
to:
with torch.no_grad():
x.set_(y)
> pip freeze | grep torch
torch==1.2.0
torchvision==0.4.0a0+6b959ee
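For what it's worth, the failure comes from the in-place `squeeze_()` at the bottom of the traceback; here is a minimal sketch of a possible local change in `utils/metric.py`, based only on the traceback above and not tested against this repo:

```python
# utils/metric.py, in add() (around line 69 per the traceback).
# squeeze_() mutates the tensor's metadata in place, which torch >= 1.2
# forbids on a tensor created from .data or .detach().

# Before (raises the RuntimeError):
#     output = output.squeeze_()

# After: the out-of-place version returns a squeezed view instead of
# mutating the original tensor's metadata.
output = output.squeeze()
```

Alternatively, as the error message suggests, one could remove the upstream `.data` / `.detach()` call and wrap the in-place change in a `with torch.no_grad():` block.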
I see you've mentioned your 2019 arXiv paper in your README, but could you also add a link to it? The link is https://arxiv.org/abs/1901.00544. Thanks.