eladhoffer / utils.pytorch
Utilities for Pytorch
License: MIT License
Very nice utils!
It'd be cool if you provided examples in action for some/all of them.
I am particularly interested in the cross entropy with label smoothing.
Thanks!
Hi, I noticed you recently changed cross_entropy.py to account for ignore_index, but I think the new version might still be missing a step at the end. In cross_entropy.py, when you calculate the smoothed cross entropy as the sum of the CE and KL terms, averaged over the number of tokens in the batch, shouldn't you subtract the padding tokens from the number of tokens in the denominator before you average?
if reduce:
    kl = kl.mean() if size_average else kl.sum()
    # for label smoothing with parameter eps:
    if onehot_smoothing:
        entropy = -(math.log(1 - smooth_eps) + smooth_eps *
                    math.log(smooth_eps / ((num_classes - 1) * (1 - smooth_eps))))
    else:
        entropy = -(target * target.log()).sum()
    if size_average:
        kl *= num_classes
        entropy /= logits.size(0)
Here, when you divide by logits.size(0), I think logits.size(0) = batch_size * sequence_length, which includes the padding tokens in the total count. Shouldn't it be something like
num_tokens = targets.ne(ignore_idx).sum()
...
kl = kl.sum() / num_tokens
...
entropy /= num_tokens
Maybe with some epsilon introduced to make sure you don't divide by zero? Apologies if I'm wrong about this.
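To make the suggestion above concrete, here is a minimal sketch of averaging over non-padding tokens only. It is not the repository's code: `masked_token_mean` is a hypothetical helper, the per-token loss is plain `F.cross_entropy` with `reduction="none"` rather than the smoothed CE + KL term, and `-100` is assumed as the ignore index. Clamping the token count to at least 1 is used here in place of an epsilon to avoid dividing by zero.

```python
import torch
import torch.nn.functional as F


def masked_token_mean(per_token_loss, target, ignore_index=-100):
    """Average per-token losses over non-ignored positions only (hypothetical helper)."""
    mask = target.ne(ignore_index)
    # Clamp so a batch made entirely of padding does not divide by zero.
    num_tokens = mask.sum().clamp(min=1)
    return (per_token_loss * mask).sum() / num_tokens


torch.manual_seed(0)
logits = torch.randn(6, 5)                        # 6 flattened tokens, 5 classes
target = torch.tensor([1, 0, 4, -100, -100, 2])   # two padding positions

# Ignored positions contribute a loss of 0 when reduction="none".
per_token = F.cross_entropy(logits, target, ignore_index=-100, reduction="none")

naive = per_token.sum() / logits.size(0)          # denominator counts padding tokens
masked = masked_token_mean(per_token, target)     # denominator counts real tokens only
reference = F.cross_entropy(logits, target, ignore_index=-100, reduction="mean")
```

In this sketch `masked` agrees with PyTorch's own `reduction="mean"`, which also excludes ignored targets from the denominator, while `naive` is smaller by the factor 4/6 because two of the six positions are padding.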
Thanks for writing this code. I am using quantize.py to quantize my model, and I ran into the following issue:
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    from seq2seq.tools.utils.log import setup_logging
  File "/home/demobin/github/seq2seq.pytorch/seq2seq/tools/utils/log.py", line 9, in <module>
    from bokeh.plotting.helpers import DEFAULT_PALETTE
ImportError: cannot import name DEFAULT_PALETTE
Possible bug in LabelSmoothing for Binary Cross Entropy.
Current code:
smooth_eps = smooth_eps or 0
if smooth_eps > 0:
    target = target.float()
    target.add_(smooth_eps).div_(2.)
Shouldn't it be:
smooth_eps = smooth_eps or 0
if smooth_eps > 0:
    target = target.float() * (1 - smooth_eps)
    target = target + (smooth_eps / 2)
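A quick way to compare the two versions is to apply each mapping to the two possible binary targets. The sketch below uses plain Python floats instead of tensors, and the function names `smooth_current` and `smooth_proposed` are made up for illustration; they only mirror the arithmetic of the two snippets above.

```python
def smooth_current(target, eps):
    """Same arithmetic as target.add_(eps).div_(2.) in the current code."""
    return (target + eps) / 2.0


def smooth_proposed(target, eps):
    """Same arithmetic as target * (1 - eps) + eps / 2 in the proposed fix."""
    return target * (1.0 - eps) + eps / 2.0


eps = 0.1
for t in (0.0, 1.0):
    print(f"target={t}: current={smooth_current(t, eps):.3f}, "
          f"proposed={smooth_proposed(t, eps):.3f}")
```

Both versions map a target of 0 to eps / 2. They differ on a target of 1: the current arithmetic gives (1 + eps) / 2, which stays close to 0.5 for small eps, while the proposed version gives 1 - eps / 2, so the smoothed targets are symmetric around 0.5.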
Hi, I've been going over your implementation of label smoothing for cross-entropy, and I don't understand why, in this code in cross_entropy.py:
eps_sum = smooth_eps / num_classes
eps_nll = 1. - eps_sum - smooth_eps
likelihood = lsm.gather(dim=-1, index=target.unsqueeze(-1)).squeeze(-1)
loss = -(eps_nll * likelihood + eps_sum * lsm.sum(-1))
you have eps_nll = 1. - eps_sum - smooth_eps instead of just eps_nll = 1. - smooth_eps. Doesn't eps_nll = 1. - eps_sum - smooth_eps introduce an extra term in the loss that shouldn't be there? Going by the paper, sum_k q(k) log p(k) is likelihood in the above snippet and sum_k log p(k) is lsm.sum(-1). The label-smoothed loss, for uniform u(k) = 1/K with K = num_classes, should be
-[(1 - smooth_eps) * sum_k q(k) log p(k) + (smooth_eps / K) * sum_k log p(k)],
so shouldn't it be
loss = -((1 - smooth_eps) * likelihood + eps_sum * lsm.sum(-1))
?
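The question can be checked numerically. The sketch below is pure Python rather than the repository's tensor code, and it assumes the smoothed target is q'(k) = (1 - eps) * onehot(k) + eps / K, i.e. the uniform part is spread over all K classes. All function names are hypothetical. It compares the explicit cross entropy against q' with the closed form from the snippet, once with eps_nll = 1 - eps and once with eps_nll = 1 - eps_sum - eps.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_ce_explicit(logits, target, eps):
    """Cross entropy against q'(k) = (1 - eps) * onehot(k) + eps / K (assumed convention)."""
    lsm = log_softmax(logits)
    k = len(logits)
    q = [eps / k + (1.0 - eps if i == target else 0.0) for i in range(k)]
    return -sum(qi * li for qi, li in zip(q, lsm))


def smoothed_ce_closed_form(logits, target, eps, eps_nll):
    """Closed form from the snippet: -(eps_nll * likelihood + eps_sum * lsm.sum())."""
    lsm = log_softmax(logits)
    eps_sum = eps / len(logits)
    return -(eps_nll * lsm[target] + eps_sum * sum(lsm))


logits = [2.0, -1.0, 0.5, 0.1]
target, eps = 0, 0.1
k = len(logits)

explicit = smoothed_ce_explicit(logits, target, eps)
proposed = smoothed_ce_closed_form(logits, target, eps, eps_nll=1.0 - eps)
current = smoothed_ce_closed_form(logits, target, eps, eps_nll=1.0 - eps / k - eps)
```

Under that assumption, `proposed` matches `explicit`, while `current` differs from it by (eps / K) * likelihood. With eps_nll = 1 - eps_sum - eps the target weights sum to 1 - eps / K rather than 1. A different convention, such as spreading eps over only the K - 1 non-target classes, would lead to a different closed form, so this check speaks only to the uniform-over-K reading.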
Hi Elad, how's it going?
I am trying to use the smooth cross entropy, but I am not sure what exactly the expected input is. Could you add some rough documentation or an example, please?
Thanks,
Dan