utils.pytorch's People

Contributors

eladhoffer, nadavbh12, scottclowe, tbennun

utils.pytorch's Issues

Usage examples

Very nice utils!

It'd be cool if you provided usage examples for some/all of them.

I am particularly interested in the cross entropy with label smoothing.

Thanks!
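
For what it's worth, a minimal self-contained sketch of label-smoothed cross entropy in plain PyTorch is below. It is not necessarily this repo's exact API (that is the thing being asked about); the function name and defaults are assumptions for illustration only.

import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, target, smooth_eps=0.1):
    # cross entropy against a uniformly smoothed target distribution:
    # q'(k) = (1 - eps) * onehot(k) + eps / K
    lsm = F.log_softmax(logits, dim=-1)
    nll = -lsm.gather(dim=-1, index=target.unsqueeze(-1)).squeeze(-1)
    uniform = -lsm.mean(dim=-1)          # = -1/K * sum_k log p(k)
    loss = (1.0 - smooth_eps) * nll + smooth_eps * uniform
    return loss.mean()

logits = torch.randn(8, 5)               # batch of 8, 5 classes
target = torch.randint(0, 5, (8,))
print(smoothed_cross_entropy(logits, target, smooth_eps=0.1))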

smoothed cross entropy loss

Hi, I noticed you recently changed cross_entropy.py to account for ignore_index, but I think the new version might still be missing a step at the end. In cross_entropy.py, when you calculate the smoothed cross entropy as the sum of the CE and KL terms, averaged over the number of tokens in the batch, shouldn't you subtract the padding tokens from the token count in the denominator before you average?

if reduce:
    kl = kl.mean() if size_average else kl.sum()

# for label smoothing with parameter eps:
if onehot_smoothing:
    entropy = -(math.log(1 - smooth_eps) + smooth_eps *
                math.log(smooth_eps / ((num_classes - 1) * (1 - smooth_eps))))
else:
    entropy = -(target * target.log()).sum()

if size_average:
    kl *= num_classes
    entropy /= logits.size(0)

Here, when you divide by logits.size(0), I think logits.size(0) = batch_size * sequence_length, which includes padding tokens in the total count. Shouldn't it be something like:

num_tokens = targets.ne(ignore_idx).sum()
...
kl = kl.sum() / num_tokens
...
entropy /= num_tokens 

Maybe with some epsilon added to make sure you don't divide by zero? Apologies if I'm wrong about this.
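
Roughly, something along these lines is what I have in mind - a sketch only, not the repo's actual code, with ignore_idx being the padding label:

import torch
import torch.nn.functional as F

def smoothed_ce_ignore_pad(logits, targets, smooth_eps=0.1, ignore_idx=0):
    # logits: (num_tokens, num_classes), targets: (num_tokens,)
    lsm = F.log_softmax(logits, dim=-1)
    pad_mask = targets.eq(ignore_idx)
    safe_targets = targets.masked_fill(pad_mask, 0)   # any valid class index
    nll = -lsm.gather(-1, safe_targets.unsqueeze(-1)).squeeze(-1)
    uniform = -lsm.mean(dim=-1)
    per_token = (1.0 - smooth_eps) * nll + smooth_eps * uniform
    per_token = per_token.masked_fill(pad_mask, 0.0)  # drop padding positions
    num_tokens = (~pad_mask).sum().clamp(min=1)       # guard against all-pad batches
    return per_token.sum() / num_tokens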

Several Issues about the quantize.py

Thanks for writing this code; I am using quantize.py to quantize my model, and I ran into the following issues:

  1. model.register_buffer(n, p): the string n cannot include '.', so I replaced every '.' with '_' and the problem was solved.
  2. q_x.scale * (q_x.tensor.float() - q_x.zero_point): maybe the ByteTensor should be converted to a float tensor before this operation?
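
For illustration, a rough sketch of both workarounds; the QTensor namedtuple and the helper names below are hypothetical, not quantize.py's actual API:

import torch
import torch.nn as nn
from collections import namedtuple

# hypothetical container, stands in for whatever quantize.py uses
QTensor = namedtuple('QTensor', ['tensor', 'scale', 'zero_point'])

def register_quantized(module, name, q_x):
    # register_buffer rejects names containing '.', so replace them with '_'
    module.register_buffer(name.replace('.', '_'), q_x.tensor)

def dequantize(q_x):
    # cast the stored byte tensor to float before applying scale / zero-point
    return q_x.scale * (q_x.tensor.float() - q_x.zero_point)

m = nn.Linear(4, 4)
q_w = QTensor(tensor=torch.randint(0, 256, (4, 4), dtype=torch.uint8),
              scale=0.1, zero_point=128)
register_quantized(m, 'layer1.weight_q', q_w)   # stored as 'layer1_weight_q'
print(dequantize(q_w))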

ImportError: cannot import name DEFAULT_PALETTE

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    from seq2seq.tools.utils.log import setup_logging
  File "/home/demobin/github/seq2seq.pytorch/seq2seq/tools/utils/log.py", line 9, in <module>
    from bokeh.plotting.helpers import DEFAULT_PALETTE
ImportError: cannot import name DEFAULT_PALETTE
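
One possible workaround is to guard the import and fall back to a palette from bokeh.palettes; newer bokeh versions removed bokeh.plotting.helpers, and the choice of Spectral6 below is an assumption, not necessarily what log.py originally used:

try:
    from bokeh.plotting.helpers import DEFAULT_PALETTE   # older bokeh
except ImportError:
    try:
        from bokeh.palettes import Spectral6 as DEFAULT_PALETTE   # newer bokeh
    except ImportError:
        # last resort: hardcode a small list of hex colors
        DEFAULT_PALETTE = ['#1f77b4', '#ff7f0e', '#2ca02c',
                           '#d62728', '#9467bd', '#8c564b']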

Possible Bug in LabelSmoothing for Binary Cross Entropy

Current Code:

smooth_eps = smooth_eps or 0
if smooth_eps > 0:
    target = target.float()
    target.add_(smooth_eps).div_(2.)

Shouldn't it be:

smooth_eps = smooth_eps or 0
if smooth_eps > 0:
    target = target.float() * (1 - smooth_eps)
    target = target + (smooth_eps / 2)
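
A quick numeric check of the two variants with smooth_eps = 0.1: the current code maps targets {0, 1} to {0.05, 0.55}, while the proposed fix maps them to {0.05, 0.95}, i.e. it smooths symmetrically by smooth_eps/2 on each side:

import torch

smooth_eps = 0.1
target = torch.tensor([0.0, 1.0])

current = (target + smooth_eps) / 2.0                  # tensor([0.0500, 0.5500])
proposed = target * (1 - smooth_eps) + smooth_eps / 2  # tensor([0.0500, 0.9500])
print(current, proposed)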

Question about label smoothing implementation

Hi, I've been going over your implementation of label smoothing for cross-entropy, and I don't understand why, in this code in cross_entropy.py:

        eps_sum = smooth_eps / num_classes
        eps_nll = 1. - eps_sum - smooth_eps
        likelihood = lsm.gather(dim=-1, index=target.unsqueeze(-1)).squeeze(-1)
        loss = -(eps_nll * likelihood + eps_sum * lsm.sum(-1))

you have eps_nll = 1. - eps_sum - smooth_eps instead of just eps_nll = 1. - smooth_eps. Doesn't eps_nll = 1. - eps_sum - smooth_eps introduce an extra term in the loss that shouldn't be there? Going by the paper,

sum_k q(k) log p(k),

is likelihood in the above snippet and

sum_k log p(k),

is lsm.sum(-1). The label-smoothed loss, for uniform u(k), should be

  -(1 - epsilon) sum_k q(k) log p(k) - (epsilon/K) sum_k log p(k),

so shouldn't it be

loss = -((1 - smooth_eps) * likelihood + eps_sum * lsm.sum(-1))?
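
A small runnable comparison of the two variants against the loss computed directly from the smoothed distribution q'(k) = (1 - epsilon) * onehot(k) + epsilon/K (illustrative only, not the repo's test code):

import torch
import torch.nn.functional as F

torch.manual_seed(0)
smooth_eps, num_classes = 0.1, 5
logits = torch.randn(4, num_classes)
target = torch.randint(0, num_classes, (4,))

lsm = F.log_softmax(logits, dim=-1)
likelihood = lsm.gather(-1, target.unsqueeze(-1)).squeeze(-1)
eps_sum = smooth_eps / num_classes

loss_repo = -((1. - eps_sum - smooth_eps) * likelihood + eps_sum * lsm.sum(-1))
loss_issue = -((1. - smooth_eps) * likelihood + eps_sum * lsm.sum(-1))

# loss computed directly from the smoothed target distribution q'
q = torch.full((4, num_classes), eps_sum)
q.scatter_(1, target.unsqueeze(-1), 1. - smooth_eps + eps_sum)
loss_direct = -(q * lsm).sum(-1)

print(loss_repo.mean(), loss_issue.mean(), loss_direct.mean())
# loss_issue matches loss_direct; loss_repo = loss_issue + eps_sum * likelihood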

tests / instructions

Hi Elad - what's up?
I am trying to use the smooth cross entropy, but I am not sure exactly what the expected input is. Could you add some rough documentation or an example, please?
Thanks,
Dan
