
mikoto10032 / automaticweightedloss
534 stars · 5 watching · 78 forks · 9 KB

PyTorch implementation of "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics" and "Auxiliary Tasks in Multi-task Learning".

License: Apache License 2.0

Python 100.00%
multi-task multi-task-learning weigh-losses auxiliary-tasks pytorch deep-learning


automaticweightedloss's Issues

The AWL parameters do not update

Hi author:
I have two tasks, both using cross-entropy loss. During training I found that the AWL parameters do not update. Have you run into this problem?
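
A common cause of frozen AWL parameters (my assumption; the thread has no confirmed diagnosis) is that awl.parameters() was never passed to the optimizer, so gradients are computed but never applied. A minimal sketch, following the single-module pattern quoted later on this page, with model and awl as placeholder names:

    import torch.optim as optim

    # Register the AWL parameters as their own param group so they actually
    # get stepped; weight decay is disabled for them, since decaying the
    # loss weights would bias them toward zero.
    optimizer = optim.Adam([
        {'params': model.parameters()},
        {'params': awl.parameters(), 'weight_decay': 0},
    ])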

On a TensorFlow realization

Hello, is there any specific implementation of this on TensorFlow, or any annotated explanation? We want to implement it in our own model. Thank you.
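
There is no official TensorFlow port in this thread; the following is a rough sketch of the same weighting scheme in TensorFlow 2, written by analogy with the PyTorch line quoted in the next issue (untested, and the class name is my own):

    import tensorflow as tf

    class AutomaticWeightedLossTF(tf.keras.layers.Layer):
        """Uncertainty-based loss weighting, ported by analogy (not official code)."""

        def __init__(self, num=2, **kwargs):
            super().__init__(**kwargs)
            # One learnable sigma per task, initialized to 1.
            self.sigmas = self.add_weight(
                name="sigmas", shape=(num,),
                initializer=tf.keras.initializers.Ones(), trainable=True)

        def call(self, losses):
            total = 0.0
            for i, loss in enumerate(losses):
                total += (0.5 / tf.square(self.sigmas[i])) * loss \
                         + tf.math.log(1.0 + tf.square(self.sigmas[i]))
            return total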

Implementation of the loss function from the paper

Hey everyone,

First of all, thanks for your implementation. The following formula from the paper "Auxiliary Tasks in Multi-task Learning":

$$\mathcal{L} = \sum_{i} \frac{1}{2\sigma_i^2}\,\mathcal{L}_i + \ln\!\left(1 + \sigma_i^2\right)$$

was implemented by:

        loss_sum += 0.5 / (self.params[i] ** 2) * loss + torch.log(1 + self.params[i] ** 2)

but IMO this is not exactly true, since 0.5 / sigma^2 is not the same as 1/(2*sigma^2). Am I right, or am I overlooking something here?

Thanks for the feedback.
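
For reference (my note; the thread itself has no reply): the two expressions are algebraically identical, so the code does match the formula:

$$\frac{0.5}{\sigma_i^2} = \frac{1}{2} \cdot \frac{1}{\sigma_i^2} = \frac{1}{2\sigma_i^2}$$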

Wonderful work~

I am going to implement it in my work and hope it brings an effective improvement.

Learning rate settings

I set up multiple optimizers to optimize different tasks. How should I choose the learning rate and the optimizer for the AWL parameters? Thank you!

Hello, I ran into this problem when using your code

TypeError: optimizer can only optimize Tensors, but one of the params is list

My optimizer code is:

    def get_optimizer(self):
        lr = opt.lr
        params = []
        # As I understand it: bias parameters get twice the learning rate and
        # no weight decay (i.e. no penalty term), while non-bias parameters
        # keep the base learning rate and do get weight decay.
        for key, value in dict(self.named_parameters()).items():
            if value.requires_grad:
                if 'bias' in key:
                    params += [{'params': [value], 'lr': lr * 2, 'weight_decay': 0}]
                else:
                    params += [{'params': [value], 'lr': lr, 'weight_decay': opt.weight_decay}]
        if opt.use_adam:
            self.optimizer = t.optim.Adam([params, {'params': self.awl.parameters(), 'weight_decay': 0}])
        else:
            self.optimizer = t.optim.SGD([params, {'params': self.awl.parameters(), 'weight_decay': 0}], momentum=0.9)
        return self.optimizer

May I ask whether there is a problem with passing the parameters this way? If so, do you have a suggested fix?
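
The error message points at the nesting: params is already a list of param-group dicts, and wrapping it inside another list hands the optimizer a list where it expects a dict or a tensor. A sketch of a fix under that reading (my assumption; there is no maintainer reply in this thread):

    # Append the AWL group to the existing list of param groups instead of
    # nesting the whole list inside another list.
    params.append({'params': self.awl.parameters(), 'weight_decay': 0})
    if opt.use_adam:
        self.optimizer = t.optim.Adam(params, lr=lr)
    else:
        self.optimizer = t.optim.SGD(params, lr=lr, momentum=0.9)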

Questions about implementation

Hi, thanks for sharing the code. It's really helpful to me. I have two questions.

  1. In another implementation, sigma^2 is used as the parameter. I tried learning both sigma and sigma^2; they show close but different performance. Do you think this implementation difference may have a significant impact?

  2. I'm using a hinge loss with uncertainty. For some batches the loss value may be zero, and in that case the params have a chance to become zero or very small. Do you have any suggestions on this?
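
Regarding question 1, a common alternative is to learn s = log(sigma^2) instead of sigma itself. A minimal sketch of that parameterization (my illustration, using the log(sigma^2)/2 regularizer from Kendall et al. rather than this repo's log(1 + sigma^2)):

    import torch
    import torch.nn as nn

    class LogVarWeightedLoss(nn.Module):
        """Learns s_i = log(sigma_i^2) per task (illustrative variant,
        not this repository's implementation)."""

        def __init__(self, num=2):
            super().__init__()
            self.log_vars = nn.Parameter(torch.zeros(num))

        def forward(self, *losses):
            total = 0.0
            for i, loss in enumerate(losses):
                precision = torch.exp(-self.log_vars[i])  # 1 / sigma_i^2
                total += 0.5 * precision * loss + 0.5 * self.log_vars[i]
            return total

Because exp(-s) is strictly positive for any finite s, this form also sidesteps the concern in question 2 about the learned weights collapsing to zero.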

How to set the parameter list with multiple optimizers?

People usually use multiple modules and optimizers in GAN models, for example:

    moduleA = Generator()
    moduleB = Discriminator()
    moduleC = Predictor()

so the corresponding optimizers are:

    optG = optim.Adam(moduleA.parameters(), ...)
    optD = optim.Adam(moduleB.parameters(), ...)
    optP = optim.Adam(moduleC.parameters(), ...)

For a single module, the README example shows:

    model = Model()
    optimizer = optim.Adam([
        {'params': model.parameters()},
        {'params': awl.parameters(), 'weight_decay': 0}
    ])

For the multiple modules above, how should the parameters be set in the optimizers? I can guess two options, but they might be wrong.

Option 1:

    optG = optim.Adam(list(moduleA.parameters()), ...)
    optD = optim.Adam(list(moduleB.parameters()), ...)
    optP = optim.Adam(list(moduleC.parameters()) + list(awl.parameters()), ...)

Option 2:

    optG = optim.Adam(list(moduleA.parameters()) + list(awl.parameters()), ...)
    optD = optim.Adam(list(moduleB.parameters()) + list(awl.parameters()), ...)
    optP = optim.Adam(list(moduleC.parameters()) + list(awl.parameters()), ...)

@Mikoto10032
Which one is correct?

Thanks!
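
One consideration that may settle this (my reasoning, not a maintainer reply): awl's parameters only receive gradients from the backward pass of the combined weighted loss, so any optimizer that steps after that backward call will update them. Note that option 2 hands the same parameters to three Adam instances, each with its own momentum state, so they would be updated three times per iteration. Reusing the names from the snippets above, a dedicated optimizer is one clean sketch that avoids choosing between the two options:

    # Hypothetical pattern: give the AWL parameters their own optimizer.
    optAWL = optim.Adam(awl.parameters(), weight_decay=0)

    # loss_g, loss_d, loss_p are placeholders for the three task losses.
    loss = awl(loss_g, loss_d, loss_p)
    loss.backward()
    optG.step(); optD.step(); optP.step(); optAWL.step()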

Why avoid the loss becoming negative?

Thanks for your work; I have a question.

Why can't the loss be negative? It seems to me that the value of the loss does not affect the training of the network.

As an example, say my loss is the cross-entropy loss, which lies in (0, 1) most of the time, and the optimization goal is to minimize it.

Now suppose I add a constant of -100 to the loss, i.e. loss = loss - 100. The loss will lie in (-100, -99), and the optimization goal remains the same: reduce the loss.

The way to reduce the loss is gradient descent, and obviously the constant -100 does not affect the gradients of the network parameters. So the loss value itself does not seem to affect the training process; what matters is its gradient with respect to the network parameters.

Back to the original question: why is it necessary to avoid negative losses?
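
One relevant observation (my note; the asker is right that a constant shift leaves gradients unchanged): the +1 inside this repository's regularizer is not a constant shift. Replacing ln(sigma^2) with ln(1 + sigma^2) changes the gradient with respect to sigma:

$$\frac{\partial}{\partial \sigma} \ln \sigma^2 = \frac{2}{\sigma}, \qquad \frac{\partial}{\partial \sigma} \ln\!\left(1 + \sigma^2\right) = \frac{2\sigma}{1 + \sigma^2}$$

The second form is also non-negative, so each weighted term stays non-negative whenever the task loss does, and its gradient stays bounded as sigma approaches zero.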

How to use it for 3 regression tasks?

Hi, awesome work!

I'm doing multi-task learning in PyTorch Geometric with this code:

[screenshot of the training code omitted]

How do I use AutomaticWeightedLoss for 3 tasks? Is it possible? Thanks, Sam
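
A minimal sketch, assuming (per the repository README usage) that the constructor takes the number of tasks and the forward pass accepts one loss per task; the import path may need adjusting to your layout:

    from AutomaticWeightedLoss import AutomaticWeightedLoss

    awl = AutomaticWeightedLoss(3)       # one learnable weight per task
    # loss1, loss2, loss3 are placeholders for the three regression losses
    loss_sum = awl(loss1, loss2, loss3)
    loss_sum.backward()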
