artist-group-factors's People

Contributors

falkaer

artist-group-factors's Issues

Clarification: GradNorm implementation

Dear authors,
thanks for your great work! I am going through your implementation of GradNorm and reusing your code for another task (optical flow estimation).

Is there any workaround to avoid using retain_graph=True when calling backward?
Why do we also need to retain the graph when calculating the partial derivatives?

    # compute and retain gradients
    total_weighted_loss.backward(retain_graph=True)
    
    # GRADNORM - learn the weights for each tasks gradients
    
    # zero the w_i(t) gradients since we want to update the weights using gradnorm loss
    self.weights.grad = 0.0 * self.weights.grad
    
    W = list(self.model.mtn.shared_block.parameters())
    norms = []
    
    for w_i, L_i in zip(self.weights, task_losses):
        # gradient of L_i(t) w.r.t. W
        gLgW = torch.autograd.grad(L_i, W, retain_graph=True)
        
        # G^{(i)}_W(t)
        norms.append(torch.norm(w_i * gLgW[0]))
    
    norms = torch.stack(norms)

This leads to an out-of-memory error that I have not been able to avoid. Did you face a similar problem?

Thanks,
Stefano

GradNorm retain_graph OOM

Hi author,
I used your code, which includes the GradNorm part, as a reference and rewrote it to train my own transformer-based model.
Everything works at first, but as the number of iterations grows, a "CUDA out of memory" error occurs.
Have you encountered the same error during training?
I think it is caused by retain_graph in

    loss.backward(retain_graph=True)

and

    gygw = torch.autograd.grad(task_losses[k], W.parameters(), retain_graph=True)

Am I right?
And is there any way to avoid this error as the number of iterations grows?

Thank you for your nice code :)
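
One possible way to reduce the memory pressure raised in both issues, offered as a minimal sketch and not as the repository's actual implementation: take the per-task gradients first, then make the model's backward pass the last use of the forward graph so it can run without retain_graph=True. The sketch assumes the task weights are positive, so that the norm of w_i * grad equals w_i times the norm of grad, and it treats the GradNorm target as a constant. The toy model (`shared`, `heads`), the helper `gradnorm_step`, the value of `alpha`, and the placeholder `initial_losses` are all hypothetical names introduced for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model: one shared layer feeding two task heads.
shared = nn.Linear(8, 16)
heads = nn.ModuleList([nn.Linear(16, 1), nn.Linear(16, 1)])
weights = nn.Parameter(torch.ones(2))  # task weights w_i(t)
alpha = 1.5                            # assumed GradNorm asymmetry hyperparameter


def gradnorm_step(x, targets, initial_losses):
    feats = torch.relu(shared(x))
    task_losses = [nn.functional.mse_loss(h(feats), t)
                   for h, t in zip(heads, targets)]

    # Per-task gradient norms w.r.t. the shared weight matrix. The graph is
    # retained here only because it is still needed for the backward below.
    W = [shared.weight]
    raw_norms = []
    for L_i in task_losses:
        (g,) = torch.autograd.grad(L_i, W, retain_graph=True)
        raw_norms.append(g.norm())      # plain tensor, carries no graph
    raw_norms = torch.stack(raw_norms)

    # G_i = w_i * ||grad_W L_i||, differentiable w.r.t. the weights only.
    norms = weights * raw_norms

    with torch.no_grad():               # the target is treated as a constant
        losses = torch.stack([L.detach() for L in task_losses])
        ratios = losses / initial_losses
        inverse_rate = ratios / ratios.mean()
        target = norms.mean() * inverse_rate ** alpha
    gradnorm_loss = (norms - target).abs().sum()

    # Model update: detached weights keep this loss out of weights.grad.
    # This is the last use of the forward graph, so it is freed here.
    total = sum(w * L for w, L in zip(weights.detach(), task_losses))
    total.backward()

    # The GradNorm graph only involves `weights`, so it is tiny.
    gradnorm_loss.backward()

    # Return Python floats so nothing holds on to the graph across iterations.
    return [L.item() for L in task_losses]


x = torch.randn(4, 8)
targets = [torch.randn(4, 1), torch.randn(4, 1)]
initial_losses = torch.tensor([1.0, 1.0])  # placeholder for L_i(0)
losses = gradnorm_step(x, targets, initial_losses)
```

Within a single iteration, retaining the graph for the per-task gradient calls is hard to avoid, because each call walks the same forward graph. Memory that keeps growing over many iterations more often comes from keeping references to tensors that still carry a graph, for example storing L_i(0) or a running loss without `.detach()` or `.item()`, so that is worth checking before blaming retain_graph itself. The optimizer step and the renormalization of the weights are left out of the sketch.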
