falkaer / artist-group-factors
PyTorch implementation of Artist Group Factors (arXiv:1805.02043) and GradNorm (arXiv:1711.02257)
Dear authors,
Thanks for your great work! I am going through your implementation of GradNorm and reproducing your code for another task (optical flow estimation).
Is there any workaround to avoid using retain_graph=True when calling backward?
Why do we also need to retain the graph when calculating the partial derivatives?
# compute and retain gradients
total_weighted_loss.backward(retain_graph=True)
# GRADNORM - learn the weights for each task's gradients
# zero the w_i(t) gradients since we want to update the weights using gradnorm loss
self.weights.grad = 0.0 * self.weights.grad
W = list(self.model.mtn.shared_block.parameters())
norms = []
for w_i, L_i in zip(self.weights, task_losses):
    # gradient of L_i(t) w.r.t. W
    gLgW = torch.autograd.grad(L_i, W, retain_graph=True)
    # G^{(i)}_W(t)
    norms.append(torch.norm(w_i * gLgW[0]))
norms = torch.stack(norms)
This leads to an out-of-memory issue that I have not been able to avoid. Did you face a similar problem?
Thanks,
Stefano
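On the question of why the graph has to be retained: every call to backward or torch.autograd.grad walks the same computation graph and, by default, frees its saved buffers afterwards, so any call that is followed by another pass through that graph needs retain_graph=True. Only the last pass can let the graph go. A minimal sketch of one possible reordering is below; it is not the repository's code. The toy modules `shared` and `heads`, the data, and the choice of `shared.weight` as W are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: one shared layer followed by two task heads.
shared = nn.Linear(4, 8)
heads = nn.ModuleList([nn.Linear(8, 1), nn.Linear(8, 1)])
weights = nn.Parameter(torch.ones(2))  # the loss weights w_i(t)

x = torch.randn(16, 4)
targets = [torch.randn(16, 1), torch.randn(16, 1)]

features = torch.relu(shared(x))
task_losses = [
    nn.functional.mse_loss(head(features), t) for head, t in zip(heads, targets)
]

W = shared.weight  # gradient norms are measured on this tensor only

# Per-task gradient norms come first. The graph is traversed again below,
# so these calls must keep it alive with retain_graph=True.
norms = []
for w_i, L_i in zip(weights, task_losses):
    (gLgW,) = torch.autograd.grad(L_i, W, retain_graph=True)
    # gLgW carries no graph of its own (create_graph defaults to False),
    # so this norm is differentiable only with respect to w_i.
    norms.append(torch.norm(w_i * gLgW))
norms = torch.stack(norms)

# Last pass through the model graph: retain_graph is left at its default,
# so the saved buffers are released as soon as this call returns.
total_weighted_loss = sum(w * L for w, L in zip(weights, task_losses))
total_weighted_loss.backward()
```

In this ordering `norms` still depends on `weights` through its own small graph, so a GradNorm loss built from it can be backpropagated afterwards. As in the snippet quoted above, `weights.grad` would need to be zeroed first, because the weighted-loss backward also writes into it.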
Hi author,
I adapted your code, including the GradNorm part, to train my own transformer-based model.
Everything works at first, but as the number of iterations grows, the error "CUDA out of memory." occurs.
Have you encountered the same error during training?
I suspect it is caused by retain_graph in
loss.backward(retain_graph=True)
and
gygw = torch.autograd.grad(task_losses[k], W.parameters(), retain_graph=True)
Am I right?
And is there any way to avoid this error as the number of iterations grows?
Thank you for your nice code :)
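A retained graph is released once no tensor refers to it any longer, so retain_graph=True on its own does not necessarily make memory grow from one iteration to the next. One possible cause of steady growth, which is an assumption here and not something confirmed in this thread, is keeping loss tensors from earlier iterations alive, for example in a running sum or as stored initial losses. The sketch below uses a hypothetical toy model and variable names to contrast attached and detached bookkeeping.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model standing in for the real network.
model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

running_attached = 0.0  # anti-pattern: becomes a tensor that has a grad_fn
running_detached = 0.0  # stays a plain Python float
initial_loss = None     # e.g. a stored L_i(0), kept without its graph

for step in range(3):
    x, y = torch.randn(8, 4), torch.randn(8, 1)
    loss = nn.functional.mse_loss(model(x), y)

    opt.zero_grad()
    loss.backward(retain_graph=True)
    opt.step()

    # Keeps a reference to this iteration's retained graph, so the graphs
    # of all iterations stay in memory for as long as the sum exists.
    running_attached = running_attached + loss

    # Stores only the number; the graph can be freed when `loss` is rebound.
    running_detached += loss.item()

    if initial_loss is None:
        initial_loss = loss.detach()  # value only, no graph attached
```

If memory still grows after every logged or stored value has been detached, the cause lies elsewhere and this sketch does not apply.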