Comments (2)
Hi, we are cleaning up the code for LM. It will be online soon.
You can also change:
    for hv, vi in zip(hvs, v):
        param_size = hv.size()
        if len(param_size) <= 1:  # for Bias and LN
            tmp_output = torch.abs(hv * vi) + 0.
            hutchinson_trace.append(tmp_output)
        elif len(param_size) == 2:  # Matrix
            tmp_output1 = torch.abs((hv * vi + 0.)).view(-1, self.block_length)  # flatten to N x self.block_length
            tmp_output2 = torch.abs(torch.sum(tmp_output1, dim=[1])).view(-1) / float(self.block_length)
            tmp_output3 = tmp_output2.repeat_interleave(self.block_length).view(param_size)
            hutchinson_trace.append(tmp_output3)
in https://github.com/amirgholami/adahessian/blob/master/transformer/fairseq/optim/adahessian.py to support 3D kernels yourself (for example, the 1D conv for character-level LM).
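For readers who want a starting point, here is a minimal sketch of what an extra branch for 3D parameters could look like. It is not the maintainers' code: the standalone function `hutchinson_block_average` is a hypothetical helper, and averaging over the last (kernel width) dimension is an assumption about how a 1D conv weight of shape (out_channels, in_channels, kernel_width) might be treated. The 1D and 2D branches mirror the snippet above.

```python
import torch


def hutchinson_block_average(hv: torch.Tensor, vi: torch.Tensor, block_length: int) -> torch.Tensor:
    """Hypothetical helper: |hv * vi| with per-shape averaging.

    hv is the Hessian-vector product and vi the random vector for one
    parameter; both must have the parameter's shape.
    """
    param_size = hv.size()
    prod = torch.abs(hv * vi)

    if len(param_size) <= 1:
        # Bias and LayerNorm: keep the element-wise estimate.
        return prod

    if len(param_size) == 2:
        # Matrix: average within consecutive blocks of block_length
        # elements (numel must be divisible by block_length).
        blocks = prod.reshape(-1, block_length)
        avg = blocks.sum(dim=1) / float(block_length)
        return avg.repeat_interleave(block_length).view(param_size)

    if len(param_size) == 3:
        # ASSUMPTION: 1D conv kernel of shape (out, in, kernel_width).
        # Average over the kernel dimension and broadcast back.
        return prod.mean(dim=2, keepdim=True).expand(param_size).contiguous()

    raise ValueError(f"Unsupported parameter shape: {tuple(param_size)}")


# Toy 3D example: each length-4 kernel slice is replaced by its mean.
hv_3d = torch.arange(24, dtype=torch.float32).view(2, 3, 4)
vi_3d = -torch.ones(2, 3, 4)
out_3d = hutchinson_block_average(hv_3d, vi_3d, block_length=2)
```

In the optimizer itself this would correspond to adding an `elif len(param_size) == 3:` branch and appending the result to `hutchinson_trace`. Whether averaging over the kernel dimension is the right choice for a given model is something to check against the instructions in the repository.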
Hi, we have added instructions for supporting different kernel sizes, and we are closing this issue since it is covered there.