Comments (3)
I encountered the same error. Have you fixed it yet?
from annotated-transformer.
Apparently the problem lies in the dimension check of the mask in class LabelSmoothing(nn.Module):
mask.dim() returns 1 for empty tensors. As a quick fix, change if mask.dim() > 0: to if mask.dim() > 1:
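What mask.dim() returns for an empty result depends on the PyTorch version, so a check on the number of elements may be more robust than a check on the number of dimensions. The sketch below is a swapped-in alternative, not the notebook's code: the helper name zero_padding_rows and the example tensors are made up for illustration, and it assumes a recent PyTorch where torch.nonzero returns a tensor of shape (n, 1) for 1-D input.

```python
import torch


def zero_padding_rows(true_dist, target, padding_idx):
    """Zero every row of true_dist whose target equals padding_idx (hypothetical helper)."""
    # Indices of padded positions; shape (n, 1) for a 1-D target.
    mask = torch.nonzero(target == padding_idx)
    # numel() is 0 when nothing matched, whatever the number of dimensions is.
    if mask.numel() > 0:
        true_dist.index_fill_(0, mask.reshape(-1), 0.0)
    return true_dist


# Example data (assumed): 3 target positions, 5 classes, padding index 0.
uniform = torch.full((3, 5), 0.2)
with_padding = zero_padding_rows(uniform.clone(), torch.tensor([2, 0, 1]), padding_idx=0)
without_padding = zero_padding_rows(uniform.clone(), torch.tensor([2, 3, 1]), padding_idx=0)
```

With this guard the row for the padded position is zeroed, and a batch without any padding passes through unchanged instead of reaching index_fill_ with an empty index.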
Edit: while I am at it, to fix the deprecation warning, change
self.criterion = nn.KLDivLoss(size_average=False)
to
self.criterion = nn.KLDivLoss(reduction='none')
The plot then throws an exception. To fix that, change
plt.plot(np.arange(1, 100), [loss(x) for x in range(1, 100)])
to
plt.plot(np.arange(1, 100), [torch.mean(loss(x)) for x in range(1, 100)])
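For reference, the deprecated size_average=False summed the loss over all elements, which corresponds to reduction='sum' in the current argument, while reduction='none' returns one value per element. The sketch below compares the two on made-up example tensors (the values are assumptions chosen only for illustration):

```python
import torch
import torch.nn as nn

# Assumed example: one prediction over 5 classes, as log-probabilities,
# and a smoothed target distribution without zero entries.
log_probs = torch.tensor([[0.1, 0.6, 0.1, 0.1, 0.1]]).log()
target = torch.tensor([[0.025, 0.9, 0.025, 0.025, 0.025]])

# reduction='none' keeps the element-wise losses, same shape as the input.
per_element = nn.KLDivLoss(reduction="none")(log_probs, target)

# reduction='sum' adds them up into a single scalar (0-dim) tensor.
summed = nn.KLDivLoss(reduction="sum")(log_probs, target)

# Averaging the element-wise losses divides the sum by the element count.
averaged = torch.mean(per_element)
```

Since torch.mean over the element-wise output equals the summed loss divided by the number of elements, a curve plotted that way keeps its shape but is scaled by a constant factor relative to the summed loss.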
Fixed some more deprecation errors etc.; it now works up to the copy task:
The Annotated Transformer.zip
Also fixed the GPU issues:
The Annotated Transformer.zip
Attention visualization is also fixed:
The Annotated TransformerV02.zip