Comments (5)
@xvdp Are you able to run the rest of it? I am getting two errors which I am not able to resolve:
- RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'
.
.
.
<ipython-input-139-2ae4ba63671c> in encode(self, src, src_mask)
17
18 def encode(self, src, src_mask):
---> 19 return self.encoder(self.src_embed(src), src_mask)
20
21 def decode(self, memory, src_mask, tgt, tgt_mask):
.
.
.
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'
- NotImplementedError
.
.
.
74 def encode(self, src, src_mask):
---> 75 return self.encoder(self.src_embed(src), src_mask)
Both point to self.encoder() as the fault. I can't figure out what is happening. It would be great if you could provide some insight on this. Link to the Jupyter notebook:
https://github.com/AmoghM/DeepLearning/blob/master/TransformerNetwork/HarvardTransformer.ipynb
from annotated-transformer.
@xvdp
Solved the error
"exp" not implemented for 'torch.LongTensor'
on PyTorch 1.0 by going back to the original version:
div_term = 1 / (10000 ** (torch.arange(0., d_model, 2) / d_model))
I am not sure why they decided to use .exp, but my best guess is numerical stability.
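For what it's worth, the two forms should be algebraically the same, since exp(-2i * ln(10000) / d_model) = 1 / 10000^(2i / d_model). Below is a minimal, torch-free sketch that checks this numerically; d_model = 16 is an arbitrary value chosen for illustration. If that holds, the error is more likely a dtype matter than a difference between the formulas: torch.arange with integer arguments yields a LongTensor, while the float start 0. in the line above yields a float tensor.

```python
import math

d_model = 16  # illustrative size (an assumption); any even value works

# Power form from the comment above: 1 / 10000^(2i / d_model)
power_form = [1.0 / (10000.0 ** (i / d_model)) for i in range(0, d_model, 2)]

# exp/log form: exp(2i * -(ln(10000) / d_model))
exp_form = [math.exp(i * -(math.log(10000.0) / d_model)) for i in range(0, d_model, 2)]

# Largest absolute difference between the two forms (floating-point noise only).
max_diff = max(abs(a - b) for a, b in zip(power_form, exp_form))
print(max_diff)
```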
Glad you solved it. Sorry, I hadn't seen your message.
There are a couple of other things that need to be fixed so this runs on PyTorch 1.0+, such as accessing scalars with .item() instead of .data[0].
Curiously, I did not run into your problem.
I'll note it here.
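A minimal sketch of the scalar-access change, assuming a 0-dim loss tensor; the loss value here is a made-up stand-in, not taken from the notebook.

```python
import torch

# Stand-in for a real loss: mean() returns a 0-dim tensor.
loss = torch.tensor([1.5, 2.5]).mean()

# Older code read the scalar with loss.data[0]; on PyTorch 1.0+ indexing a
# 0-dim tensor raises an IndexError, so .item() is used instead.
value = loss.item()  # plain Python float
print(value)
```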
@AmoghM
out = greedy_decode(model, src.cuda(), src_mask.cuda(), max_len=60, start_symbol=TGT.vocab.stoi["<s>"])
Try adding .cuda() to src and src_mask; this moves both tensors to the GPU.
I found the answer at the link below:
https://github.com/huggingface/pytorch-pretrained-BERT/issues/227
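The same idea can be written device-agnostically with .to(device). This is only a sketch under assumptions: the small nn.Embedding below is a hypothetical stand-in for the model's src_embed, and the tensor shapes are made up. The "argument #3 'index'" in the error message suggests the embedding lookup received CPU indices while its weights were on the GPU.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for src_embed; its weights live on `device`.
embed = nn.Embedding(num_embeddings=11, embedding_dim=8).to(device)

src = torch.tensor([[1, 2, 3, 4]])                # token indices, created on the CPU
src_mask = torch.ones(1, 1, 4, dtype=torch.bool)  # mask, also created on the CPU

# Move the inputs to the same device as the model before the forward pass.
src, src_mask = src.to(device), src_mask.to(device)

out = embed(src)  # indices and weights now share a device
print(out.shape, out.device)
```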
@V-Enzo Thanks for pointing this out. I will try it and report back.