jadore801120 / attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
License: MIT License
Hi, thanks for sharing.
It seems that this code does not support multiple GPUs.
Are you planning to add that?
After training for the first epoch, I get the following error when the training accuracy and loss are calculated:
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other'
The error is raised at line 102 in train.py:
return total_loss/n_total_words, n_total_correct/n_total_words
Training and validation loss is nan (using commit e21800a):
$ python3 preprocess.py -train_src data/multi30k/train.en -train_tgt data/multi30k/train.de -valid_src data/multi30k/val.en -valid_tgt data/multi30k/val.de -output data/multi30k/data.pt
$ python3 train.py -data data/multi30k/data.pt -save_model trained -save_model best
[ Epoch 0 ]
- (Training) loss: nan, accuracy: 3.7 %
- (Validation) loss: nan, accuracy: 10.0 %
- [Info] The checkpoint file has been updated.
[ Epoch 1 ]
- (Training) loss: nan, accuracy: 9.09 %
- (Validation) loss: nan, accuracy: 9.87 %
[ Epoch 2 ]
- (Training) loss: nan, accuracy: 9.09 %
- (Validation) loss: nan, accuracy: 9.83 %
[ Epoch 3 ]
- (Training) loss: nan, accuracy: 9.1 %
- (Validation) loss: nan, accuracy: 9.92 %
[ Epoch 4 ]
- (Training) loss: nan, accuracy: 9.09 %
- (Validation) loss: nan, accuracy: 9.91 %
ejklektov@gpu3:~/attention-is-all-you-need-pytorch$ CUDA_VISIBLE_DEVICES=5 python3 train.py -data data/multi30k.atok.low.pt -save_model trained -save_mode best -proj_share_weight
Namespace(batch_size=64, cuda=True, d_inner_hid=1024, d_k=64, d_model=512, d_v=64, d_word_vec=512, data='data/multi30k.atok.low.pt', dropout=0.1, embs_share_weight=False, epoch=10, log=None, max_token_seq_len=52, n_head=8, n_layers=6, n_warmup_steps=4000, no_cuda=False, proj_share_weight=True, save_mode='best', save_model='trained', src_vocab_size=2909, tgt_vocab_size=3149)
/home/ejklektov/attention-is-all-you-need-pytorch/transformer/Modules.py:13: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
init.xavier_normal(self.linear.weight)
/home/ejklektov/attention-is-all-you-need-pytorch/transformer/SubLayers.py:33: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
init.xavier_normal(self.w_qs)
/home/ejklektov/attention-is-all-you-need-pytorch/transformer/SubLayers.py:34: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
init.xavier_normal(self.w_ks)
/home/ejklektov/attention-is-all-you-need-pytorch/transformer/SubLayers.py:35: UserWarning: nn.init.xavier_normal is now deprecated in favor of nn.init.xavier_normal_.
init.xavier_normal(self.w_vs)
[ Epoch 0 ]
I changed line 71 of train.py,
loss.data[0] ===> loss.item[0]
but it doesn't work.
Shouldn't we set dropout prob to 0.0 during prediction?
I notice that in SubLayers.py line 27, the attn_dropout was not set for ScaledDotProductAttention
For translation, I use the following command
CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=1 python3 translate.py -model trained.chkpt -vocab data/nmt.atok.low.pt -src data/nmt/test.en.atok
and get an error.
Can someone help?
Hi, Yu-Hsiang.
I am not clear about line 62 in SubLayers.py:
# back to original mb_size batch
outputs = outputs.view(mb_size, len_q, -1) # mb_size x len_q x (n_head*d_v)
Is this right?
Is the code above equivalent to the code below (from Kyubyong's TensorFlow implementation)?
# Restore shape
outputs = tf.concat(tf.split(outputs, num_heads, axis=0), axis=2 ) # (N, T_q, C)
What is the difference between encoder self-attention and decoder self-attention?
Why is the "get_attn_subsequent_mask" function needed in the decoder self-attention?
Thanks for your reply in advance!
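On the mask question above: a minimal sketch of what a "subsequent" (causal) mask looks like, written in plain Python rather than with the repository's tensors. The function name `subsequent_mask` and the convention that 1 means "hidden" are assumptions made for illustration; the repository's get_attn_subsequent_mask may use a different layout.

```python
def subsequent_mask(seq_len):
    """Return a seq_len x seq_len matrix where 1 marks a key position
    that the query position is not allowed to attend to."""
    # Query position i may only look at key positions j <= i.
    return [[1 if key > query else 0 for key in range(seq_len)]
            for query in range(seq_len)]


mask = subsequent_mask(4)
for row in mask:
    print(row)
```

Encoder self-attention typically needs only a padding mask, because every source token may see the whole source sentence. The decoder additionally hides later target positions so that training matches step-by-step generation.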
In your eval_epoch() function you feedforward src and tgt through the model just like the training phase. Is this correct? Shouldn't eval be similar to testing where the model won't know the true target? Should there be an autoregressive step for evaluation where the prediction words are generated one by one and used by subsequent predictions?
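For comparison, here is a rough sketch of the autoregressive loop the question describes. It is not taken from the repository: `greedy_decode`, `next_token`, and the toy model are hypothetical names, and a real model would return a distribution over the vocabulary rather than a single token id.

```python
def greedy_decode(next_token, bos, eos, max_len):
    """Generate tokens one at a time, feeding each prediction back in."""
    output = [bos]
    for _ in range(max_len):
        # The model only sees what it has generated so far.
        token = next_token(output)
        output.append(token)
        if token == eos:
            break
    return output


# Toy stand-in for a trained model: counts upwards, then emits EOS (id 0).
def toy_next_token(prefix):
    return prefix[-1] + 1 if prefix[-1] < 3 else 0


print(greedy_decode(toy_next_token, bos=1, eos=0, max_len=10))
```

Feeding the gold target with a causal mask (teacher forcing) gives a cheap per-token loss and accuracy, while a loop like this one is what translation-quality metrics are usually computed on.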
Training the model takes too many hours. How can I speed it up?
Hi, thanks for the implementation. It is very neat and elegant. I noticed that you mentioned "label smoothing" is not done yet, but I also found you have implemented this line. I think it is correct but I am not sure what is the num_class in sequence-to-sequence models. Should it be equal to the size of the vocabulary?
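A small sketch of one common label-smoothing variant, assuming num_class is the target vocabulary size. The helper name `smooth_one_hot` is hypothetical, and spreading eps evenly over the other classes is an assumption about the variant; eps = 0.1 is the value reported in the paper.

```python
def smooth_one_hot(target_index, num_class, eps=0.1):
    """Build a smoothed target distribution over num_class classes."""
    # Assumption: the eps mass is shared equally by all non-target classes.
    off_value = eps / (num_class - 1)
    dist = [off_value] * num_class
    dist[target_index] = 1.0 - eps
    return dist


# Toy vocabulary of 5 tokens; the gold token has index 2.
dist = smooth_one_hot(target_index=2, num_class=5)
print(dist)
```

In a sequence-to-sequence model each target position is a classification over the output vocabulary, which is why the vocabulary size is the natural choice for num_class.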
I have a problem: during training, the accuracy on both the training data and the validation data is always zero. Can anyone help me? Thanks a lot.
As mentioned here:
PEP 257 describes good docstring conventions. Note that most importantly, the """ that ends a multiline docstring should be on a line by itself, e.g.:
"""Return a foobang
Optional plotz says to frobnicate the bizbaz first.
"""
For one liner docstrings, please keep the closing """ on the same line.
but most docstrings in the code look like this:
''' document strings '''
Ubuntu Server : Ein Boston Terrier läuft über saftig-grünes Gras vor einem wei?^?en Zaun.
Macbook Pro : Ein Boston Terrier läuft über saftig-grünes Gras vor einem weißen Zaun.
Can you tell me how to set up the language encoding in Ubuntu? Best Wishes
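One possible cause, stated as an assumption: the Ubuntu server may run with a non-UTF-8 locale (for example C or POSIX), in which case Python's open() and print() fall back to a locale-dependent encoding. A minimal sketch of reading and writing with an explicit encoding, independent of the locale:

```python
import os
import tempfile

sentence = "Ein Boston Terrier läuft über saftig-grünes Gras vor einem weißen Zaun."

# Hypothetical output file, created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "pred.txt")

# Pass encoding="utf-8" explicitly instead of relying on the system locale.
with open(path, "w", encoding="utf-8") as f:
    f.write(sentence + "\n")

with open(path, encoding="utf-8") as f:
    restored = f.readline().rstrip("\n")

print(restored == sentence)
```

If the text is only garbled when printed to the terminal, checking the output of `locale` and exporting a UTF-8 locale, or setting PYTHONIOENCODING=utf-8 before running the script, may also be worth trying.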
Hi, thanks for sharing.
My program crashed.
What command can I use to resume training on the GPU?
CUDA_VISIBLE_DEVICES=3 python train.py -data data/multi30k.atok.low.pt -save_model trained -save_mode best -proj_share_weight
This may have something to do with pytorch version (I use 0.4.0), but I think people should know:
Traceback (most recent call last):
  File "train.py", line 271, in <module>
    main()
  File "train.py", line 268, in main
    train(transformer, training_data, validation_data, crit, optimizer, opt)
  File "train.py", line 126, in train
    train_loss, train_accu = train_epoch(model, training_data, crit, optimizer)
  File "train.py", line 73, in train_epoch
    return total_loss/n_total_words, n_total_correct/n_total_words
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.cuda.LongTensor for argument #2 'other'
Hi,
I tried to force the GPU selection with CUDA_VISIBLE_DEVICES=1
but it pops an error:
RuntimeError: cublas runtime error : library not initialized at /py/conda-bld/pytorch_1490903321756/work/torch/lib/THC/THCGeneral.c:387
I think it's related to this: https://discuss.pytorch.org/t/cublas-runtime-error-library-not-initialized-at-data-users-soumith-builder-wheel-pytorch-src-torch-lib-thc-thcgeneral-c-383/1375/8
Does anyone have a pretrained model?
I am referring to this code line that implements temper in ScaledDotProductAttention:
What is this value based on? Can someone explain what it does here?
According to the paper, d_word_vec and d_model must be equal. However, the interface for Encoder allows you to set them to different values. If you initialize an Encoder and set them to different values, you get an error in Line 54 MultiHeadAttention during the forward pass.
Hi, I found that the decoder uses a subsequent mask to hide future information here. However, in line 123 you feed dec_input (the target embedding) into the first layer. If you check this line and then the forward function of the MultiHeadAttention module, there is a residual connection that lets dec_input reach the output directly; see here. So the subsequent mask is bypassed, which means the model can see the future. Am I correct?
In the Beam.py-L30-L31:
self.next_ys = [self.tt.LongTensor(size).fill_(Constants.PAD)]
self.next_ys[0][0] = Constants.BOS
It seems that only the top hypothesis gets "BOS" as its start token, while all other hypotheses get "PAD". Why don't all hypotheses start with "BOS"?
And in the Beam.py-L65-L68:
# End condition is when top-of-beam is EOS.
if self.next_ys[-1][0] == Constants.EOS:
self.done = True
self.all_scores.append(self.scores)
You set the end condition to be when the top of the beam is "EOS". Why the top of the beam rather than the whole beam?
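On the first question (BOS versus PAD), one plausible reading, not confirmed in this thread: if every slot in the beam started from the same BOS prefix with the same score, the first expansion would fill the beam with copies of one token. The sketch below is a plain-Python illustration of that effect; `first_step_candidates` and the toy scores are hypothetical and do not reproduce Beam.py.

```python
import heapq


def first_step_candidates(log_probs, beam_size, expand_all):
    """Pick the top beam_size (score, slot, token) candidates at step one."""
    # Assumption: all slots share the same prefix, so they share log_probs.
    slots = range(beam_size) if expand_all else range(1)
    candidates = [(log_probs[token], slot, token)
                  for slot in slots
                  for token in range(len(log_probs))]
    return heapq.nlargest(beam_size, candidates)


log_probs = [-0.1, -2.0, -3.0]  # toy next-token scores after BOS

all_slots = first_step_candidates(log_probs, beam_size=3, expand_all=True)
one_slot = first_step_candidates(log_probs, beam_size=3, expand_all=False)

print([token for _, _, token in all_slots])  # duplicates of the best token
print([token for _, _, token in one_slot])   # three distinct tokens
```

Expanding only the first slot at the first step keeps the beam diverse, and under that reading the start token stored in the other slots is never used.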
Has anyone else run the code and encountered the same problem? @ZiJianZhao
After running preprocess.py, only the atok.low.pt file is saved; no .dict files are saved.
Hi. I just followed the tutorial to train the model with the dataset given here. However, the accuracy is relatively high at epoch 0 and declines sharply after that. Has anybody run into a similar problem?
Here is the record:
[ Epoch 0 ]
Hi,
I've tried to run a training on iwslt data en-fr. The first train epoch finished with loss: nan, but this may be due to my choice of parameters. The problem is, when it started the validation I got the following error:
/py/conda-bld/pytorch_1493680494901/work/torch/lib/THC/THCTensorIndex.cu:321: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2]: block: [109,0,0], thread: [88,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
[... the same assertion is repeated for threads [89,0,0] to [95,0,0] of block [109,0,0], and for threads [52,0,0] to [63,0,0] and [96,0,0] to [127,0,0] of block [121,0,0] ...]
THCudaCheck FAIL file=/py/conda-bld/pytorch_1493680494901/work/torch/lib/THC/generic/THCTensorMath.cu line=226 error=59 : device-side assert triggered
Traceback (most recent call last):
File "attention-is-all-you-need-pytorch/train.py", line 244, in <module>
main()
File "attention-is-all-you-need-pytorch/train.py", line 241, in main
train(transformer, training_data, validation_data, crit, optimizer, opt)
File "attention-is-all-you-need-pytorch/train.py", line 120, in train
valid_loss, valid_accu = eval_epoch(model, validation_data, crit)
File "attention-is-all-you-need-pytorch/train.py", line 85, in eval_epoch
pred = model(src, tgt)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/hardmnt/hltmt0/data/digangi/attention-is-all-you-need-pytorch/transformer/Models.py", line 180, in forward
enc_outputs, enc_slf_attns = self.encoder(src_seq, src_pos)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/hardmnt/hltmt0/data/digangi/attention-is-all-you-need-pytorch/transformer/Models.py", line 76, in forward
enc_output, slf_attn_mask=enc_slf_attn_mask)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/hardmnt/hltmt0/data/digangi/attention-is-all-you-need-pytorch/transformer/Layers.py", line 18, in forward
enc_input, enc_input, enc_input, attn_mask=slf_attn_mask)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/hardmnt/hltmt0/data/digangi/attention-is-all-you-need-pytorch/transformer/SubLayers.py", line 43, in forward
outputs = torch.cat(outputs, 2)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 841, in cat
return Concat(dim)(*iterable)
File "/hltmt0/data/digangi/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 310, in forward
return torch.cat(inputs, self.dim)
RuntimeError: cuda runtime error (59) : device-side assert triggered at /py/conda-bld/pytorch_1493680494901/work/torch/lib/THC/generic/THCTensorMath.cu:226
I have no experience with pytorch, so I don't know how to fix it at the moment.
Hello,
In the main Transformer model the encoder and decoder parts are calculated.
Then they are fed to a linear layer for the target word projections. But shouldn't this layer be followed by a softmax function to calculate the output probabilities like in the Transformer schematic?
Or am I overlooking something? I can't seem to locate this last softmax function in the code.
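A possible explanation, offered as an assumption about how the loss is computed rather than a statement about this repository: cross-entropy loss functions (PyTorch's nn.CrossEntropyLoss, for example) take raw logits and apply log-softmax internally, so a model can return the output of the final linear layer directly. The plain-Python sketch below shows that both routes give the same loss; the function names are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, target):
    """Cross-entropy computed directly from logits via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 0.5, -1.0]  # toy output of the final linear layer
target = 0

loss_via_softmax = -math.log(softmax(logits)[target])
loss_via_logits = cross_entropy_from_logits(logits, target)
print(loss_via_softmax, loss_via_logits)
```

Softmax also preserves the ordering of the scores, so greedy prediction with argmax gives the same token with or without it.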
Training the model throws me the error below:
python train.py -data data/multi30k.atok.low.pt -save_model trained -save_mode best -proj_share_weight
Namespace(batch_size=64, cuda=True, d_inner_hid=1024, d_k=64, d_model=512, d_v=64, d_word_vec=512, data='data/multi30k.atok.low.pt', dropout=0.1, embs_share_weight=False, epoch=10, log=None, max_token_seq_len=52, n_head=8, n_layers=6, n_warmup_steps=4000, no_cuda=False, proj_share_weight=True, save_mode='best', save_model='trained', src_vocab_size=2909, tgt_vocab_size=3150)
('[ Epoch', 0, ']')
Hi, I cloned your code and trained it on the WMT English-German task, but it failed with "RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1502009910772/work/torch/lib/THC/generic/THCStorage.cu:66".
I ran it with the default settings on a Tesla K40, which has the same 12GB memory capacity as your Titan X.
So I don't know why this happens. Do you have any idea? Thanks
On running the following command for preprocessing
for l in en de; do for f in data/multi30k/*.$l; do if [[ "$f" != *"test"* ]]; then sed -i "$ d" $f; fi; done; done;
I'm getting the following error
sed: 1: "data/multi30k/train.en": extra characters at the end of d command
sed: 1: "data/multi30k/val.en": extra characters at the end of d command
sed: 1: "data/multi30k/train.de": extra characters at the end of d command
sed: 1: "data/multi30k/val.de": extra characters at the end of d command
Please advise on how I should proceed.
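This error message is typical of BSD sed (for example on macOS), where -i expects a backup suffix argument, so the script "$ d" is consumed as the suffix and the file name is parsed as the command. A portable alternative is to do the same trimming in Python. The sketch below assumes the goal of the command is to delete the last line of every non-test file; the names drop_last_line and trim_corpus are hypothetical helpers, not part of the repository.

```python
from pathlib import Path

def drop_last_line(path):
    """Remove the final line of a text file in place."""
    p = Path(path)
    lines = p.read_text(encoding="utf-8").splitlines(keepends=True)
    p.write_text("".join(lines[:-1]), encoding="utf-8")

def trim_corpus(data_dir, languages=("en", "de")):
    """Apply drop_last_line to every *.en / *.de file whose name lacks 'test'."""
    for lang in languages:
        for path in sorted(Path(data_dir).glob(f"*.{lang}")):
            if "test" not in path.name:
                drop_last_line(path)
```

Calling trim_corpus("data/multi30k") would then replace the shell loop above.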
When calculating the attention in MultiHead, the softmax is applied over dim 0, but I think dim=2 is correct.
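A minimal sketch of why the axis matters, using NumPy as a stand-in for PyTorch and assuming the attention scores have shape (batch, len_q, len_k):

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(2, 3, 4))    # (batch, len_q, len_k), made-up values

attn_keys = softmax(scores, axis=2)    # each query's weights over the keys sum to 1
attn_batch = softmax(scores, axis=0)   # normalizes across batch entries instead
```

With axis=2 every query position gets a proper distribution over the keys. With axis=0 the weights are normalized across unrelated sentences in the batch, and the weights a query assigns to its keys no longer sum to 1.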
In the original paper, the dot products are divided by np.power(d_k, 0.5), but in your code the value, temper, is np.power(d_model, 0.5). I guess this may be wrong.
Another question: in the original paper d_inner_hid is 2048, but you set it to 1024 by default. I don't know why.
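For a sense of scale, a small calculation using the default values d_k=64 and d_model=512 from the Namespace output printed by train.py:

```python
import math

d_k, d_model = 64, 512                 # defaults shown in the Namespace output

temper_paper = math.sqrt(d_k)          # scaling used in the paper: sqrt(d_k)
temper_code = math.sqrt(d_model)       # scaling described in this issue: sqrt(d_model)

# With d_model = n_head * d_k, the two differ by a factor of sqrt(n_head).
ratio = temper_code / temper_paper
```

With these defaults the logits are divided by roughly 22.6 instead of 8, so the attention distribution before the softmax is about 2.8 times flatter than the paper's formulation would give.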
There's a mistake when repeating data for beam search.
The source seq here
src_seq = Variable(src_seq.data.repeat(beam_size, 1))
gets a matrix in the following order for source sequence
seq1
seq2
seq3
seq1
seq2
seq3
seq1
seq2
seq3
while the beam search input here
input_data = torch.stack([b.get_current_state() for b in beam if not b.done])
takes an input like
seq1
seq1
seq1
seq2
seq2
seq2
seq3
seq3
seq3
, both of which are fed into the decoder
dec_outputs, dec_slf_attns, dec_enc_attns = self.model.decoder(
input_data, input_pos, src_seq, enc_outputs)
The order of the two inputs does not match.
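The two orderings can be reproduced with a small NumPy sketch. np.tile behaves like the repeat(beam_size, 1) call above, while np.repeat along axis 0 gives the grouped order that the beam states use. The token ids are made up for illustration.

```python
import numpy as np

beam_size = 3
# Three hypothetical source sequences; the first column identifies the sequence.
src_seq = np.array([[11, 12], [21, 22], [31, 32]])

# Tiling the whole batch: seq1, seq2, seq3, seq1, seq2, seq3, ...
tiled = np.tile(src_seq, (beam_size, 1))

# Repeating each row in place: seq1, seq1, seq1, seq2, seq2, seq2, ...
grouped = np.repeat(src_seq, beam_size, axis=0)
```

Only the grouped layout lines up row by row with a decoder input that stacks all beams of sentence 1 first, then all beams of sentence 2, and so on.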
Hi, I was wondering why the maximum batch size is ~100 on a GPU with ~11GB of RAM, whereas in tensor2tensor the maximum batch size is 1024?
Hi,
I noticed that you mask out the padded positions by assigning values to tensor.data.
attn.data.masked_fill_(attn_mask, -float('inf'))
Is this correct?
Based on the discussion here, shouldn't we assign values to tensor itself, instead of tensor.data? In this way, the history of the gradient can be tracked.
Hi, could you reproduce the results of the "Attention is All You Need" paper on the WMT'14 datasets? I would like to know the exact BLEU scores of your system on the WMT'14 EN-DE dataset.
Thanks in advance.
I get 98% accuracy after 10 epochs on the multi30k validation set using this 1-layer model:
python train.py -data data/multi30k.atok.low.pt -save_model trained -save_mode best -proj_share_weight -dropout 0.0 -n_layers 1 -n_warmup_steps 40 -epoch 50 -d_inner_hid 1 -d_model 128 -d_word_vec 128 -n_head 4
This is a very small model (note -d_inner_hid 1), which should not get good results at all (98% accuracy is far too high in any case). Generating translations with translate.py produces nonsense. This makes me suspect that there is a problem with the masking code that allows the model to 'cheat' by looking at the target sequence.
I haven't been able to figure out where the problem is, but something seems wrong.
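For reference, a minimal sketch of what a decoder self-attention mask has to look like to prevent this kind of cheating. It uses NumPy and a hypothetical helper name, subsequent_mask; it is not the repository's masking code.

```python
import numpy as np

def subsequent_mask(length):
    """Boolean mask where True marks positions a query must NOT attend to."""
    # Row i is the query position, column j the key position.
    # Everything strictly above the diagonal (j > i) lies in the future.
    return np.triu(np.ones((length, length), dtype=bool), k=1)

mask = subsequent_mask(4)
```

If any entry above the diagonal is left unmasked, position i can read the token it is supposed to predict, which would explain near-perfect training accuracy combined with useless translations at inference time.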
I am receiving the following error when I try to run the train script:
File "train.py", line 266, in <module>
main()
File "train.py", line 263, in main
train(transformer, training_data, validation_data, crit, optimizer, opt)
File "train.py", line 124, in train
train_loss, train_accu = train_epoch(model, training_data, crit, optimizer)
File "train.py", line 55, in train_epoch
pred = model(src, tgt)
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/attention-is-all-you-need-pytorch/transformer/Models.py", line 179, in forward
enc_outputs, enc_slf_attns = self.encoder(src_seq, src_pos)
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/attention-is-all-you-need-pytorch/transformer/Models.py", line 76, in forward
enc_output, slf_attn_mask=enc_slf_attn_mask)
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/attention-is-all-you-need-pytorch/transformer/Layers.py", line 18, in forward
enc_input, enc_input, enc_input, attn_mask=slf_attn_mask)
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/attention-is-all-you-need-pytorch/transformer/SubLayers.py", line 68, in forward
return self.layer_norm(outputs + residual), attns
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/ubuntu/attention-is-all-you-need-pytorch/transformer/Modules.py", line 52, in forward
ln_out = (z - mu.expand_as(z)) / (sigma.expand_as(z) + self.eps)
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/autograd/variable.py", line 681, in expand_as
return Expand.apply(self, tensor.size())
File "/home/ubuntu/miniconda3/envs/cuda/lib/python3.6/site-packages/torch/autograd/_functions/tensor.py", line 106, in forward
result = i.expand(*new_size)
RuntimeError: The expanded size of the tensor (24) must match the existing size (64) at non-singleton dimension 1. at /home/ubuntu/cuda-ubuntu-16.04-ec2/pytorch/torch/lib/THC/generic/THCTensor.c:323
The original paper and the animation on this page seem to feed only the output of the last encoder layer to the decoder, while the implementation here seems to feed the output of each encoder layer to the corresponding decoder layer, which might not work if the encoder and the decoder have different numbers of layers.
After fixing a key error about tensor integer types, running translate.py raises KeyErrors with numeric keys, and checking in Python indicates that those keys are indeed missing.
Skipping the non-existent keys inside the write loop avoids the error but gives poor results.
result of pred.txt after running code
Did anybody experience this or have a fix? Thank you.
if the source target numbers are correct
How can I change the size of the dictionaries? I am using a new corpus, but its dictionary is so big. I do not know how to change it.
how to solve it?
Hi, I am not sure if you are feeding the right input to the decoder.
(pg. 2) "Given z, the decoder then generates an output sequence (y1, ..., ym) of symbols one element at a time. At each step the model is auto-regressive, consuming the previously generated symbols as additional input when generating the next."
I believe your decoder input is a batch of target sequences.
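For context, feeding the whole target sequence at once is the usual way to train a Transformer (teacher forcing): the decoder input is the target shifted right, and a causal mask keeps each position from seeing later ones. Step-by-step generation is only needed at inference time. A minimal sketch with hypothetical token ids follows; whether this repository applies the shift in exactly this way is an assumption.

```python
# Hypothetical special-token ids, chosen only for this illustration.
BOS, EOS = 2, 3

target = [BOS, 45, 17, 98, EOS]

# The decoder sees everything except the last token ...
decoder_input = target[:-1]
# ... and is trained to predict the same sequence shifted left by one.
gold = target[1:]

# Pair each input token with the token the decoder should predict next.
pairs = list(zip(decoder_input, gold))
```

At position t the decoder has access to decoder_input[0..t] only (given a correct causal mask), which matches the auto-regressive description quoted from the paper.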
Currently, the attention mask in the ScaledDotProductAttention is generated in Line 28 in Models.py by:
pad_attn_mask = seq_k.data.eq(Constants.PAD).unsqueeze(1)
pad_attn_mask = pad_attn_mask.expand(mb_size, len_q, len_k)
Ignoring the batch dimension for the explanation, I assume the generated pad_attn_mask is a matrix of shape (len_q, len_k). This code then produces a matrix like [A 1], where 1 is an all-ones submatrix. However, I think the generated attention mask should look like [B 1 // 1 1], where 1 is an all-ones submatrix and // denotes a row break.
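The two mask shapes can be compared with a small NumPy sketch. It assumes PAD is id 0 and uses a single made-up sequence for both queries and keys; True marks a masked position.

```python
import numpy as np

PAD = 0                                   # assumed padding id
seq = np.array([5, 7, 9, PAD, PAD])       # made-up sequence with 2 padded slots
len_q = len_k = len(seq)

# Key-only masking, as in the two lines above: padded columns are masked.
key_mask = np.broadcast_to((seq == PAD)[None, :], (len_q, len_k))

# Masking padded queries as well: padded rows are masked too.
query_pad = (seq == PAD)[:, None]
full_mask = key_mask | query_pad
```

key_mask has the [A 1] layout and full_mask the [B 1 // 1 1] layout. One caveat: if masked entries are filled with -inf, a fully masked row makes the softmax return NaN, so padded query rows are often left unmasked and ignored in the loss instead.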
In position_encoding_init, shouldn't it be
[pos / np.power(10000, (i//2)*2 / d_pos_vec ) for i in range(d_pos_vec)]
instead of
[pos / np.power(10000, 2*i/d_pos_vec) for i in range(d_pos_vec)]
In the original formulation, for dimensions 2i and 2i+1, the power should be 2i / d_model.
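A minimal NumPy sketch of the proposed formulation, in which dimensions 2i and 2i+1 share the same frequency. The function name position_encoding is hypothetical, and any special handling of a padding position is left out.

```python
import numpy as np

def position_encoding(n_position, d_pos_vec):
    """Sinusoidal encoding where each sin/cos pair uses the exponent 2i / d."""
    pos = np.arange(n_position)[:, None]      # (n_position, 1)
    i = np.arange(d_pos_vec)[None, :]         # (1, d_pos_vec)
    # (i // 2) * 2 maps dimensions 2i and 2i+1 to the same exponent.
    angle = pos / np.power(10000, (i // 2) * 2 / d_pos_vec)

    enc = np.zeros((n_position, d_pos_vec))
    enc[:, 0::2] = np.sin(angle[:, 0::2])     # even dimensions
    enc[:, 1::2] = np.cos(angle[:, 1::2])     # odd dimensions
    return enc

enc = position_encoding(50, 16)
```

A quick sanity check of the pairing is that sin^2 + cos^2 equals 1 for every (2i, 2i+1) pair, which only holds when both dimensions use the same angle.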
I notice there is a score during beam search, but its meaning is ambiguous and hard to understand. Is there an intuitive description of it?
Great work and thanks a lot. I wanted to ask why you use an embedding layer for the positional encoding.
I believe the positional encoding should just be added to the input embeddings, like here:
https://github.com/Kyubyong/transformer/blob/master/train.py
Let me know, thanks a lot
Traceback (most recent call last):
File "train.py", line 266, in <module>
main()
File "train.py", line 263, in main
train(transformer, training_data, validation_data, crit, optimizer, opt)
File "train.py", line 124, in train
train_loss, train_accu = train_epoch(model, training_data, crit, optimizer)
File "train.py", line 55, in train_epoch
pred = model(src, tgt)
File "/home/sushuting/local/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/sushuting/workspace/attention-is-all-you-need-pytorch/transformer/Models.py", line 179, in forward
enc_outputs, enc_slf_attns = self.encoder(src_seq, src_pos)
File "/home/sushuting/local/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/sushuting/workspace/attention-is-all-you-need-pytorch/transformer/Models.py", line 76, in forward
enc_output, slf_attn_mask=enc_slf_attn_mask)
File "/home/sushuting/local/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/sushuting/workspace/attention-is-all-you-need-pytorch/transformer/Layers.py", line 18, in forward
enc_input, enc_input, enc_input, attn_mask=slf_attn_mask)
File "/home/sushuting/local/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/home/sushuting/workspace/attention-is-all-you-need-pytorch/transformer/SubLayers.py", line 62, in forward
outputs = torch.cat(torch.split(outputs, mb_size, dim=0), dim=-1)
TypeError: cat() takes no keyword arguments