I want to play around with the Transformer, but I'm confused by the shapes. Here is the model summary:
```
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 57)           0
__________________________________________________________________________________________________
embedding_2 (Embedding)         (None, 57, 300)      865200      input_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 57, 300)      90000       embedding_2[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 57, 300)      90000       embedding_2[0][0]
__________________________________________________________________________________________________
lambda_3 (Lambda)               (None, 57)           0           input_1[0][0]
__________________________________________________________________________________________________
lambda_4 (Lambda)               (None, None, None)   0           dense_1[0][0]
__________________________________________________________________________________________________
lambda_5 (Lambda)               (None, None, None)   0           dense_2[0][0]
__________________________________________________________________________________________________
lambda_7 (Lambda)               (None, 57)           0           lambda_3[0][0]
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 57)           0
__________________________________________________________________________________________________
lambda_8 (Lambda)               (None, None, None)   0           lambda_4[0][0]
                                                                 lambda_5[0][0]
__________________________________________________________________________________________________
lambda_9 (Lambda)               (None, 57)           0           lambda_7[0][0]
__________________________________________________________________________________________________
lambda_1 (Lambda)               (None, 56)           0           input_2[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, None, None)   0           lambda_8[0][0]
                                                                 lambda_9[0][0]
__________________________________________________________________________________________________
embedding_3 (Embedding)         (None, 56, 300)      865200      lambda_1[0][0]
__________________________________________________________________________________________________
lambda_12 (Lambda)              (None, 56, 56)       0           lambda_1[0][0]
__________________________________________________________________________________________________
lambda_13 (Lambda)              (None, None, None)   0           lambda_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, None, None)   0           add_1[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 57, 300)      90000       embedding_2[0][0]
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 56, 300)      90000       embedding_3[0][0]
__________________________________________________________________________________________________
dense_6 (Dense)                 (None, 56, 300)      90000       embedding_3[0][0]
__________________________________________________________________________________________________
lambda_14 (Lambda)              (None, 56, 56)       0           lambda_12[0][0]
                                                                 lambda_13[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, None, None)   0           activation_1[0][0]
__________________________________________________________________________________________________
lambda_6 (Lambda)               (None, None, None)   0           dense_3[0][0]
__________________________________________________________________________________________________
lambda_16 (Lambda)              (None, None, None)   0           dense_5[0][0]
__________________________________________________________________________________________________
lambda_17 (Lambda)              (None, None, None)   0           dense_6[0][0]
__________________________________________________________________________________________________
lambda_19 (Lambda)              (None, 56, 56)       0           lambda_14[0][0]
__________________________________________________________________________________________________
lambda_10 (Lambda)              (None, None, None)   0           dropout_1[0][0]
                                                                 lambda_6[0][0]
__________________________________________________________________________________________________
lambda_20 (Lambda)              (None, None, None)   0           lambda_16[0][0]
                                                                 lambda_17[0][0]
__________________________________________________________________________________________________
lambda_21 (Lambda)              (None, 56, 56)       0           lambda_19[0][0]
__________________________________________________________________________________________________
lambda_11 (Lambda)              (None, None, 300)    0           lambda_10[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, None, None)   0           lambda_20[0][0]
                                                                 lambda_21[0][0]
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, None, 300)    90300       lambda_11[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, None, None)   0           add_4[0][0]
__________________________________________________________________________________________________
dense_7 (Dense)                 (None, 56, 300)      90000       embedding_3[0][0]
__________________________________________________________________________________________________
dropout_6 (Dropout)             (None, None, 300)    0           time_distributed_1[0][0]
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, None, None)   0           activation_2[0][0]
__________________________________________________________________________________________________
lambda_18 (Lambda)              (None, None, None)   0           dense_7[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, None, 300)    0           embedding_2[0][0]
                                                                 dropout_6[0][0]
__________________________________________________________________________________________________
lambda_22 (Lambda)              (None, None, None)   0           dropout_3[0][0]
                                                                 lambda_18[0][0]
__________________________________________________________________________________________________
layer_normalization_2 (LayerNor (None, None, 300)    600         add_2[0][0]
__________________________________________________________________________________________________
lambda_23 (Lambda)              (None, None, 300)    0           lambda_22[0][0]
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, None, 512)    154112      layer_normalization_2[0][0]
__________________________________________________________________________________________________
time_distributed_2 (TimeDistrib (None, None, 300)    90300       lambda_23[0][0]
__________________________________________________________________________________________________
conv1d_2 (Conv1D)               (None, None, 300)    153900      conv1d_1[0][0]
__________________________________________________________________________________________________
dropout_7 (Dropout)             (None, None, 300)    0           time_distributed_2[0][0]
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, None, 300)    0           conv1d_2[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, None, 300)    0           embedding_3[0][0]
                                                                 dropout_7[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, None, 300)    0           dropout_2[0][0]
                                                                 layer_normalization_2[0][0]
__________________________________________________________________________________________________
layer_normalization_4 (LayerNor (None, None, 300)    600         add_5[0][0]
__________________________________________________________________________________________________
layer_normalization_1 (LayerNor (None, None, 300)    600         add_3[0][0]
__________________________________________________________________________________________________
dense_9 (Dense)                 (None, None, 300)    90000       layer_normalization_4[0][0]
__________________________________________________________________________________________________
dense_10 (Dense)                (None, None, 300)    90000       layer_normalization_1[0][0]
__________________________________________________________________________________________________
lambda_15 (Lambda)              (None, 56, 57)       0           lambda_1[0][0]
                                                                 input_1[0][0]
__________________________________________________________________________________________________
lambda_24 (Lambda)              (None, None, None)   0           dense_9[0][0]
__________________________________________________________________________________________________
lambda_25 (Lambda)              (None, None, None)   0           dense_10[0][0]
__________________________________________________________________________________________________
lambda_27 (Lambda)              (None, 56, 57)       0           lambda_15[0][0]
__________________________________________________________________________________________________
lambda_28 (Lambda)              (None, None, None)   0           lambda_24[0][0]
                                                                 lambda_25[0][0]
__________________________________________________________________________________________________
lambda_29 (Lambda)              (None, 56, 57)       0           lambda_27[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, None, None)   0           lambda_28[0][0]
                                                                 lambda_29[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, None, None)   0           add_6[0][0]
__________________________________________________________________________________________________
dense_11 (Dense)                (None, None, 300)    90000       layer_normalization_1[0][0]
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, None, None)   0           activation_3[0][0]
__________________________________________________________________________________________________
lambda_26 (Lambda)              (None, None, None)   0           dense_11[0][0]
__________________________________________________________________________________________________
lambda_30 (Lambda)              (None, None, None)   0           dropout_4[0][0]
                                                                 lambda_26[0][0]
__________________________________________________________________________________________________
lambda_31 (Lambda)              (None, None, 300)    0           lambda_30[0][0]
__________________________________________________________________________________________________
time_distributed_3 (TimeDistrib (None, None, 300)    90300       lambda_31[0][0]
__________________________________________________________________________________________________
dropout_8 (Dropout)             (None, None, 300)    0           time_distributed_3[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, None, 300)    0           layer_normalization_4[0][0]
                                                                 dropout_8[0][0]
__________________________________________________________________________________________________
layer_normalization_5 (LayerNor (None, None, 300)    600         add_7[0][0]
__________________________________________________________________________________________________
conv1d_3 (Conv1D)               (None, None, 512)    154112      layer_normalization_5[0][0]
__________________________________________________________________________________________________
conv1d_4 (Conv1D)               (None, None, 300)    153900      conv1d_3[0][0]
__________________________________________________________________________________________________
dropout_5 (Dropout)             (None, None, 300)    0           conv1d_4[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, None, 300)    0           dropout_5[0][0]
                                                                 layer_normalization_5[0][0]
__________________________________________________________________________________________________
layer_normalization_3 (LayerNor (None, None, 300)    600         add_8[0][0]
__________________________________________________________________________________________________
time_distributed_4 (TimeDistrib (None, None, 57)     17100       layer_normalization_3[0][0]
==================================================================================================
Total params: 3,447,424
Trainable params: 3,447,424
Non-trainable params: 0
__________________________________________________________________________________________________
```
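One way to sanity-check the shapes in this summary is to reproduce a few of the parameter counts by hand. The sketch below is only that: the helper functions are made up for illustration, and which layers carry a bias and the Conv1D kernel size of 1 are assumptions inferred from the counts, not taken from the model code.

```python
# Sizes read off the summary; everything else here is an assumption.
D_MODEL = 300   # last dimension of most layers
D_INNER = 512   # width of conv1d_1 / conv1d_3
N_OUT = 57      # width of time_distributed_4


def dense_params(n_in, n_out, bias=True):
    """Weight count of a fully connected layer applied to the last axis."""
    return n_in * n_out + (n_out if bias else 0)


def conv1d_params(n_in, n_out, kernel_size=1):
    """Weight count of a Conv1D layer with bias (kernel_size=1 is assumed)."""
    return kernel_size * n_in * n_out + n_out


# Rows in each embedding table, i.e. the vocabulary size the embeddings expect.
vocab_rows = 865200 // D_MODEL

# layer name -> (count computed here, count reported by the summary)
checks = {
    "dense_1 (no bias)": (dense_params(D_MODEL, D_MODEL, bias=False), 90000),
    "time_distributed_1": (dense_params(D_MODEL, D_MODEL), 90300),
    "layer_normalization_2 (gain + offset)": (2 * D_MODEL, 600),
    "conv1d_1": (conv1d_params(D_MODEL, D_INNER), 154112),
    "conv1d_2": (conv1d_params(D_INNER, D_MODEL), 153900),
    "time_distributed_4 (no bias)": (dense_params(D_MODEL, N_OUT, bias=False), 17100),
}

for name, (computed, reported) in checks.items():
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{name:40s} {computed:>7d} vs {reported:>7d}  {status}")

print("embedding rows (vocabulary size):", vocab_rows)
```

If this reading is right, the embeddings expect 2884 distinct token ids while the final layer emits 57 values per position, which is the same number as the padded sequence length. That may be intended, but it looks worth double-checking, since an output layer that predicts tokens would normally be as wide as the target vocabulary.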
I want to feed in the training data and have the model output exactly the same sentence as the input.
How do I do it?
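For reference, here is a minimal sketch of how the arrays for such a copy task might be laid out, assuming the usual teacher-forcing setup in which the decoder sees the target without its last token (which would match `lambda_1` going from 57 to 56 steps) and is scored against the target without its first token. The special ids `PAD`, `START`, `END`, the helper `pad_sentence`, and the toy sentences are all hypothetical; how the arrays are then handed to `fit` depends on the implementation and is not shown.

```python
import numpy as np

SEQ_LEN = 57                 # padded length, matching input_1 / input_2
PAD, START, END = 0, 1, 2    # hypothetical special token ids


def pad_sentence(token_ids, seq_len=SEQ_LEN):
    """Wrap a sentence in START/END markers and right-pad it with PAD."""
    ids = [START] + list(token_ids) + [END]
    if len(ids) > seq_len:
        raise ValueError("sentence too long for seq_len")
    return ids + [PAD] * (seq_len - len(ids))


# Toy corpus of already-tokenized sentences (made-up ids).
sentences = [[5, 6, 7], [8, 9, 10, 11]]
X = np.array([pad_sentence(s) for s in sentences], dtype="int32")

# Copy task: the source and the target are the same array.
src = X                      # would go to input_1, shape (batch, 57)
tgt = X                      # would go to input_2, shape (batch, 57)

# Teacher forcing: decoder input drops the last step, labels drop the first.
dec_in = tgt[:, :-1]         # shape (batch, 56)
dec_true = tgt[:, 1:]        # shape (batch, 56)

# Padding positions are usually excluded from the loss.
loss_mask = dec_true != PAD

print(src.shape, dec_in.shape, dec_true.shape)
print(dec_true[0, :5], loss_mask[0, :5])
```

With this layout the model is trained to reproduce its input one token at a time; at prediction time the decoder would start from `START` and be fed its own previous outputs.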