The end2_assignment from priyasharma2427

Select a random pair of senetences from the data we prepared

sample = random.choice(pairs) sample

=>['vous etes plus intelligent que moi .', 'you re smarter than me .']

In order to work with embedding layer and the LSTM the inputs should be in the form of tensor, So we need to convert the sentences(words) to tensors. First we'll split the sentences by whitespaces and convert each words into indices(using word2index[word])

input_sentence = sample[0] target_sentence = sample[1] input_indices = [input_lang.word2index[word] for word in input_sentence.split(' ')] target_indices = [output_lang.word2index[word] for word in target_sentence.split(' ')] input_indices, target_indices

=>([118, 214, 152, 135, 902, 42, 5], [129, 78, 1319, 1166, 343, 4])

Then convert the input_indices into tensors

input_tensor = torch.tensor(input_indices, dtype=torch.long, device= device) output_tensor = torch.tensor(target_indices, dtype=torch.long, device= device) Next, We will define a Embedding layer as well as LSTM layers for encoder

embedding = nn.Embedding(input_size, hidden_size).to(device) lstm = nn.LSTM(hidden_size, hidden_size).to(device)

We are working with 1 sample, but we would be working for a batch. Let's fix that by converting our input_tensor into a fake batch

print(embedded_input.shape) embedded_input = embedding(input_tensor[0].view(-1, 1)) print(embedded_input.shape) Let's build our LSTM, initialize the hidden state and cell state with Zeros(Empty state)

(hidden,ct) = torch.zeros(1, 1, 256, device=device),torch.zeros(1, 1, 256, device=device) embedded_input = embedding(input_tensor[0].view(-1, 1)) output, (hidden,ct) = lstm(embedded_input, (hidden,ct))

Now we will define a empty tensor with size MAX_LENGTH to store the Encoder outputs. Then we can get the encoder outputs for each of the word in the Sentence

encoder_outputs = torch.zeros(MAX_LENGTH, 256, device=device) (encoder_hidden,encoder_ct) = torch.zeros(1, 1, 256, device=device),torch.zeros(1, 1, 256, device=device)

for i in range(input_tensor.size()[0]):
embedded_input = embedding(input_tensor[i].view(-1, 1)) output, (encoder_hidden,encoder_ct) = lstm(embedded_input, (encoder_hidden,encoder_ct)) encoder_outputs[i] += output[0,0] Encoder Steps Input Sentence: vous etes plus intelligent que moi . Target Sentence: you re smarter than me . Input indices: [118, 214, 152, 135, 902, 42, 5] Target indices: [118, 214, 152, 135, 902, 42, 5] After adding the token Input indices: [118, 214, 152, 135, 902, 42, 5, 1] Target indices: [118, 214, 152, 135, 902, 42, 5, 1] Input tensor: tensor([118, 214, 152, 135, 902, 42, 5, 1], device='cuda:0') Target tensor: tensor([ 129, 78, 1319, 1166, 343, 4, 1], device='cuda:0')

Step 0 Word => vous Input Tensor => tensor(118, device='cuda:0') 08 09

Step 1 Word => etes Input Tensor => tensor(214, device='cuda:0') 10 11

Step 2 Word => plus Input Tensor => tensor(152, device='cuda:0') 12 13

Step 3 Word => intelligent Input Tensor => tensor(135, device='cuda:0') 14 15

Step 4 Word => que Input Tensor => tensor(902, device='cuda:0') 16 17

Step 5 Word => moi Input Tensor => tensor(42, device='cuda:0') 18 19

Step 6 Word => . Input Tensor => tensor(5, device='cuda:0') 20 21

Step 7 Word => Input Tensor => tensor(1, device='cuda:0') 22 23

We completed the Encoder part now, Now we can start building the Attention Decoder First input to the decoder will be SOS_token, later inputs would be the words it predicted (unless we implement teacher forcing).

Decoder/LSTM's hidden state will be initialized with the encoder's last hidden state.

We will use LSTM's hidden state and last prediction to generate attention weight using a FC layer.

This attention weight will be used to weigh the encoder_outputs using batch matric multiplication. This will give us a NEW view on how to look at encoder_states.

this attention applied encoder_states will then be concatenated with the input, and then sent a linear layer and then sent to the LSTM.

LSTM's output will be sent to a FC layer to predict one of the output_language words

first input decoder_input = torch.tensor([[SOS_token]], device=device) (decoder_hidden,decoder_ct) = (encoder_hidden,encoder_ct) decoded_words = [] The inputs to LSTM are last prediction or a word from the target sentence based on teacher forcing ratio and decoder hidden states.

We need to concatenate the embeddings and the last decoder hidden state

torch.cat((embedded[0], decoder_hidden[0]), 1).shape

=> torch.Size([1, 512]) Now we will calaculate the attentions. We will calculating the attentions by conacatinating the embeddings and last decoder hidden state and giving as input to the fully connected layer.

attn_weight_layer = nn.Linear(256 * 2, 10).to(device) \n attn_weights = attn_weight_layer(torch.cat((embedded[0], decoder_hidden[0]), 1)) \n attn_weights

=>tensor([[-0.8181, 0.0128, 0.0196, -0.3952, -0.1043, -0.1855, -0.5074, -0.4552, -0.5731, 0.5895]], device='cuda:0', grad_fn=)

Now We got the attention weights, now we'll apply the attention on the encoder outputs and combine the applied attentions and embeddings, pass through the relu, this will be input for the LSTM layer. On the output of the LSTM we'll the softmax and predict the expected word.

Decoder steps with Full teacher forcing Step 0 Expected output(word) => you Expected output(Index) => 129 Predicted output(word) => opposed Predicted output(Index) => 2669 24

Step 1 Expected output(word) => re Expected output(Index) => 78 Predicted output(word) => opposed Predicted output(Index) => 2669 25

Step 2 Expected output(word) => smarter Expected output(Index) => 1319 Predicted output(word) => opposed Predicted output(Index) => 2669 26

Step 3 Expected output(word) => than Expected output(Index) => 1166 Predicted output(word) => options Predicted output(Index) => 1343 27

Step 4 Expected output(word) => me Expected output(Index) => 343 Predicted output(word) => articulate Predicted output(Index) => 964 28

Step 5 Expected output(word) => . Expected output(Index) => 4 Predicted output(word) => forgetting Predicted output(Index) => 2345 29

priyasharma2427 / end2_assignment Goto Github PK

end2_assignment's Introduction

end2_assignment's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent