This repository contains the implementation of a Transformer model for translating English (eng) to German (deu) using attention, based on the paper "Attention Is All You Need" (Vaswani et al., 2017).
The model architecture includes components such as cross-attention, decoder, encoder, feedforward, self-attention, and transformer.
The trained model is saved under `models/`, and visualizations of accuracy, loss, and learning rate are in `assets/images`.
Architecture diagrams for cross-attention, decoder, encoder, feedforward, self-attention, and the full transformer are in `assets/model architecture`.
Clone the repository:

```shell
git clone https://github.com/yourusername/transformer-deu-to-eng.git
cd transformer-deu-to-eng
```
The transformer model consists of the following components:
- Encoder: Processes the input sequence.
- Decoder: Generates the output sequence.
- Self-Attention: Allows the model to focus on different parts of the input sequence.
- Cross-Attention: Allows the decoder to focus on relevant parts of the input sequence.
- Feedforward: Adds non-linearity and complexity to the model.
- Transformer: The complete model that ties the encoder, decoder, attention, and feedforward components together.
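The self- and cross-attention components listed above both reduce to the same core operation, scaled dot-product attention. A minimal NumPy sketch (function and variable names here are illustrative, not taken from this repo's code):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)   # (..., seq_q, seq_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)        # suppress masked positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ v, weights

# Self-attention: queries, keys, and values all come from the same sequence.
# In cross-attention, q comes from the decoder and k, v from the encoder output.
x = np.random.rand(4, 8)   # 4 tokens, model dimension 8
out, w = scaled_dot_product_attention(x, x, x)
print(out.shape, w.shape)  # (4, 8) (4, 4)
```

Each row of `w` is a probability distribution over input positions, which is what lets the model "focus" on different parts of the sequence.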
- text_pairs.pickle: Paired German (deu) and English (eng) sentences used for training.
- vectorize.pickle: The fitted vectorization used to turn each sentence into token IDs.
- posenc-2048-512.pickle: Precomputed positional encodings (sequence length 2048, model dimension 512).
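The `posenc-2048-512.pickle` filename suggests a precomputed positional-encoding matrix of shape (2048, 512). Assuming the sinusoidal scheme from the paper, such a matrix can be generated like this (a sketch, not this repo's exact code):

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding from "Attention Is All You Need"."""
    pos = np.arange(max_len)[:, None]    # (max_len, 1) token positions
    i = np.arange(d_model)[None, :]      # (1, d_model) embedding dimensions
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # even dimensions use sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd dimensions use cosine
    return pe

pe = positional_encoding(2048, 512)  # same shape the pickle name implies
print(pe.shape)  # (2048, 512)
```

The encoding is added to the token embeddings so the attention layers, which are otherwise order-invariant, can use word position.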
- The implementation is based on the paper "Attention Is All You Need" by Vaswani et al.
- Thanks to the contributors of open-source libraries such as TensorFlow.