This repository includes a simple implementation of the original transformer architecture. Additionally, it features a decoder-only transformer that was trained on Shakespearean text. Using a straightforward character-level tokenization, the model can generate Shakespearean-themed text.
- "Attention is All You Need" by Vaswani et al.