Transformer
Overview
From scratch implementation of Transformer
Key Features
- Attention Mechanism
- Transformer Architecture
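The architecture listed above uses sinusoidal positional encodings so the model can distinguish token positions. As a minimal, framework-free sketch (the function name and dimensions here are illustrative, not taken from the repository):

```python
import math

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding from "Attention Is All You Need":
    #   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    #   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

# First row (pos = 0) is alternating sin(0) = 0.0 and cos(0) = 1.0.
pe = positional_encoding(seq_len=4, d_model=8)
```

In practice this table is precomputed once and added to the token embeddings before the first encoder layer.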
Technical Details
- Framework: PyTorch
- Dataset: Custom
- Category: Language Models
Implementation Details
I implemented a vanilla Transformer from scratch in PyTorch and trained it on a German-English translation dataset.
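The core of any vanilla Transformer is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal pure-Python sketch of that computation (illustrative only; the repository's actual implementation uses PyTorch tensors):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    # Q, K, V are lists of row vectors (lists of floats).
    d_k = len(K[0])
    out = []
    for q in Q:
        # Dot each query against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Output row is the weight-averaged combination of value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# A query aligned with the first key attends mostly to the first value row.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
attn = scaled_dot_product_attention(Q, K, V)
```

Because the values here are one-hot, each output row is exactly the attention-weight vector, so its entries sum to 1.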
Datasets
Multi30k de-en: Link
Frameworks:
PyTorch
Results (on a single T4 GPU)
Training epochs: 3; validation epochs: 5
Mean train loss: 0.02; mean validation loss: 0.03
Source Code
GitHub Repository: Transformer
View the complete implementation, training scripts, and documentation on GitHub.