Transformer

Category: Language Models
Framework: PyTorch
Dataset: Custom
Created: March 10, 2025

Overview

A from-scratch implementation of the Transformer architecture.

Key Features

  • Attention Mechanism (see the sketch after this list)
  • Transformer Architecture
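
At the heart of the attention mechanism is scaled dot-product attention: each query is compared against all keys, and the resulting weights form a weighted sum over the values. A minimal sketch of that computation in PyTorch follows; the function name and tensor shapes are illustrative, not taken from the repository.

    import math
    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(q, k, v, mask=None):
        # q, k, v: (batch, heads, seq_len, d_k); mask: 1 where attention is allowed
        d_k = q.size(-1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # similarity of queries and keys
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)                    # attention distribution over keys
        return weights @ v                                     # weighted sum of values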

Technical Details

  • Framework: PyTorch
  • Dataset: Custom
  • Category: Language Models

Implementation Details

I implemented the vanilla Transformer from scratch in PyTorch on the Multi30k German-English translation dataset.
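
Since the vanilla Transformer has no recurrence, token positions are injected through sinusoidal positional encodings added to the embeddings. Below is a minimal sketch of such a module, following the standard formulation from the paper; the class name and the max_len default are illustrative, not the repository's settings.

    import math
    import torch
    import torch.nn as nn

    class PositionalEncoding(nn.Module):
        def __init__(self, d_model, max_len=5000):
            super().__init__()
            pe = torch.zeros(max_len, d_model)
            position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
            div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
            pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions
            pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions
            self.register_buffer("pe", pe.unsqueeze(0))    # (1, max_len, d_model)

        def forward(self, x):
            # x: (batch, seq_len, d_model) token embeddings
            return x + self.pe[:, : x.size(1)]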

Reference: Attention Is All You Need (Vaswani et al., 2017).
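
For orientation, an encoder-decoder translation model in this spirit can be assembled roughly as below. The repository builds the blocks from scratch; this sketch leans on torch.nn.Transformer only to keep the example short, reuses the PositionalEncoding module sketched above, and its names and hyperparameter defaults (TranslationTransformer, d_model=512, 6 layers, 8 heads) are placeholders rather than the repository's actual configuration.

    import torch
    import torch.nn as nn

    class TranslationTransformer(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, d_model=512, nhead=8,
                     num_layers=6, dim_ff=2048, dropout=0.1):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, d_model)
            self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
            self.pos_enc = PositionalEncoding(d_model)      # sinusoidal module from the sketch above
            self.transformer = nn.Transformer(
                d_model=d_model, nhead=nhead,
                num_encoder_layers=num_layers, num_decoder_layers=num_layers,
                dim_feedforward=dim_ff, dropout=dropout, batch_first=True)
            self.generator = nn.Linear(d_model, tgt_vocab)  # projects decoder states to target-vocab logits

        def forward(self, src, tgt):
            # src: (batch, src_len), tgt: (batch, tgt_len) token ids
            tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1)).to(tgt.device)
            src = self.pos_enc(self.src_emb(src))
            tgt = self.pos_enc(self.tgt_emb(tgt))
            out = self.transformer(src, tgt, tgt_mask=tgt_mask)  # causal mask hides future target tokens
            return self.generator(out)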

Datasets

Multi30k de-en
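
Multi30k is distributed as parallel plain-text files, one sentence per line in each language. A rough sketch of numericalizing a split is shown below; it assumes local files train.de and train.en and plain whitespace tokenization, whereas the repository may well use torchtext utilities or a proper tokenizer instead.

    from collections import Counter

    PAD, BOS, EOS, UNK = 0, 1, 2, 3
    SPECIALS = ["<pad>", "<bos>", "<eos>", "<unk>"]

    def build_vocab(sentences, min_freq=2):
        # map every sufficiently frequent token to an integer id
        counts = Counter(tok for s in sentences for tok in s.split())
        itos = SPECIALS + [t for t, c in counts.items() if c >= min_freq]
        return {t: i for i, t in enumerate(itos)}

    def encode(sentence, vocab):
        # wrap the sentence in <bos>/<eos> and map tokens to ids (<unk> for unknowns)
        return [BOS] + [vocab.get(tok, UNK) for tok in sentence.split()] + [EOS]

    # assumed local copies of the Multi30k training split
    de_sents = open("train.de", encoding="utf-8").read().splitlines()
    en_sents = open("train.en", encoding="utf-8").read().splitlines()
    de_vocab, en_vocab = build_vocab(de_sents), build_vocab(en_sents)
    pairs = [(encode(d, de_vocab), encode(e, en_vocab)) for d, e in zip(de_sents, en_sents)]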

Frameworks:

PyTorch

Results (on a single T4 GPU)

Training epochs: 3; validation epochs: 5

Mean train loss: 0.02; mean validation loss: 0.03
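
For context, mean epoch losses of this kind usually come out of a loop along the following lines. This is a sketch only: the optimizer, learning rate, batching and padding choices are assumptions, and it reuses names (TranslationTransformer, PositionalEncoding, PAD, the vocabularies) from the sketches above rather than the repository's code.

    import torch
    import torch.nn as nn

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = TranslationTransformer(len(de_vocab), len(en_vocab)).to(device)
    criterion = nn.CrossEntropyLoss(ignore_index=PAD)          # do not penalize padding positions
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)  # assumed optimizer settings

    def run_epoch(loader, train=True):
        # loader is assumed to yield padded (src, tgt) batches of token ids, shape (batch, seq_len)
        model.train(train)
        total = 0.0
        with torch.set_grad_enabled(train):
            for src, tgt in loader:
                src, tgt = src.to(device), tgt.to(device)
                logits = model(src, tgt[:, :-1])               # teacher forcing: decoder sees shifted target
                loss = criterion(logits.reshape(-1, logits.size(-1)), tgt[:, 1:].reshape(-1))
                if train:
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                total += loss.item()
        return total / len(loader)                             # mean loss over the epoch

    # train_loss = run_epoch(train_loader)             # mean train loss, e.g. 0.02 here
    # val_loss   = run_epoch(val_loader, train=False)  # mean validation loss, e.g. 0.03 here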

Source Code

๐Ÿ“ GitHub Repository: Transformer

View the complete implementation, training scripts, and documentation on GitHub.