Mixtral

Category: Language Models
Framework: PyTorch
Dataset: TinyShakespeare
Created: March 20, 2025

Overview

A from-scratch implementation of Mixtral.

Technical Details

  • Framework: PyTorch
  • Dataset: TinyShakespeare
  • Category: Language Models

Implementation Details

I implemented the Mixtral architecture from scratch in PyTorch and trained it on the TinyShakespeare dataset.
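
The core of Mixtral is a sparse mixture-of-experts feed-forward layer in which a router sends each token to its top 2 experts and combines their outputs with softmax weights. The sketch below illustrates that routing step only. It is not taken from the repository: the class names (SwiGLUExpert, SparseMoE) and the default sizes are hypothetical, chosen small enough to run on a CPU.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUExpert(nn.Module):
    """One expert: a SwiGLU feed-forward block (hypothetical name)."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # output projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))


class SparseMoE(nn.Module):
    """Top-k routed mixture of experts (sizes are illustrative assumptions)."""

    def __init__(self, dim: int = 64, hidden_dim: int = 128,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            [SwiGLUExpert(dim, hidden_dim) for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, dim = x.shape
        tokens = x.reshape(-1, dim)                      # (num_tokens, dim)
        logits = self.router(tokens)                     # (num_tokens, num_experts)
        top_logits, top_idx = logits.topk(self.top_k, dim=-1)
        # Softmax over the selected experts only, so weights sum to 1 per token.
        weights = F.softmax(top_logits, dim=-1)

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # Rows are token positions; cols are which top-k slot chose expert e.
            token_ids, slot = torch.where(top_idx == e)
            if token_ids.numel() == 0:
                continue  # no token was routed to this expert
            expert_out = expert(tokens[token_ids])
            scaled = weights[token_ids, slot].unsqueeze(-1) * expert_out
            out.index_add_(0, token_ids, scaled)
        return out.reshape(batch, seq_len, dim)


if __name__ == "__main__":
    moe = SparseMoE()
    x = torch.randn(2, 10, 64)   # (batch, sequence, embedding)
    print(moe(x).shape)
```

Looping over experts rather than tokens keeps each expert's work in a single batched call. Only the selected experts run for a given token, which is what makes the layer sparse.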

Mixtral of Experts

Datasets

TinyShakespeare: in the /data folder

Frameworks:

PyTorch

Results (on a single T4 GPU)

Training steps: 1000 | Validation: every 50 training steps

Train loss: 2.0422 | Val loss: 2.0898
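
For a rough sense of scale, a cross-entropy loss can be converted to perplexity by exponentiating it. The snippet below assumes the reported losses are mean per-token cross-entropy in nats (the default for PyTorch's cross-entropy loss), which this page does not state explicitly.

```python
import math

# Losses reported above; assumed to be mean per-token cross-entropy in nats.
train_loss = 2.0422
val_loss = 2.0898


def perplexity(loss: float) -> float:
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(loss)


train_ppl = perplexity(train_loss)
val_ppl = perplexity(val_loss)
print(f"train perplexity: {train_ppl:.2f}")
print(f"val perplexity:   {val_ppl:.2f}")
```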

Source Code

๐Ÿ“ GitHub Repository: Mixtral

View the complete implementation, training scripts, and documentation on GitHub.