Mixtral
Overview
From scratch implementation of Mixtral
Technical Details
- Framework: PyTorch
- Dataset: TinyShakespeare
- Category: Language Models
Implementation Details
I implemented the Mixtral architecture from scratch in PyTorch and trained it on the TinyShakespeare dataset.
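The defining component of Mixtral is its sparse mixture-of-experts feed-forward layer, where a router sends each token to the top-2 of 8 SwiGLU experts. As a rough sketch of that idea (a minimal PyTorch illustration, not the repository's actual code; class and argument names are my own), it might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Sketch of a Mixtral-style sparse MoE block: a linear router
    picks the top-k of n_experts SwiGLU feed-forward experts per token."""

    def __init__(self, dim: int, hidden_dim: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        # Each expert is a SwiGLU MLP: w2(silu(w1(x)) * w3(x))
        self.w1 = nn.ModuleList(nn.Linear(dim, hidden_dim, bias=False) for _ in range(n_experts))
        self.w2 = nn.ModuleList(nn.Linear(hidden_dim, dim, bias=False) for _ in range(n_experts))
        self.w3 = nn.ModuleList(nn.Linear(dim, hidden_dim, bias=False) for _ in range(n_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        flat = x.view(-1, C)                            # (B*T, C)
        logits = self.router(flat)                      # (B*T, n_experts)
        weights, idx = torch.topk(logits, self.top_k)   # top-k experts per token
        weights = F.softmax(weights, dim=-1)            # renormalize over chosen experts
        out = torch.zeros_like(flat)
        for e in range(len(self.w1)):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel() == 0:
                continue
            h = flat[token_ids]
            h = F.silu(self.w1[e](h)) * self.w3[e](h)   # SwiGLU activation
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * self.w2[e](h)
        return out.view(B, T, C)
```

Because only `top_k` experts run per token, compute per token stays close to a dense MLP while total parameter count grows with `n_experts`.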
Datasets
TinyShakespeare: in the /data folder
Frameworks:
PyTorch
Results (on a single T4 GPU)
Training steps: 1000, with validation every 50 training steps
Train loss: 2.0422, Val loss: 2.0898
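The schedule above (1000 optimizer steps, validation every 50) can be sketched as a generic PyTorch training loop. This is a hypothetical outline, not the repository's training script; it assumes the model returns a `(logits, loss)` pair and that `get_batch(split)` yields input/target tensors:

```python
import torch

def train(model, optimizer, get_batch, steps=1000, eval_interval=50, eval_iters=10):
    """Train for `steps` optimizer steps, estimating validation loss
    every `eval_interval` steps (hypothetical loop matching the schedule above)."""
    model.train()
    for step in range(1, steps + 1):
        xb, yb = get_batch("train")
        _, loss = model(xb, yb)               # model returns (logits, loss)
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        if step % eval_interval == 0:
            model.eval()
            with torch.no_grad():             # average loss over a few val batches
                val_loss = sum(model(*get_batch("val"))[1].item()
                               for _ in range(eval_iters)) / eval_iters
            model.train()
            print(f"step {step}: train loss {loss.item():.4f}, val loss {val_loss:.4f}")
```

Evaluating under `torch.no_grad()` and averaging over several validation batches keeps the reported val loss cheap to compute and less noisy than a single batch.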
Source Code
GitHub Repository: Mixtral
View the complete implementation, training scripts, and documentation on GitHub.