Mixtral
Overview
From scratch implementation of Mixtral
Technical Details
- Framework: PyTorch
- Dataset: TinyShakespeare
- Category: Language Models
Implementation Details
I implemented the Mixtral architecture from scratch in PyTorch and trained it on the TinyShakespeare dataset.
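The defining component of Mixtral is its sparse mixture-of-experts feed-forward layer, where a router sends each token to the top-2 of 8 SwiGLU experts. As a rough sketch of that idea (a minimal PyTorch illustration, not the repository's actual code; class and argument names are my own), it might look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Sketch of a Mixtral-style sparse MoE block: a linear router
    picks the top-k of n_experts SwiGLU feed-forward experts per token."""

    def __init__(self, dim: int, hidden_dim: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, n_experts, bias=False)
        # Each expert is a SwiGLU MLP: w2(silu(w1(x)) * w3(x))
        self.w1 = nn.ModuleList(nn.Linear(dim, hidden_dim, bias=False) for _ in range(n_experts))
        self.w2 = nn.ModuleList(nn.Linear(hidden_dim, dim, bias=False) for _ in range(n_experts))
        self.w3 = nn.ModuleList(nn.Linear(dim, hidden_dim, bias=False) for _ in range(n_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        flat = x.view(-1, C)                            # (B*T, C)
        logits = self.router(flat)                      # (B*T, n_experts)
        weights, idx = torch.topk(logits, self.top_k)   # top-k experts per token
        weights = F.softmax(weights, dim=-1)            # renormalize over chosen experts
        out = torch.zeros_like(flat)
        for e in range(len(self.w1)):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel() == 0:
                continue
            h = flat[token_ids]
            h = F.silu(self.w1[e](h)) * self.w3[e](h)   # SwiGLU activation
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * self.w2[e](h)
        return out.view(B, T, C)
```

Because only `top_k` experts run per token, compute per token stays close to a dense MLP while total parameter count grows with `n_experts`.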
Datasets
TinyShakespeare: in the /data folder
Frameworks:
PyTorch
Results (on a single T4 GPU)
Training steps: 1000, with validation every 50 training steps
Train loss: 2.0422, Val loss: 2.0898
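The schedule above (1000 optimizer steps, validation every 50) can be sketched as a generic PyTorch training loop. This is a hypothetical outline, not the repository's training script; it assumes the model returns a `(logits, loss)` pair and that `get_batch(split)` yields input/target tensors:

```python
import torch

def train(model, optimizer, get_batch, steps=1000, eval_interval=50, eval_iters=10):
    """Train for `steps` optimizer steps, estimating validation loss
    every `eval_interval` steps (hypothetical loop matching the schedule above)."""
    model.train()
    for step in range(1, steps + 1):
        xb, yb = get_batch("train")
        _, loss = model(xb, yb)               # model returns (logits, loss)
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
        if step % eval_interval == 0:
            model.eval()
            with torch.no_grad():             # average loss over a few val batches
                val_loss = sum(model(*get_batch("val"))[1].item()
                               for _ in range(eval_iters)) / eval_iters
            model.train()
            print(f"step {step}: train loss {loss.item():.4f}, val loss {val_loss:.4f}")
```

Evaluating under `torch.no_grad()` and averaging over several validation batches keeps the reported val loss cheap to compute and less noisy than a single batch.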
Source Code
GitHub Repository: Mixtral
View the complete implementation, training scripts, and documentation on GitHub.