SmolHub Website

Data-driven personal research portfolio

Date: August 23, 2025

🤖 Yuvraj Singh’s AI Portfolio

🎓 Computer Science Engineering Student at IIIT Bhubaneswar (2023-2027)
🔬 Research Focus: NLP, Computer Vision, and Multimodal LLMs
🚀 Mission: Building AI from scratch and bridging research with practical implementation

🌟 What This Portfolio Showcases

This comprehensive portfolio website demonstrates my journey in AI/ML through hands-on implementations, research, and practical applications. Every model, dataset, and experiment represents hours of learning, coding, and understanding the fundamentals of artificial intelligence.

🧠 Core Philosophy

From Scratch Implementation: Building AI models from basic building blocks rather than pre-built model libraries, to truly understand the mathematics and algorithms
Paper-to-Code: Translating cutting-edge research papers into working implementations
Educational Focus: Creating resources that help others learn AI/ML concepts
Open Source: All implementations are freely available for the community

🎯 Key Features

🔬 From Scratch AI Models (38+ Implementations)

Location: /models/
Source: Paper-Replications Repository

A comprehensive collection of AI models implemented from scratch, covering:

🤖 Large Language Models & NLP

Transformer Architecture - The foundation of modern NLP
BERT - Bidirectional Encoder Representations from Transformers
GPT - Generative Pre-trained Transformer
Llama - Meta’s efficient language model architecture
Llama4 - Advanced multi-expert architecture with MoE
Kimi-K2 - Moonshot AI's Mixture-of-Experts model with a DeepSeek V3 inspired architecture
Gemma & Gemma3 - Google’s efficient language models
DeepSeekV3 - Mixture-of-Experts model with Multi-head Latent Attention
Differential Transformer - Novel attention mechanisms
Attention Mechanisms - Core building blocks of modern AI
Encoder-Decoder - Seq2seq architectures
Seq2Seq - Sequence-to-sequence learning
Fine Tuning using PEFT - Parameter-efficient fine-tuning
LoRA - Low-Rank Adaptation techniques
DPO & ORPO - Advanced alignment techniques
SimplePO - Simplified preference optimization
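
Most of the models above share one core operation, scaled dot-product attention. The following is a minimal pure-Python sketch of it, not the code from the repository: the function names and the list-of-lists layout are illustrative assumptions, and real implementations use batched tensor operations instead.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values, causal=False):
    """Return softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    With causal=True, position i may only attend to positions <= i,
    which is the masking used by decoder-only (GPT-style) models.
    """
    d_k = len(keys[0])
    output = []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked out: weight becomes 0
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        # Each output row is the attention-weighted sum of the value rows.
        output.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return output
```

Passing `causal=True` gives the masked variant used when generating text left to right; leaving it off gives the bidirectional form used by encoder models such as BERT.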

🎨 Computer Vision & Generative Models

Vision Transformer (ViT) - Transformers for image classification
CLIP & CLiP - Vision-language understanding
CLAP - Contrastive Language-Audio Pre-training
SigLIP - Sigmoid loss for language-image pre-training
LLaVA - Large Language and Vision Assistant
PaliGemma - Vision-language model
Generative Adversarial Networks:
- DCGANs - Deep Convolutional GANs
- CGANs - Conditional GANs
- CycleGANs - Unpaired image-to-image translation
- WGANs - Wasserstein GANs
- Pix2Pix - Paired image-to-image translation
VAE - Variational Autoencoders
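
To make the "sigmoid loss" entry above concrete, here is a hedged sketch of a pairwise sigmoid loss for image-text embeddings. It treats every image-text pair as an independent binary decision instead of applying a softmax over the batch as CLIP does. The function names are illustrative, the default temperature and bias are assumed starting values, and this is not the repository's implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sigmoid_pairwise_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Average over images of the summed binary losses for all pairs.

    Pair (i, j) is a positive when i == j (matching image and caption)
    and a negative otherwise.
    """
    n = len(image_embs)
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(images):
        for j, txt in enumerate(texts):
            similarity = sum(a * b for a, b in zip(img, txt))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total -= log_sigmoid(label * logit)
    return total / n
```

Because each pair contributes its own term, the loss needs no normalization across the batch, which is the property that separates it from a softmax-based contrastive loss.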

🧮 Fundamental Architectures

RNNs - Recurrent Neural Networks
LSTM - Long Short-Term Memory networks
GRU - Gated Recurrent Units
Mixtral - Mixture of Experts architecture
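
The Mixture of Experts idea can be sketched as top-k routing: a router scores every expert for a token, only the k best experts run, and their outputs are mixed using a softmax over the selected scores. The sketch below assumes the router logits are already computed and models experts as plain Python callables; all names are hypothetical and it is not taken from the repository.

```python
import math


def top_k_gate(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax their logits.

    Returns {expert_index: weight}, with weights summing to 1.
    """
    ranked = sorted(
        range(len(router_logits)),
        key=lambda i: router_logits[i],
        reverse=True,
    )
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(token, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    gate = top_k_gate(router_logits, k)
    output = [0.0] * len(token)
    for index, weight in gate.items():
        expert_out = experts[index](token)
        output = [acc + weight * y for acc, y in zip(output, expert_out)]
    return output
```

With `k=2`, only two experts run per token regardless of how many exist, so the compute per token stays roughly constant while the total parameter count grows with the number of experts.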

🎵 Audio & Speech

Whisper - Speech recognition and transcription
TTS - Text-to-speech synthesis
Moonshine - Lightweight speech recognition for on-device use

⚡ Optimization & Training

DDP - Distributed Data Parallel training
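
The core idea behind Distributed Data Parallel training is that each worker computes gradients on its own shard of the batch, the gradients are averaged across workers, and every replica applies the same update. The sketch below simulates this in a single process with a one-parameter linear model. It is an illustration under those assumptions only: the function names are hypothetical and it does not use `torch.distributed`.

```python
def local_gradient(weight, shard):
    """Gradient of mean squared error for y = weight * x on one shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker receives the average."""
    return sum(values) / len(values)


def ddp_step(weight, shards, lr=0.1):
    """One synchronized step: local backward passes, then a shared update."""
    grads = [local_gradient(weight, shard) for shard in shards]
    return weight - lr * all_reduce_mean(grads)


def train(shards, steps=50, lr=0.1):
    """Repeat synchronized steps starting from weight = 0."""
    weight = 0.0
    for _ in range(steps):
        weight = ddp_step(weight, shards, lr)
    return weight
```

When the shards are the same size, the averaged gradient equals the gradient over the combined batch, which is why the replicas stay identical after every step.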

🎮 SmolHub Playground

Location: /smolhub/
Purpose: Experimental AI playground

An interactive space for:

Proof-of-Concept Models - Testing new ideas quickly
Educational Experiments - Learning-focused implementations
Rapid Prototyping - Fast iteration on AI concepts
Community Contributions - Collaborative learning space

📊 Curated Datasets

Location: /datasets/
Count: 5+ high-quality datasets

Featured Datasets:

ImageNet-Mini - 10K images, 100 classes for computer vision
SentimentFlow - Advanced sentiment analysis dataset
CodeSense - Programming language understanding
Anomaly Hunter - Anomaly detection scenarios
Multilingual QA - Cross-lingual question answering

Each dataset includes:

✅ Preprocessing Scripts - Ready-to-use data pipelines
✅ Documentation - Comprehensive usage guides
✅ Benchmarks - Performance baselines
✅ Visualization Tools - Data exploration utilities

🛠️ Technical Implementation

Backend Architecture

Framework: Jekyll with GitHub Pages compatibility
Dynamic Updates: Automated model/dataset synchronization
Performance: Optimized for fast loading and mobile responsiveness
SEO: Structured data and meta optimization

AI Model Integration

Automatic Discovery: GitHub API integration for real-time updates
Code Highlighting: Syntax highlighting for multiple programming languages
Documentation: Auto-generated documentation from README files
Version Control: Git-based versioning for all implementations
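
As a rough illustration of automatic discovery, the sketch below lists the top-level directories of a repository through the GitHub REST API contents endpoint and turns each one into a model entry. It assumes that each model lives in its own top-level directory; the helper names are hypothetical, and the site's actual integration may work differently.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, path=""):
    """Build the REST endpoint that lists a repository directory."""
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{path}".rstrip("/")


def fetch_entries(owner, repo, path=""):
    """Download and decode the directory listing (needs network access)."""
    request = urllib.request.Request(
        contents_url(owner, repo, path),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def model_pages(entries):
    """Keep visible directories, one per model, sorted by name."""
    return sorted(
        (
            {"name": entry["name"], "url": entry["html_url"]}
            for entry in entries
            if entry["type"] == "dir" and not entry["name"].startswith(".")
        ),
        key=lambda page: page["name"].lower(),
    )
```

A build script could call `model_pages(fetch_entries(owner, repo))` with the owner and repository name filled in, then write the result to a data file for the site generator to read. Unauthenticated requests to the GitHub API are rate limited, so a scheduled build is a safer fit than calling it on every page load.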

Deployment

Platform: Render.com with automatic deployments
CI/CD: GitHub Actions for continuous integration
ZeroGPU: Hugging Face ZeroGPU support for hosting models as Spaces

📈 Impact & Learning Outcomes

Educational Value

📚 Deep Understanding: Every model built from mathematical foundations
🔍 Research Translation: Converting papers into working code
🎯 Practical Application: Real-world implementation experience
🤝 Community Learning: Open-source contributions for others

Technical Skills Demonstrated

Programming: Python, PyTorch, TensorFlow, JavaScript, Ruby
Mathematics: Linear Algebra, Calculus, Statistics, Probability
Algorithms: Deep Learning, Machine Learning, Computer Vision, NLP
Software Engineering: Version Control, Testing, Documentation, Deployment

Research Engagement

📑 Paper Implementation: 38+ research papers translated to code
🔬 Experimental Design: Systematic approach to model development
📊 Performance Analysis: Benchmarking and optimization
📝 Documentation: Comprehensive technical writing

🚀 Getting Started

Explore the Portfolio

Visit the Website: Yuvraj’s Portfolio
Browse Models: Check out /models/ for from-scratch implementations
Try SmolHub: Experiment in /smolhub/ playground
Download Datasets: Access curated data in /datasets/

For Developers

# Clone the repository and enter it
git clone https://github.com/YuvrajSingh-mist/yuvraj-singh-portfolio.github.io.git
cd yuvraj-singh-portfolio.github.io

# Install dependencies
bundle install

# Run locally
bundle exec jekyll serve

# Visit localhost:4000

For AI Researchers

🔗 Source Code: Paper-Replications Repository
📖 Documentation: Each model includes comprehensive README
🤝 Collaboration: Issues and pull requests welcome
📧 Contact: Reach out for research discussions

🎯 Current Focus & Future Goals

Current Work

🔬 Fine-tuning LLMs - Advanced optimization techniques
📚 GAN Research - Exploring generative adversarial networks
🤖 Multimodal Models - Vision-language understanding
📊 Model Optimization - Efficiency and performance improvements

Seeking Opportunities

🎯 Research Internships - NLP and Computer Vision roles
💼 Full-time Positions - Research Engineer and Research Scientist roles in AI/ML
🤝 Collaborations - Open to research partnerships
🎓 Mentoring - Helping others start their AI journey

🤝 Connect & Collaborate

Professional Links

Email: yuvraj.mist@gmail.com
X: https://x.com/YuvrajS9886

Contributing

This portfolio is open-source and welcomes contributions:

🐛 Bug Reports: Issues and suggestions
💡 Feature Requests: New ideas and improvements
🤝 Code Contributions: Pull requests welcome
📚 Documentation: Help improve explanations

🏆 Recognition & Stats

🤖 38+ AI Models implemented from scratch
📊 5+ Datasets curated and documented
🌟 Open Source - All code freely available
🎓 Educational Impact - Helping others learn AI
🚀 Active Development - Continuously updated

📜 License

This portfolio and associated code are released under the MIT License. Feel free to use, modify, and distribute with appropriate attribution.

Yuvraj Singh