SmolHub Website

Data‑driven personal research portfolio

🤖 Yuvraj Singh’s AI Portfolio

🎓 Computer Science Engineering Student at IIIT Bhubaneswar (2023-2027)
🔬 Research Focus: NLP, Computer Vision, and Multimodal LLMs
🚀 Mission: Building AI from scratch and bridging research with practical implementation


🌟 What This Portfolio Showcases

This comprehensive portfolio website demonstrates my journey in AI/ML through hands-on implementations, research, and practical applications. Every model, dataset, and experiment represents hours of learning, coding, and understanding the fundamentals of artificial intelligence.

🧠 Core Philosophy

  • From Scratch Implementation: Building AI models without relying on pre-built model implementations, to truly understand the mathematics and algorithms
  • Paper-to-Code: Translating cutting-edge research papers into working implementations
  • Educational Focus: Creating resources that help others learn AI/ML concepts
  • Open Source: All implementations are freely available for the community

🎯 Key Features

🔬 From Scratch AI Models (38+ Implementations)

Location: /models/
Source: Paper-Replications Repository

A comprehensive collection of AI models implemented from scratch, covering:

🤖 Large Language Models & NLP

  • Transformer Architecture - The foundation of modern NLP
  • BERT - Bidirectional Encoder Representations from Transformers
  • GPT - Generative Pre-trained Transformer
  • Llama - Meta’s efficient language model architecture
  • Llama4 - Advanced multi-expert architecture with MoE
  • Kimi-K2 - DeepSeek V3 inspired model with advanced features
  • Gemma & Gemma3 - Google’s efficient language models
  • DeepSeekV3 - Latest in reasoning capabilities
  • Differential Transformer - Novel attention mechanisms
  • Attention Mechanisms - Core building blocks of modern AI
  • Encoder-Decoder - Seq2seq architectures
  • Seq2Seq - Sequence-to-sequence learning
  • Fine Tuning using PEFT - Parameter-efficient fine-tuning
  • LoRA - Low-Rank Adaptation techniques
  • DPO & ORPO - Advanced alignment techniques
  • SimplePO - Simplified preference optimization
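
Most of the models listed above are built on attention. As an illustration only (this is not code from the Paper-Replications repository), here is a minimal dependency-free sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)·V. The function names and the list-of-rows layout are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors."""
    d_k = len(keys[0])
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output row is the attention-weighted sum of the value rows.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
        )
    return outputs
```

With two identical keys the weights are equal, so the output is the plain average of the two value rows. A real implementation would use batched tensor operations and add masking and multiple heads.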

🎨 Computer Vision & Generative Models

  • Vision Transformer (ViT) - Transformers for image classification
  • CLIP & CLiP - Vision-language understanding
  • CLAP - Contrastive Language-Audio Pre-training
  • SigLip - Sigmoid loss for language-image pre-training
  • LLaVA - Large Language and Vision Assistant
  • PaliGemma - Vision-language model
  • Generative Adversarial Networks:
    • DCGANs - Deep Convolutional GANs
    • CGANs - Conditional GANs
    • CycleGANs - Unpaired image-to-image translation
    • WGANs - Wasserstein GANs
    • Pix2Pix - Paired image-to-image translation
  • VAE - Variational Autoencoders

🧮 Fundamental Architectures

  • RNNs - Recurrent Neural Networks
  • LSTM - Long Short-Term Memory networks
  • GRU - Gated Recurrent Units
  • Mixtral - Mixture of Experts architecture

🎵 Audio & Speech

  • Whisper - Speech recognition and transcription
  • TTS - Text-to-speech synthesis
  • Moonshine - Lightweight speech recognition model

Optimization & Training

  • DDP - Distributed Data Parallel training

🎮 SmolHub Playground

Location: /smolhub/
Purpose: Experimental AI playground

An interactive space for:

  • Proof-of-Concept Models - Testing new ideas quickly
  • Educational Experiments - Learning-focused implementations
  • Rapid Prototyping - Fast iteration on AI concepts
  • Community Contributions - Collaborative learning space

📊 Curated Datasets

Location: /datasets/
Count: 5+ high-quality datasets

  1. ImageNet-Mini - 10K images, 100 classes for computer vision
  2. SentimentFlow - Advanced sentiment analysis dataset
  3. CodeSense - Programming language understanding
  4. Anomaly Hunter - Anomaly detection scenarios
  5. Multilingual QA - Cross-lingual question answering

Each dataset includes:

  • Preprocessing Scripts - Ready-to-use data pipelines
  • Documentation - Comprehensive usage guides
  • Benchmarks - Performance baselines
  • Visualization Tools - Data exploration utilities

🛠️ Technical Implementation

Backend Architecture

  • Framework: Jekyll with GitHub Pages compatibility
  • Dynamic Updates: Automated model/dataset synchronization
  • Performance: Optimized for fast loading and mobile responsiveness
  • SEO: Structured data and meta optimization
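
The site's actual structured-data markup is not shown here. As a hedged sketch of what "structured data" can mean in practice, the snippet below builds a schema.org Person record as JSON-LD and wraps it in a script tag that a Jekyll layout could include. The helper names (person_json_ld, as_script_tag) and the choice of fields are assumptions made for this example.

```python
import json


def person_json_ld(name, affiliation, topics, url=None):
    """Build a schema.org Person description as a plain dict."""
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "affiliation": {"@type": "CollegeOrUniversity", "name": affiliation},
        "knowsAbout": list(topics),
    }
    if url:
        # Only emit a URL when one is actually supplied.
        data["url"] = url
    return data


def as_script_tag(data):
    """Serialize the dict into a JSON-LD script tag for the page head."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

For example, as_script_tag(person_json_ld("Yuvraj Singh", "IIIT Bhubaneswar", ["NLP", "Computer Vision", "Multimodal LLMs"])) yields a tag that can be pasted into a layout's head section.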

AI Model Integration

  • Automatic Discovery: GitHub API integration for real-time updates
  • Code Highlighting: Syntax highlighting for multiple programming languages
  • Documentation: Auto-generated documentation from README files
  • Version Control: Git-based versioning for all implementations
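
The automatic discovery step could look roughly like the sketch below. It is an illustration, not the site's actual sync script. It assumes each model lives in its own top-level directory of the source repository, and it uses the GitHub REST "repository contents" endpoint. The function names and the owner and repo arguments are placeholders for this example.

```python
import json
import urllib.request

# GitHub REST endpoint for listing the top level of a repository.
API_URL = "https://api.github.com/repos/{owner}/{repo}/contents/"


def fetch_contents(owner, repo):
    """Download the top-level listing of a repository (network call)."""
    request = urllib.request.Request(
        API_URL.format(owner=owner, repo=repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def to_model_entries(contents):
    """Keep visible directories and turn each into a model entry."""
    entries = []
    for item in contents:
        # Skip files and hidden directories such as .github.
        if item.get("type") != "dir" or item.get("name", "").startswith("."):
            continue
        entries.append({"name": item["name"], "url": item.get("html_url", "")})
    # Sort case-insensitively so the listing is stable between runs.
    return sorted(entries, key=lambda entry: entry["name"].lower())
```

A build step could call to_model_entries(fetch_contents(owner, repo)) and write the result to a Jekyll data file. Unauthenticated API requests are rate-limited, so a scheduled job would normally send a token.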

Deployment

  • Platform: Render.com with automatic deployments
  • CI/CD: GitHub Actions for continuous integration
  • ZeroGPU: Hugging Face ZeroGPU support for hosting models as Spaces

📈 Impact & Learning Outcomes

Educational Value

  • 📚 Deep Understanding: Every model built from mathematical foundations
  • 🔍 Research Translation: Converting papers into working code
  • 🎯 Practical Application: Real-world implementation experience
  • 🤝 Community Learning: Open-source contributions for others

Technical Skills Demonstrated

  • Programming: Python, PyTorch, TensorFlow, JavaScript, Ruby
  • Mathematics: Linear Algebra, Calculus, Statistics, Probability
  • Algorithms: Deep Learning, Machine Learning, Computer Vision, NLP
  • Software Engineering: Version Control, Testing, Documentation, Deployment

Research Engagement

  • 📑 Paper Implementation: 38+ research papers translated to code
  • 🔬 Experimental Design: Systematic approach to model development
  • 📊 Performance Analysis: Benchmarking and optimization
  • 📝 Documentation: Comprehensive technical writing

🚀 Getting Started

Explore the Portfolio

  1. Visit the Website: Yuvraj’s Portfolio
  2. Browse Models: Check out /models/ for from-scratch implementations
  3. Try SmolHub: Experiment in /smolhub/ playground
  4. Download Datasets: Access curated data in /datasets/

For Developers

# Clone the repository and enter it
git clone https://github.com/YuvrajSingh-mist/yuvraj-singh-portfolio.github.io.git
cd yuvraj-singh-portfolio.github.io

# Install dependencies
bundle install

# Run locally
bundle exec jekyll serve

# Visit http://localhost:4000

For AI Researchers

  • 🔗 Source Code: Paper-Replications Repository
  • 📖 Documentation: Each model includes comprehensive README
  • 🤝 Collaboration: Issues and pull requests welcome
  • 📧 Contact: Reach out for research discussions

🎯 Current Focus & Future Goals

Current Work

  • 🔬 Fine-tuning LLMs - Advanced optimization techniques
  • 📚 GAN Research - Exploring generative adversarial networks
  • 🤖 Multimodal Models - Vision-language understanding
  • 📊 Model Optimization - Efficiency and performance improvements

Seeking Opportunities

  • 🎯 Research Internships - NLP and Computer Vision roles
  • 💼 Full-time Positions - Research Engineer / Research Scientist roles in AI/ML
  • 🤝 Collaborations - Open to research partnerships
  • 🎓 Mentoring - Helping others start their AI journey

🤝 Connect & Collaborate

Professional Links

  • Email: yuvraj.mist@gmail.com
  • X: https://x.com/YuvrajS9886

Contributing

This portfolio is open-source and welcomes contributions:

  • 🐛 Bug Reports: Issues and suggestions
  • 💡 Feature Requests: New ideas and improvements
  • 🤝 Code Contributions: Pull requests welcome
  • 📚 Documentation: Help improve explanations

🏆 Recognition & Stats

  • 🤖 38+ AI Models implemented from scratch
  • 📊 5+ Datasets curated and documented
  • 🌟 Open Source - All code freely available
  • 🎓 Educational Impact - Helping others learn AI
  • 🚀 Active Development - Continuously updated

📜 License

This portfolio and associated code are released under the MIT License. Feel free to use, modify, and distribute with appropriate attribution.