huggingface / smollm
Everything about the SmolLM and SmolVLM family of models
★2,803 · Updated last week
Alternatives and similar repositories for smollm
Users who are interested in smollm are comparing it to the repositories listed below.
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ★1,309 · Updated 2 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. ★1,520 · Updated last week
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ★1,498 · Updated this week
- Sky-T1: Train your own O1 preview model within $450 ★3,305 · Updated this week
- The Open Cookbook for Top-Tier Code Large Language Model ★1,754 · Updated 7 months ago
- nanoGPT style version of Llama 3.1 ★1,397 · Updated 11 months ago
- Code for BLT research paper ★1,736 · Updated last month
- Synthetic data curation for post-training and structured data extraction ★1,446 · Updated last week
- Democratizing Reinforcement Learning for LLMs ★3,801 · Updated this week
- Fast State-of-the-Art Static Embeddings ★1,756 · Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ★2,806 · Updated this week
- Large Concept Models: Language modeling in a sentence representation space ★2,246 · Updated 5 months ago
- The simplest, fastest repository for training/finetuning small-sized VLMs. ★3,726 · Updated last week
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ★3,274 · Updated last month
- Textbook on reinforcement learning from human feedback ★1,097 · Updated this week
- Bringing BERT into modernity via both architecture changes and scaling ★1,442 · Updated 2 weeks ago
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! ★1,126 · Updated 5 months ago
- [CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents ★1,754 · Updated last month
- NanoGPT (124M) in 3 minutes ★2,811 · Updated this week
- Optimizing inference proxy for LLMs ★2,615 · Updated last week
- Implementing DeepSeek R1's GRPO algorithm from scratch ★1,479 · Updated 3 months ago
- s1: Simple test-time scaling ★6,501 · Updated 3 weeks ago
- Run PyTorch LLMs locally on servers, desktop and mobile ★3,600 · Updated last week
- LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning ★2,026 · Updated this week
- LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee… ★2,957 · Updated last month
- Things you can do with the token embeddings of an LLM ★1,445 · Updated 3 months ago
- Codebase for Aria - an Open Multimodal Native MoE ★1,058 · Updated 5 months ago
- DataComp for Language Models ★1,324 · Updated 3 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ★1,722 · Updated last week
- Minimalistic large language model 3D-parallelism training ★2,034 · Updated last week