NousResearch / DisTrO
Distributed Training Over-The-Internet
☆881 Updated 3 months ago
Alternatives and similar repositories for DisTrO:
Users who are interested in DisTrO are comparing it to the libraries listed below.
- OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training ☆455 Updated last month
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆663 Updated this week
- VPTQ, a flexible and extreme low-bit quantization algorithm ☆593 Updated this week
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆977 Updated last month
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆849 Updated last week
- ☆1,006 Updated 2 months ago
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆872 Updated this week
- noise_step: Training in 1.58b With No Gradient Memory ☆216 Updated 2 months ago
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆434 Updated this week
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆421 Updated 5 months ago
- Aidan Bench attempts to measure <big_model_smell> in LLMs. ☆279 Updated last week
- Open weights language model from Google DeepMind, based on Griffin. ☆622 Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆294 Updated 4 months ago
- Long context evaluation for large language models ☆200 Updated last week
- ☆815 Updated 5 months ago
- Textbook on reinforcement learning from human feedback ☆466 Updated this week
- Official implementation of Half-Quadratic Quantization (HQQ) ☆760 Updated last week
- This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models? ☆951 Updated last month
- ☆679 Updated last month
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,255 Updated 2 weeks ago
- Everything about the SmolLM2 and SmolVLM family of models ☆1,969 Updated last week
- DeMo: Decoupled Momentum Optimization ☆181 Updated 3 months ago
- Optimizing inference proxy for LLMs ☆2,070 Updated this week
- Minimalistic large language model 3D-parallelism training ☆1,630 Updated this week
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,032 Updated this week
- Recipes to scale inference-time compute of open models ☆1,019 Updated last week