NousResearch / DisTrO
Distributed Training Over-The-Internet
☆935 · Updated 3 weeks ago
Alternatives and similar repositories for DisTrO
Users interested in DisTrO are comparing it to the repositories listed below.
- OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training ☆504 · Updated 4 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆757 · Updated 2 weeks ago
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆447 · Updated this week
- Deep learning for dummies. All the practical details and useful utilities that go into working with real models. ☆795 · Updated last month
- ☆722 · Updated 2 weeks ago
- Open-weights language model from Google DeepMind, based on Griffin. ☆640 · Updated this week
- System 2 Reasoning Link Collection ☆835 · Updated 2 months ago
- prime-rl is a codebase for decentralized async RL training at scale. ☆318 · Updated this week
- ☆536 · Updated 9 months ago
- Official implementation of Half-Quadratic Quantization (HQQ) ☆818 · Updated this week
- Procedural reasoning datasets ☆770 · Updated this week
- Minimalistic large language model 3D-parallelism training ☆1,898 · Updated last week
- Long-context evaluation for large language models ☆213 · Updated 3 months ago
- ☆895 · Updated 8 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆443 · Updated 8 months ago
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆1,518 · Updated this week
- nanoGPT-style version of Llama 3.1 ☆1,372 · Updated 9 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆879 · Updated last month
- VPTQ, a flexible and extreme low-bit quantization algorithm ☆639 · Updated last month
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,441 · Updated this week
- UNet diffusion model in pure CUDA ☆606 · Updated 11 months ago
- Code to train and evaluate Neural Attention Memory Models to obtain universally applicable memory systems for transformers. ☆310 · Updated 7 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆981 · Updated 10 months ago
- Testing baseline LLM performance across various models ☆270 · Updated last week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆333 · Updated 5 months ago
- Pretraining code for a large-scale depth-recurrent language model ☆776 · Updated last week
- ☆536 · Updated 7 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,301 · Updated last month
- The Autograd Engine ☆609 · Updated 8 months ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real time! ☆1,074 · Updated 4 months ago
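One entry above describes memory layers: a trainable key-value lookup that adds parameters without adding FLOPs, because only a few memory slots are touched per token. A minimal NumPy sketch of that idea (forward pass only; all names, sizes, and the residual update are illustrative assumptions, not code from the listed repository):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d_model, n_slots, k = 64, 4096, 4

# Trainable in a real model; random here for illustration only.
keys = rng.normal(scale=0.02, size=(n_slots, d_model))
values = rng.normal(scale=0.02, size=(n_slots, d_model))

def memory_layer(x):
    """x: (batch, d_model). Score all slots, but read only the top-k,
    so parameter count grows with n_slots while per-token compute
    on the value side stays proportional to k."""
    scores = x @ keys.T                                # (batch, n_slots)
    idx = np.argpartition(-scores, k, axis=-1)[:, :k]  # top-k slot indices
    top = np.take_along_axis(scores, idx, axis=-1)     # (batch, k)
    w = softmax(top)                                   # sparse attention weights
    picked = values[idx]                               # (batch, k, d_model)
    return x + (w[..., None] * picked).sum(axis=1)     # residual update

x = rng.normal(size=(2, d_model))
out = memory_layer(x)
print(out.shape)  # (2, 64)
```

Growing `n_slots` adds capacity (more key-value parameters) while the per-token work remains dominated by one `(batch, n_slots)` score matmul plus a k-sized gather, which is the trade-off the blurb alludes to.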