NousResearch / DisTrO
Distributed Training Over-The-Internet
☆920 · Updated 5 months ago
Alternatives and similar repositories for DisTrO
Users who are interested in DisTrO compare it to the libraries listed below.
- OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training ☆498 · Updated 4 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆743 · Updated last week
- Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse … ☆357 · Updated this week
- Minimalistic 4D-parallelism distributed training framework for education purpose ☆1,464 · Updated 2 months ago
- A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time! ☆1,058 · Updated 3 months ago
- noise_step: Training in 1.58b With No Gradient Memory ☆219 · Updated 4 months ago
- Pretraining code for a large-scale depth-recurrent language model ☆760 · Updated last month
- DeMo: Decoupled Momentum Optimization ☆186 · Updated 5 months ago
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆871 · Updated 2 weeks ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆439 · Updated 7 months ago
- procedural reasoning datasets ☆580 · Updated this week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆307 · Updated 6 months ago
- Recipes to scale inference-time compute of open models ☆1,071 · Updated last week
- Tile primitives for speedy kernels ☆2,339 · Updated this week
- NVIDIA Linux open GPU with P2P support ☆1,142 · Updated last week
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆492 · Updated this week
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆327 · Updated 5 months ago
- nanoGPT style version of Llama 3.1 ☆1,367 · Updated 9 months ago
- VPTQ, A Flexible and Extreme low-bit quantization algorithm ☆633 · Updated 3 weeks ago
- Entropy Based Sampling and Parallel CoT Decoding ☆3,368 · Updated 6 months ago
- Open weights language model from Google DeepMind, based on Griffin. ☆640 · Updated 2 months ago
- prime-rl is a codebase for decentralized RL training at scale ☆211 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,109 · Updated 3 months ago
- System 2 Reasoning Link Collection ☆833 · Updated 2 months ago
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆512 · Updated 6 months ago
- Official implementation of Half-Quadratic Quantization (HQQ) ☆810 · Updated this week
- An open infrastructure to democratize and decentralize the development of superintelligence for humanity. ☆310 · Updated this week
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆608 · Updated last month
- Minimalistic large language model 3D-parallelism training ☆1,870 · Updated this week
- ☆714 · Updated last week