NousResearch / DisTrO
Distributed Training Over-The-Internet
☆862 · Updated last month
Alternatives and similar repositories for DisTrO:
Users interested in DisTrO are comparing it to the libraries listed below.
- OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training ☆424 · Updated 2 weeks ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆626 · Updated this week
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! ☆801 · Updated last week
- VPTQ, A Flexible and Extreme low-bit quantization algorithm ☆569 · Updated last week
- noise_step: Training in 1.58b With No Gradient Memory ☆214 · Updated last month
- Training Large Language Model to Reason in a Continuous Latent Space ☆735 · Updated this week
- Open weights language model from Google DeepMind, based on Griffin. ☆614 · Updated 6 months ago
- Optimizing inference proxy for LLMs ☆1,955 · Updated this week
- Recipes to scale inference-time compute of open models ☆971 · Updated last week
- Synthetic Data curation for post-training and structured data extraction ☆539 · Updated this week
- NVIDIA Linux open GPU with P2P support ☆993 · Updated last month
- veRL: Volcano Engine Reinforcement Learning for LLM ☆1,135 · Updated this week
- Official implementation of Half-Quadratic Quantization (HQQ) ☆736 · Updated 2 weeks ago
- Minimalistic 4D-parallelism distributed training framework for educational purposes ☆670 · Updated this week
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling"β833Updated last week
- Minimalistic large language model 3D-parallelism trainingβ1,400Updated this week
- Testing baseline LLM performance across various models ☆211 · Updated this week
- A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations ☆845 · Updated 2 months ago
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆394 · Updated 4 months ago
- nanoGPT style version of Llama 3.1 ☆1,300 · Updated 5 months ago
- This repo contains the source code for RULER: What's the Real Context Size of Your Long-Context Language Models? ☆866 · Updated last month
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers. ☆280 · Updated 3 months ago
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆881 · Updated this week
- Tile primitives for speedy kernels ☆1,966 · Updated this week
- ☆497 · Updated 5 months ago
- ☆780 · Updated 4 months ago
- Everything about the SmolLM2 and SmolVLM family of models ☆1,632 · Updated this week
- GRadient-INformed MoE ☆261 · Updated 4 months ago
- A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and full… ☆605 · Updated 2 months ago
- Code for BLT research paper ☆1,352 · Updated this week