Aleph-Alpha / scaling

Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for training large language models.

☆58

Alternatives and similar repositories for scaling:

Users who are interested in scaling are comparing it to the libraries listed below:

EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆103 Updated 4 months ago
google-deepmind / mishax
☆124 Updated last week
epfml / DenseFormer
☆79 Updated 11 months ago
EleutherAI / improved-t5
Experiments for efforts to train a new and improved T5
☆77 Updated 11 months ago
AblateIt / finetune-study
Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full fine-tunes.
☆82 Updated last year
JoeLi12345 / nGPT
An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆91 Updated 3 weeks ago
google-deepmind / asyncdiloco
☆43 Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆85 Updated 2 weeks ago
NousResearch / StripedHyenaTrainer
☆60 Updated last year
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆62 Updated 7 months ago
euclaise / supertrainer2000
☆49 Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆108 Updated 3 months ago
arcee-ai / DAM
☆48 Updated 4 months ago
SalesforceAIResearch / LaTRO
☆111 Updated last month
apple / ml-planner
☆47 Updated last year
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated last month
kanishkg / stream-of-search
Repository for the paper Stream of Search: Learning to Search in Language
☆142 Updated 2 months ago
cohere-ai / magikarp
Code for the paper "Fishing for Magikarp"
☆151 Updated 2 weeks ago
jonhue / activeft
PyTorch library for Active Fine-Tuning
☆62 Updated last month
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆81 Updated 3 weeks ago
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆91 Updated this week
microsoft / mutransformers
Some common Hugging Face transformers in maximal update parametrization (µP)
☆80 Updated 3 years ago
JoshEngels / MultiDimensionalFeatures
Code for reproducing our paper "Not All Language Model Features Are Linear"
☆73 Updated 4 months ago
KhoomeiK / complexity-scaling
gzip Predicts Data-dependent Scaling Laws
☆34 Updated 10 months ago
joey00072 / ohara
Collection of autoregressive model implementations
☆83 Updated last month
yixiaoer / tpux
A set of Python scripts that makes your experience on TPU better
☆50 Updated 9 months ago
kyleliang919 / Online-Subspace-Descent
This repo is based on https://github.com/jiaweizzhao/GaLore
☆26 Updated 6 months ago
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆253 Updated 8 months ago
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆96 Updated last year
cognitivecomputations / spectrum
☆112 Updated 6 months ago