Aleph-Alpha / scaling
Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for training large language models.
☆58 Updated 5 months ago
Alternatives and similar repositories for scaling:
Users who are interested in scaling are comparing it to the libraries listed below:
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆103 Updated 4 months ago
- ☆124 Updated last week
- ☆79 Updated 11 months ago
- Experiments for efforts to train a new and improved t5 ☆77 Updated 11 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- ☆43 Updated last year
- EvaByte: Efficient Byte-level Language Models at Scale ☆85 Updated 2 weeks ago
- ☆60 Updated last year
- σ-GPT: A New Approach to Autoregressive Models ☆62 Updated 7 months ago
- ☆49 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆108 Updated 3 months ago
- ☆48 Updated 4 months ago
- ☆111 Updated last month
- ☆47 Updated last year
- Simple GRPO scripts and configurations. ☆59 Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language ☆142 Updated 2 months ago
- Code for the paper "Fishing for Magikarp" ☆151 Updated 2 weeks ago
- PyTorch library for Active Fine-Tuning ☆62 Updated last month
- Train your own SOTA deductive reasoning model ☆81 Updated 3 weeks ago
- nanoGPT-like codebase for LLM training ☆91 Updated this week
- some common Huggingface transformers in maximal update parametrization (µP) ☆80 Updated 3 years ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆73 Updated 4 months ago
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated 10 months ago
- Collection of autoregressive model implementation ☆83 Updated last month
- A set of Python scripts that makes your experience on TPU better ☆50 Updated 9 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆26 Updated 6 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 Updated 8 months ago
- Language models scale reliably with over-training and on downstream tasks ☆96 Updated last year
- ☆112 Updated 6 months ago