mlfoundations / open_lm
A repository for research on medium-sized language models.
☆479 · Updated this week
Related projects
Alternatives and complementary repositories for open_lm
- Scaling Data-Constrained Language Models ☆321 · Updated last month
- ☆451 · Updated 3 weeks ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆647 · Updated last month
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch ☆476 · Updated 3 weeks ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆293 · Updated 11 months ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs); a minimal DPO sketch follows this list. ☆744 · Updated this week
- Inference code for Persimmon-8B ☆416 · Updated last year
- Minimalistic large language model 3D-parallelism training ☆1,260 · Updated this week
- Transformers with Arbitrarily Large Context ☆641 · Updated 3 months ago
- Batched LoRAs ☆336 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs (one packing approach is sketched after this list) ☆178 · Updated 3 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆257 · Updated 4 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ☆537 · Updated 6 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆252 · Updated last year
- Scalable toolkit for efficient model alignment ☆620 · Updated this week
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in PyTorch ☆293 · Updated 5 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆236 · Updated 4 months ago
- A bibliography and survey of the papers surrounding o1 ☆754 · Updated this week
- RewardBench: the first evaluation tool for reward models. ☆431 · Updated 3 weeks ago
- Official PyTorch implementation of QA-LoRA ☆117 · Updated 8 months ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆438 · Updated 8 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆307 · Updated 7 months ago
- A bagel, with everything. ☆312 · Updated 7 months ago
- 🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash… ☆193 · Updated this week
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆558 · Updated 8 months ago
- PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention… ☆280 · Updated 6 months ago
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ☆811 · Updated this week
- ☆470 · Updated 2 months ago
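
Several entries above (the HALOs library, the alignment toolkit, RewardBench) revolve around preference-based losses. As a point of reference, here is a minimal sketch of the DPO objective such libraries implement (Rafailov et al., 2023). This is not the HALOs API; the function name and tensor layout are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss.

    Inputs are summed per-token log-probabilities of the chosen/rejected
    completions under the trainable policy and the frozen reference model.
    """
    # Implicit rewards: scaled log-ratios of policy vs. reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the log-sigmoid of the chosen-minus-rejected margin.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```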
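
The multipack sampler entry concerns packing variable-length sequences into a fixed token budget so batches carry no padding. One simple way to do this is first-fit-decreasing bin packing, sketched below; this is a generic illustration, not the repository's exact algorithm.

```python
def pack_first_fit_decreasing(lengths: list[int], max_tokens: int) -> list[list[int]]:
    """Pack sequence indices into bins whose token totals stay under max_tokens."""
    order = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    bins: list[list[int]] = []   # indices of sequences per packed batch
    loads: list[int] = []        # running token count per bin
    for i in order:
        for b, load in enumerate(loads):
            if load + lengths[i] <= max_tokens:
                bins[b].append(i)        # sequence fits in an existing bin
                loads[b] += lengths[i]
                break
        else:
            bins.append([i])             # open a new bin when nothing fits
            loads.append(lengths[i])
    return bins

# Example: pack sequences of lengths 900, 400, 300, 120 under a 1024-token budget.
print(pack_first_fit_decreasing([900, 400, 300, 120], 1024))  # [[0, 3], [1, 2]]
```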