mlfoundations/open_lm
A repository for research on medium-sized language models.
☆493 · Updated 2 months ago

Alternatives and similar repositories for open_lm:
Users interested in open_lm are comparing it to the libraries listed below.
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆705 · Updated 5 months ago
- Scaling Data-Constrained Language Models ☆333 · Updated 6 months ago
- ☆501 · Updated 4 months ago
- Minimalistic large language model 3D-parallelism training ☆1,701 · Updated this week
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆298 · Updated last year
- Official repository for ORPO ☆445 · Updated 9 months ago
- Large Context Attention ☆693 · Updated 2 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 · Updated 8 months ago
- Scalable toolkit for efficient model alignment ☆750 · Updated this week
- batched loras ☆340 · Updated last year
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆817 · Updated 2 weeks ago
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch ☆506 · Updated 4 months ago
- A bagel, with everything. ☆317 · Updated 11 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 · Updated 7 months ago
- distributed trainer for LLMs ☆567 · Updated 10 months ago
- RewardBench: the first evaluation tool for reward models. ☆526 · Updated 3 weeks ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆405 · Updated 11 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,216 · Updated 2 weeks ago
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" ☆454 · Updated last year
- Official PyTorch implementation of QA-LoRA ☆129 · Updated last year
- Codebase for Merging Language Models (ICML 2024) ☆801 · Updated 10 months ago
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning ☆393 · Updated 10 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆233 · Updated 4 months ago
- ☆512 · Updated 7 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆597 · Updated last year
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" ☆855 · Updated last month
- [ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning ☆646 · Updated 9 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆386 · Updated 4 months ago
- Official code for ReLoRA from the paper "Stack More Layers Differently: High-Rank Training Through Low-Rank Updates" ☆448 · Updated 11 months ago
- Extend existing LLMs way beyond the original training length with constant memory usage, without retraining ☆691 · Updated 11 months ago