mlfoundations / open_lm
A repository for research on medium-sized language models.
☆495 · Updated 3 weeks ago
Alternatives and similar repositories for open_lm
Users interested in open_lm are comparing it to the libraries listed below.
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆302 · Updated last year
- Scaling Data-Constrained Language Models ☆334 · Updated 8 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆257 · Updated 10 months ago
- ☆517 · Updated 6 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆267 · Updated 3 weeks ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆725 · Updated 8 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day ☆254 · Updated last year
- Inference code for Persimmon-8B ☆415 · Updated last year
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆513 · Updated 2 weeks ago
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆461 · Updated last year
- Large Context Attention ☆711 · Updated 4 months ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆188 · Updated 9 months ago
- A bagel, with everything. ☆320 · Updated last year
- ☆536 · Updated 9 months ago
- Scalable toolkit for efficient model alignment ☆803 · Updated last week
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ☆876 · Updated last month
- RuLES: a benchmark for evaluating rule-following in language models ☆224 · Updated 3 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆611 · Updated last year
- Official PyTorch implementation of QA-LoRA ☆135 · Updated last year
- A project to improve skills of large language models ☆413 · Updated this week
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ☆1,249 · Updated 2 months ago
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). ☆849 · Updated last week
- The official evaluation suite and dynamic data release for MixEval. ☆241 · Updated 6 months ago
- Distributed trainer for LLMs ☆575 · Updated last year
- Official repository for ORPO ☆453 · Updated last year
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… (see the sketch after this list) ☆333 · Updated 5 months ago
- DSIR large-scale data selection framework for language model training ☆249 · Updated last year
- Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch ☆408 · Updated 4 months ago
- RewardBench: the first evaluation tool for reward models. ☆582 · Updated this week
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆396 · Updated last year
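The memory-layers entry above describes a trainable key-value lookup that adds parameters to a model without a matching increase in per-token compute. Below is a minimal PyTorch sketch of that idea, not that repository's actual code: the module name, sizes, and the dense key scoring are illustrative assumptions (real memory-layer implementations typically use product-key lookup so the key search itself stays cheap).

```python
# Minimal sketch of a trainable key-value memory layer (illustrative, not the
# repository's implementation). Tokens query a large learned key table, keep
# only the top-k matching slots, and read out a weighted sum of their values.
# Note: scoring against every key is done densely here for brevity; the
# technique described in the listing relies on a sparse (product-key) lookup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryLayer(nn.Module):
    def __init__(self, d_model: int, num_slots: int = 65536, topk: int = 32):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_slots, d_model) * d_model ** -0.5)
        self.values = nn.Parameter(torch.randn(num_slots, d_model) * d_model ** -0.5)
        self.query_proj = nn.Linear(d_model, d_model, bias=False)
        self.topk = topk

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        q = self.query_proj(x)                           # project tokens to query space
        scores = q @ self.keys.T                         # similarity to every memory key
        top_scores, top_idx = scores.topk(self.topk, dim=-1)
        weights = F.softmax(top_scores, dim=-1)          # normalize over the selected slots
        selected = self.values[top_idx]                  # (batch, seq, topk, d_model)
        out = (weights.unsqueeze(-1) * selected).sum(dim=-2)
        return x + out                                   # residual: memory read added to the stream

# Toy forward pass
layer = MemoryLayer(d_model=64, num_slots=1024, topk=8)
tokens = torch.randn(2, 16, 64)
print(layer(tokens).shape)  # torch.Size([2, 16, 64])
```

Growing `num_slots` adds capacity (more key-value parameters) while each token still reads only `topk` slots, which is the trade-off the listing's description points at.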