LLM360 / k2-train
☆50Updated 11 months ago
Alternatives and similar repositories for k2-train
Users who are interested in k2-train are comparing it to the libraries listed below.
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 5 months ago
- Verifiers for LLM Reinforcement Learning☆55Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- ☆79Updated 9 months ago
- ☆47Updated 9 months ago
- A repository for research on medium-sized language models.☆76Updated last year
- This is the official repository for Inheritune.☆111Updated 3 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆69Updated 2 weeks ago
- Exploration of automated dataset selection approaches at large scales.☆41Updated 3 months ago
- ☆46Updated 3 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore☆28Updated 8 months ago
- ☆65Updated 2 months ago
- EvaByte: Efficient Byte-level Language Models at Scale☆98Updated last month
- ☆72Updated last month
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆31Updated 2 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆110Updated 2 weeks ago
- General Reasoner: Advancing LLM Reasoning Across All Domains☆126Updated this week
- Aioli: A unified optimization framework for language model data mixing☆25Updated 4 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆47Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆32Updated 2 months ago
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…☆149Updated last month
- The first dense retrieval model that can be prompted like an LM☆73Updated 3 weeks ago
- ☆34Updated 11 months ago
- ☆49Updated 6 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding☆51Updated 5 months ago
- ☆50Updated 7 months ago
- ☆56Updated 2 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆59Updated last month
- Language models scale reliably with over-training and on downstream tasks☆97Updated last year
- The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"☆53Updated last week