arcee-ai / DAM
☆55Updated last year
Alternatives and similar repositories for DAM
Users who are interested in DAM are comparing it to the libraries listed below
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆62Updated last year
- A repository for research on medium sized language models.☆77Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆35Updated last year
- Simple GRPO scripts and configurations.☆59Updated 11 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆59Updated 2 months ago
- ☆39Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆80Updated last year
- Verifiers for LLM Reinforcement Learning☆80Updated 8 months ago
- Data preparation code for CrystalCoder 7B LLM☆45Updated last year
- ☆48Updated last year
- entropix style sampling + GUI☆27Updated last year
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…☆27Updated last year
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"☆31Updated 7 months ago
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆37Updated 3 months ago
- When Reasoning Meets Its Laws☆33Updated last week
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 8 months ago
- Aioli: A unified optimization framework for language model data mixing☆32Updated 11 months ago
- Simple repository for training small reasoning models☆47Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification☆111Updated last year
- ☆53Updated 11 months ago
- Using multiple LLMs for ensemble Forecasting☆16Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- Source code for the collaborative reasoner research project at Meta FAIR.☆111Updated 8 months ago
- Train your own SOTA deductive reasoning model☆107Updated 10 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆70Updated 2 years ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval☆51Updated last year
- ☆17Updated 9 months ago
- Small, simple agent task environments for training and evaluation☆19Updated last year