0xallam / Direct-Preference-Optimization
Direct Preference Optimization from scratch in PyTorch
☆116 · Updated 6 months ago
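For context, the objective a from-scratch PyTorch implementation of DPO centers on is shown below as a minimal sketch. The function name, argument names, and the default β value are illustrative assumptions, not code taken from this repository.

```python
# Minimal sketch of the DPO loss (Rafailov et al., 2023).
# All names and the default beta are illustrative assumptions.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Each argument is a (batch,) tensor of summed token log-probabilities."""
    # Implicit rewards: log-prob margin of the policy over the frozen reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Maximize the log-sigmoid of the scaled gap between chosen and rejected responses.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```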
Alternatives and similar repositories for Direct-Preference-Optimization
Users interested in Direct-Preference-Optimization are comparing it to the libraries listed below
- A Survey on Data Selection for Language Models ☆250 · Updated 5 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆259 · Updated last year
- ☆211 · Updated 8 months ago
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey` ☆190 · Updated 2 months ago
- ☆271 · Updated last year
- Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models" ☆522 · Updated 9 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆148 · Updated 8 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆121 · Updated last year
- ☆323 · Updated 4 months ago
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models ☆109 · Updated 3 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆192 · Updated last year
- Code for STaR: Bootstrapping Reasoning With Reasoning (NeurIPS 2022) ☆214 · Updated 2 years ago
- ☆280 · Updated 9 months ago
- Critique-out-Loud Reward Models ☆70 · Updated last year
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning ☆496 · Updated last year
- LLM-Merging: Building LLMs Efficiently through Merging ☆204 · Updated last year
- Function Vectors in Large Language Models (ICLR 2024) ☆181 · Updated 6 months ago
- Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs. ☆439 · Updated last year
- RewardBench: the first evaluation tool for reward models. ☆643 · Updated 4 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆82 · Updated 9 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆266 · Updated last year
- ☆210 · Updated 7 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language … ☆125 · Updated 5 months ago
- For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research. ☆157 · Updated this week
- A brief and partial summary of RLHF algorithms. ☆132 · Updated 7 months ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆92 · Updated last year
- The repo for In-context Autoencoder ☆148 · Updated last year
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆165 · Updated last year
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆174 · Updated 5 months ago
- Must-read Papers on Large Language Model (LLM) Continual Learning ☆146 · Updated last year