liushunyu / awesome-direct-preference-optimization
A Survey of Direct Preference Optimization (DPO)
☆38Updated last month
Alternatives and similar repositories for awesome-direct-preference-optimization
Users interested in awesome-direct-preference-optimization are comparing it to the repositories listed below
- FusionBench: A Comprehensive Benchmark/Toolkit of Deep Model Fusion☆128Updated last week
- Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples☆33Updated last month
- [NeurIPS 2024] GITA: Graph to Image-Text Integration for Vision-Language Graph Reasoning☆49Updated 6 months ago
- Awesome-Efficient-Inference-for-LRMs is a collection of state-of-the-art, novel, exciting, token-efficient methods for Large Reasoning Mo…☆63Updated last month
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?"☆35Updated 4 months ago
- Accepted LLM Papers in NeurIPS 2024☆37Updated 7 months ago
- RFTT: Reasoning with Reinforced Functional Token Tuning☆27Updated last month
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆18Updated 2 weeks ago
- AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)☆214Updated 3 weeks ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆59Updated 6 months ago
- ☆56Updated last month
- AdaMerging: Adaptive Model Merging for Multi-Task Learning. ICLR, 2024.☆80Updated 6 months ago
- ☆145Updated 8 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization☆76Updated 8 months ago
- Survey on Data-centric Large Language Models☆83Updated 10 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆45Updated this week
- ☆117Updated this week
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆43Updated 6 months ago
- A research repo for experiments about Reinforcement Finetuning☆46Updated last month
- [NeurIPS 2024] Official implementation for paper "Can Graph Learning Improve Planning in LLM-based Agents?"☆123Updated this week
- An index of algorithms for reinforcement learning from human feedback (RLHF)☆92Updated last year
- A curated list of Model Merging methods.☆92Updated 8 months ago
- Awesome Low-Rank Adaptation☆38Updated 8 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆113Updated last week
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks☆16Updated 4 months ago
- OpenReview Submission Visualization (ICLR 2024/2025)☆152Updated 7 months ago
- SEA is an automated paper review framework capable of generating comprehensive and high-quality review feedback with high consistency for…☆65Updated 5 months ago
- Awesome Learn From Model Beyond Fine-Tuning: A Survey☆63Updated 5 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"☆46Updated 4 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆119Updated last month