Directional Preference Alignment
☆58 · Sep 23, 2024 · Updated last year
Alternatives and similar repositories for Directional-Preference-Alignment
Users interested in Directional-Preference-Alignment are comparing it to the repositories listed below.
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆39 · Sep 22, 2024 · Updated last year
- This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DP… ☆32 · Dec 5, 2024 · Updated last year
- Codebase for Iterative DPO Using Rule-based Rewards ☆270 · Apr 11, 2025 · Updated 11 months ago
- Recipes to train reward models for RLHF. ☆1,521 · Apr 24, 2025 · Updated 10 months ago
- [ICLR 2025] This repository contains the code to reproduce the results from our paper From Sparse Dependence to Sparse Attention: Unveili… ☆12 · Mar 7, 2025 · Updated last year
- ☆27 · Mar 13, 2024 · Updated 2 years ago
- Code for the preprint "PMGT-VR: A decentralized proximal-gradient algorithmic framework with variance reduction". ☆16 · Jul 2, 2022 · Updated 3 years ago
- RewardBench: the first evaluation tool for reward models. ☆704 · Feb 16, 2026 · Updated last month
- Official implementation of Nabla-GFlowNet (ICLR 2025) ☆28 · May 3, 2025 · Updated 10 months ago
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 · Jun 25, 2024 · Updated last year
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆79 · Jun 10, 2025 · Updated 9 months ago
- ☆282 · Jan 6, 2025 · Updated last year
- Repository containing the source code for Self-Evaluation Guided MCTS for online DPO. ☆329 · Jan 29, 2026 · Updated last month
- ☆21 · Jul 25, 2025 · Updated 7 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆118 · Oct 23, 2023 · Updated 2 years ago
- Code and data associated with CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations ☆11 · Oct 13, 2023 · Updated 2 years ago
- ☆32 · Sep 28, 2025 · Updated 5 months ago
- Official repo for our CVPR 2022 paper: Scalable Penalized Regression for Noise Detection in Learning With Noisy Labels. ☆19 · Mar 21, 2024 · Updated 2 years ago
- We aim to provide the best references to search, select, and synthesize high-quality, large-scale data for post-training your LLMs. ☆61 · Oct 3, 2024 · Updated last year
- ☆13 · Jul 2, 2025 · Updated 8 months ago
- ☆16 · Jul 29, 2025 · Updated 7 months ago
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- An official implementation of "Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards" ☆37 · Oct 3, 2025 · Updated 5 months ago
- A project designed to build and render a full Minecraft crafting tree. ☆10 · Aug 10, 2021 · Updated 4 years ago
- Visualization of the mean-field and neural tangent kernel regimes ☆23 · Jul 25, 2024 · Updated last year
- Official repo of the paper "Eliminating Position Bias of Language Models: A Mechanistic Approach" ☆21 · Jun 13, 2025 · Updated 9 months ago
- ☆321 · Sep 18, 2024 · Updated last year
- Official repo of Progressive Data Expansion: data, code, and evaluation ☆29 · Nov 16, 2023 · Updated 2 years ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Feb 22, 2024 · Updated 2 years ago
- ☆30 · Feb 16, 2024 · Updated 2 years ago
- ☆14 · May 3, 2022 · Updated 3 years ago
- Code and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆10 · Dec 30, 2024 · Updated last year
- baikal.ai's pre-trained BERT models: descriptions and sample code ☆12 · Jun 24, 2021 · Updated 4 years ago
- I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment ☆17 · Jan 16, 2025 · Updated last year
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis" ☆19 · Jun 12, 2025 · Updated 9 months ago
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis" ☆20 · Jan 26, 2025 · Updated last year
- Official implementation of "Bootstrapping Language Models via DPO Implicit Rewards" ☆47 · Apr 15, 2025 · Updated 11 months ago
- Self-Supervised Alignment with Mutual Information ☆20 · May 24, 2024 · Updated last year
- [NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward ☆946 · Feb 16, 2025 · Updated last year