holarissun / embedding-based-llm-alignment
Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs"
☆17 · Updated 2 months ago
Alternatives and similar repositories for embedding-based-llm-alignment
Users who are interested in embedding-based-llm-alignment are comparing it to the repositories listed below.
- Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…" ☆62 · Updated 2 months ago
- Source code for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023). ☆16 · Updated 5 months ago
- Domain-specific preference (DSP) data and customized RM fine-tuning. ☆25 · Updated last year
- Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games". ☆17 · Updated last year
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning ☆21 · Updated last year
- Code for the ACL 2024 paper "Adversarial Preference Optimization" (APO). ☆54 · Updated last year
- ☆55 · Updated last month
- ☆30 · Updated last year
- Links to publications that focus on the interpretation and analysis of in-context learning ☆10 · Updated 8 months ago
- [NeurIPS 2024] Official code for $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆45 · Updated 8 months ago
- ☆40 · Updated last year
- GenRM-CoT: Data release for verification rationales ☆61 · Updated 8 months ago
- ☆39 · Updated 3 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆59 · Updated 6 months ago
- [NAACL 2025] Official implementation of the paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…" ☆26 · Updated last year
- Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data" ☆28 · Updated last year
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆62 · Updated 11 months ago
- Code and data used in the paper "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆30 · Updated last year
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024) ☆13 · Updated 8 months ago
- WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000… ☆47 · Updated last year
- Directional Preference Alignment ☆57 · Updated 9 months ago
- Teaching Models to Express Their Uncertainty in Words ☆39 · Updated 3 years ago
- ☆74 · Updated last year
- Official code for the paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆29 · Updated 3 weeks ago
- ☆33 · Updated 9 months ago
- Efficient retrieval-head analysis with Triton flash attention that supports top-k probability ☆12 · Updated last year
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆81 · Updated 10 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆121 · Updated 9 months ago
- An official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆32 · Updated 9 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆72 · Updated 2 years ago