holarissun / embedding-based-llm-alignment
Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
☆15Updated 3 weeks ago
Alternatives and similar repositories for embedding-based-llm-alignment
Users that are interested in embedding-based-llm-alignment are comparing it to the libraries listed below
Sorting:
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆58Updated last month
- The code of paper "Toward Optimal LLM Alignments Using Two-Player Games".☆16Updated 10 months ago
- ☆54Updated last year
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆16Updated 4 months ago
- [ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning☆21Updated last year
- Teaching Models to Express Their Uncertainty in Words☆39Updated 2 years ago
- GenRM-CoT: Data release for verification rationales☆59Updated 7 months ago
- ☆30Updated last year
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)☆28Updated last year
- ☆40Updated last year
- Domain-specific preference (DSP) data and customized RM fine-tuning.☆25Updated last year
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models☆46Updated last year
- Explore what LLMs are really leanring over SFT☆28Updated last year
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se…☆60Updated 2 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆76Updated last year
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization☆25Updated 3 months ago
- ☆46Updated this week
- ☆36Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆120Updated 8 months ago
- ☆53Updated 2 years ago
- ☆17Updated 11 months ago
- ☆50Updated last year
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…☆29Updated 7 months ago
- Forcing Diffuse Distributions out of Language Models☆15Updated 8 months ago
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆12Updated 11 months ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"☆30Updated 11 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆26Updated last year
- ☆74Updated 11 months ago
- ☆49Updated last year
- ☆11Updated 10 months ago