holarissun / embedding-based-llm-alignment
Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
☆19Updated 4 months ago
Alternatives and similar repositories for embedding-based-llm-alignment
Users that are interested in embedding-based-llm-alignment are comparing it to the libraries listed below
- Links to publications that focus on the interpretation and analysis of in-context learning☆11Updated 10 months ago
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).☆16Updated 7 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆66Updated 5 months ago
- Teaching Models to Express Their Uncertainty in Words☆39Updated 3 years ago
- Code for ACL2024 paper - Adversarial Preference Optimization (APO).☆56Updated last year
- The code of paper "Toward Optimal LLM Alignments Using Two-Player Games".☆17Updated last year
- An index of algorithms for reinforcement learning from human feedback (rlhf)☆93Updated last year
- ☆43Updated 5 months ago
- [ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…☆26Updated 11 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆123Updated 11 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$☆47Updated 10 months ago
- Domain-specific preference (DSP) data and customized RM fine-tuning.☆25Updated last year
- ☆56Updated 3 months ago
- Explore what LLMs are really learning over SFT☆29Updated last year
- GenRM-CoT: Data release for verification rationales☆65Updated 10 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆58Updated last year
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…☆37Updated 11 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆60Updated last month
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis"☆19Updated 2 months ago
- This is the official implementation of ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting☆21Updated last year
- Reference implementation for Token-level Direct Preference Optimization (TDPO)☆147Updated 6 months ago
- Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)☆192Updated last year
- ☆30Updated last year
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…☆75Updated last year
- ☆51Updated last year
- ☆74Updated 9 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs