Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
☆71 · Apr 2, 2025 · Updated 11 months ago
Alternatives and similar repositories for RewardModelingBeyondBradleyTerry
Users that are interested in RewardModelingBeyondBradleyTerry are comparing it to the libraries listed below
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving ☆24 · Aug 25, 2025 · Updated 6 months ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆47 · Aug 21, 2024 · Updated last year
- A Benchmark for Multi-Stage Legal Case Documents Generation ☆16 · Feb 24, 2025 · Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability ☆13 · Jun 15, 2024 · Updated last year
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main) ☆19 · Jul 19, 2025 · Updated 7 months ago
- Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using PyTorch and Hugging Face Trainers. ☆11 · Apr 5, 2023 · Updated 2 years ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024) ☆15 · Feb 1, 2025 · Updated last year
- Counterfactual Evaluation and Learning for Interactive Systems: Foundations, Implementations, and Recent Advances ☆12 · Aug 14, 2022 · Updated 3 years ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆175 · Apr 23, 2025 · Updated 10 months ago
- A public repo for ICML 2021 "Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks" ☆13 · Jul 19, 2021 · Updated 4 years ago
- ☆13 · Dec 12, 2025 · Updated 3 months ago
- ☆12 · Jun 30, 2024 · Updated last year
- A framework to train language models to learn invariant representations. ☆14 · Jan 24, 2022 · Updated 4 years ago
- ☆78 · Nov 6, 2025 · Updated 4 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆185 · Jul 23, 2025 · Updated 7 months ago
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 · Apr 20, 2024 · Updated last year
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022) ☆16 · Feb 11, 2023 · Updated 3 years ago
- ☆15 · Jan 11, 2022 · Updated 4 years ago
- Knowledge transfer from high-resource to low-resource programming languages for Code LLMs ☆16 · Aug 12, 2025 · Updated 7 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Feb 22, 2024 · Updated 2 years ago
- ☆60 · Mar 8, 2026 · Updated last week
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆28 · Mar 1, 2025 · Updated last year
- ☆33 · Jul 9, 2025 · Updated 8 months ago
- The code for the paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆37 · Oct 1, 2025 · Updated 5 months ago
- ☆16 · Oct 21, 2024 · Updated last year
- A general framework for evaluating the performance of large language models (LLMs) based on a peer-review mechanism among LLMs ☆19 · Aug 3, 2024 · Updated last year
- SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025) ☆17 · Aug 22, 2025 · Updated 6 months ago
- ☆64 · Apr 9, 2024 · Updated last year
- Code and data from the paper 'Human Feedback is not Gold Standard' ☆20 · Mar 6, 2026 · Updated last week
- A simple implementation of ReasonGenRM. ☆19 · Apr 21, 2025 · Updated 10 months ago
- ☆18 · Oct 8, 2024 · Updated last year
- ☆19 · Feb 25, 2024 · Updated 2 years ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆52 · Jul 15, 2025 · Updated 8 months ago
- Experiment for Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning ☆26 · Jan 16, 2023 · Updated 3 years ago
- ☆16 · Jul 23, 2024 · Updated last year
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆190 · Jan 12, 2026 · Updated 2 months ago
- Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning" ☆43 · Mar 20, 2024 · Updated last year
- (NeurIPS 2025) Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?" ☆47 · Jun 3, 2025 · Updated 9 months ago
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators ☆47 · Dec 23, 2025 · Updated 2 months ago