Open-Social-World / autolibra
AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback
☆16 · Updated last month
Alternatives and similar repositories for autolibra
Users interested in autolibra are comparing it to the libraries listed below.
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆39 · Updated 2 months ago
- ☆20 · Updated 2 weeks ago
- ☆10 · Updated 2 years ago
- ☆32 · Updated 6 months ago
- ☆19 · Updated last year
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives" ☆27 · Updated last year
- Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆69 · Updated 7 months ago
- ☆13 · Updated 4 months ago
- ☆30 · Updated last year
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World" ☆27 · Updated 3 months ago
- ☆25 · Updated 7 months ago
- ☆17 · Updated 3 months ago
- Official implementation of "Bootstrapping Language Models via DPO Implicit Rewards" ☆44 · Updated 7 months ago
- MUA-RL: Multi-Turn User-Interacting Agent Reinforcement Learning for Agentic Tool Use ☆44 · Updated 2 weeks ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Updated 7 months ago
- [NeurIPS 2024] Official code of "$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$" ☆49 · Updated last year
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆31 · Updated 9 months ago
- ☆15 · Updated last year
- Code for the paper "Policy Optimization in RLHF: The Impact of Out-of-preference Data" ☆28 · Updated last year
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning ☆23 · Updated last month
- Plancraft is a Minecraft environment and agent suite for testing the planning capabilities of LLMs ☆21 · Updated last week
- ☆49 · Updated 3 months ago
- Official implementation of Rewarded Soups ☆62 · Updated 2 years ago
- Sotopia-RL: Reward Design for Social Intelligence ☆43 · Updated 3 months ago
- ☆21 · Updated 2 months ago
- [ICML 2025] Official code of "AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization" ☆22 · Updated last year
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆10 · Updated last year
- Code for the paper "Preserving Diversity in Supervised Fine-tuning of Large Language Models" ☆47 · Updated 6 months ago
- [ICLR 2025] Official code of "Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization" ☆18 · Updated last year
- A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models ☆17 · Updated 5 months ago