SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025)
☆17 · Aug 22, 2025 · Updated 7 months ago
Alternatives and similar repositories for SimPER
Users that are interested in SimPER are comparing it to the libraries listed below.
- Concise Reasoning via Reinforcement Learning ☆13 · Apr 16, 2025 · Updated 11 months ago
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt… ☆38 · Aug 11, 2024 · Updated last year
- Companion code to https://arxiv.org/abs/2409.03797v2 ☆19 · Sep 18, 2025 · Updated 6 months ago
- Code for paper "W-RAG: Weakly Supervised Dense Retrieval in RAG for Open-domain Question Answering" ☆15 · Oct 2, 2025 · Updated 5 months ago
- Source code for Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach (NeurIPS 2023) ☆10 · Dec 12, 2023 · Updated 2 years ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and… ☆72 · Apr 2, 2025 · Updated 11 months ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter". ☆12 · Dec 27, 2023 · Updated 2 years ago
- E-commerce platform with a decoupled front end and back end, built with python3 + django + django-rest-framework + vue + xadmin ☆10 · Dec 8, 2022 · Updated 3 years ago
- ☆20 · Oct 3, 2019 · Updated 6 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models" ☆11 · Apr 15, 2024 · Updated last year
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆65 · Feb 19, 2026 · Updated last month
- PyTorch implementation of JEM++: Improved Techniques for Training JEM ☆13 · Mar 11, 2023 · Updated 3 years ago
- [AAAI 2021] Slimmable Generative Adversarial Networks ☆23 · Dec 21, 2022 · Updated 3 years ago
- Lightweight Adapting for Black-Box Large Language Models ☆25 · Feb 15, 2024 · Updated 2 years ago
- [NeurIPS 2021] Self-Supervised GANs with Label Augmentation ☆22 · Apr 27, 2023 · Updated 2 years ago
- Code for [NeurIPS'2019 Spotlight] Policy Continuation with Hindsight Inverse Dynamics ☆15 · Jan 7, 2020 · Updated 6 years ago
- All-in-one benchmarking platform for evaluating LLM. ☆15 · Nov 12, 2025 · Updated 4 months ago
- A public repo for ICML 2021 "Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks" ☆13 · Jul 19, 2021 · Updated 4 years ago
- Code for Paper: Learning Implicit Representation for Reconstructing Articulated Objects ☆29 · Jun 5, 2024 · Updated last year
- Implementation of the models and datasets used in "An Information-theoretic Approach to Distribution Shifts" ☆25 · Nov 2, 2021 · Updated 4 years ago
- Official implementation of Neural Episodic Control with State Abstraction ☆13 · Aug 3, 2023 · Updated 2 years ago
- Code for Residual Energy-Based Models for Text Generation in PyTorch. ☆26 · Apr 27, 2021 · Updated 4 years ago
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆29 · Oct 30, 2024 · Updated last year
- Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks" ☆39 · Apr 24, 2025 · Updated 11 months ago
- ☆22 · Jan 5, 2024 · Updated 2 years ago
- Code for paper Causal Confusion in Imitation Learning ☆46 · Dec 17, 2019 · Updated 6 years ago
- This repository contains code for the paper Direct Preference Optimization with an Offset (ODPO). ☆18 · Feb 17, 2025 · Updated last year
- NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation ☆13 · May 24, 2025 · Updated 10 months ago
- Python library providing function decorators for configurable backoff and retry ☆25 · Updated this week
- Deep learning introduction to beginners with PyTorch ☆12 · Apr 24, 2020 · Updated 5 years ago
- SePer is an accurate / fast / free-of-API metric to measure document quality via information gain ☆31 · Feb 22, 2026 · Updated last month
- 💼 Browser extension - Update your bookmarks with site descriptions ☆12 · Sep 4, 2017 · Updated 8 years ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆50 · Oct 23, 2024 · Updated last year
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆20 · Jul 19, 2024 · Updated last year
- ☆12 · Sep 25, 2018 · Updated 7 years ago
- Reference implementation of algorithms for reinforcement learning and Markov decision processes. ☆12 · Jan 28, 2021 · Updated 5 years ago
- Ἀνατομή is a PyTorch library to analyze representation of neural networks ☆13 · Jan 31, 2024 · Updated 2 years ago
- Repo for the EACL2017 tutorial on imitation learning ☆28 · Apr 3, 2017 · Updated 8 years ago
- Paper reading logs ☆11 · Feb 26, 2022 · Updated 4 years ago