RL algorithm: Advantage-Induced Policy Alignment
☆66 Aug 11, 2023 Updated 2 years ago
Alternatives and similar repositories for RLHF-APA
Users who are interested in RLHF-APA are comparing it to the libraries listed below.
- ☆15 Jun 29, 2024 Updated last year
- ☆11 Jun 13, 2023 Updated 2 years ago
- Standalone service for asynchronous and real-time event stream processing. Supports input event streams from syslog, dbus, user-events an… ☆13 Jun 10, 2023 Updated 2 years ago
- This data set contains accelerometer and gyroscope recordings from over 200 participants performing various gym exercises. This data set … ☆33 Jun 16, 2023 Updated 2 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆15 Nov 11, 2024 Updated last year
- [IEEE TMI 2024] Prototype-Guided Graph Reasoning Network for Few-Shot Medical Image Segmentation ☆12 Jun 13, 2025 Updated 8 months ago
- Scalable Educational Experiences with Digital Scaffolding ☆14 Apr 11, 2025 Updated 10 months ago
- To gain access, please finish setting up this repository now at: https://repos.opensource.microsoft.com/microsoft/wizard?existingreponam… ☆11 Jun 13, 2023 Updated 2 years ago
- Woodgrove groceries custom authentication extension REST API demo ☆36 Oct 10, 2025 Updated 4 months ago
- Self-Alignment with Principle-Following Reward Models ☆169 Sep 18, 2025 Updated 5 months ago
- Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task. ☆15 Sep 4, 2024 Updated last year
- Rust libraries for Linux Tracepoints and user_events ☆17 Dec 3, 2025 Updated 2 months ago
- This repository contains the code for the paper "Optimizing Reusable Knowledge for Continual Learning via Metalearning". ☆11 Oct 12, 2021 Updated 4 years ago
- The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings) ☆14 Aug 12, 2024 Updated last year
- Direct preference optimization with f-divergences. ☆16 Nov 3, 2024 Updated last year
- [IEEE TBD 2023] IEMask R-CNN: Information-enhanced Mask R-CNN ☆16 Mar 14, 2023 Updated 2 years ago
- ☆33 Oct 31, 2024 Updated last year
- ☆35 Jan 29, 2023 Updated 3 years ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 Jun 10, 2024 Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language ☆154 Feb 3, 2025 Updated last year
- Accelerating the development of large multimodal models (LMMs) with lmms-eval ☆14 Oct 14, 2024 Updated last year
- Woodgrove groceries demo web application ☆75 Aug 7, 2025 Updated 6 months ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022) ☆18 Dec 8, 2022 Updated 3 years ago
- Data Science Toolkit - Knowledge Mining Solution Accelerator ☆23 Nov 22, 2023 Updated 2 years ago
- Official code for the paper: "Metadata Archaeology" ☆19 May 10, 2023 Updated 2 years ago
- Training GPTs to solve interaction nets ☆18 Aug 14, 2024 Updated last year
- Content built for the community, with love, by The Fabric Customer Advisory Team! ☆26 Aug 18, 2025 Updated 6 months ago
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning" ☆20 Oct 6, 2021 Updated 4 years ago
- Implementation of MixCE method described in ACL 2023 paper by Zhang et al. ☆20 May 29, 2023 Updated 2 years ago
- Graph Transformers for Large Graphs ☆22 Apr 26, 2024 Updated last year
- Official implementation of TBA for async LLM post-training. ☆29 Nov 5, 2025 Updated 3 months ago
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators ☆47 Dec 23, 2025 Updated 2 months ago
- ☆123 Feb 21, 2025 Updated last year
- ☆27 Jul 23, 2025 Updated 7 months ago
- Code for "The Expressive Power of Low-Rank Adaptation". ☆20 Apr 19, 2024 Updated last year
- ☆20 Jun 3, 2023 Updated 2 years ago
- Invite OpenAI to your teams calls to assist w/ QnA right in chat. ☆26 Jan 9, 2024 Updated 2 years ago
- ☆20 Oct 25, 2022 Updated 3 years ago
- Super fast implementations of common benchmark text world games ☆52 Aug 25, 2025 Updated 6 months ago