bespokelabsai / awesome-rl
☆16Updated 5 months ago
Alternatives and similar repositories for awesome-rl
Users who are interested in awesome-rl are comparing it to the libraries listed below.
- Scaling Data for SWE-agents☆403Updated last week
- Improving Alignment and Robustness with Circuit Breakers☆233Updated 11 months ago
- Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".☆269Updated 3 months ago
- ☆204Updated 6 months ago
- Persona Vectors: Monitoring and Controlling Character Traits in Language Models☆230Updated last month
- The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond☆166Updated 2 months ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs☆91Updated 9 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆244Updated 4 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety☆196Updated last year
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.☆461Updated last week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆77Updated 6 months ago
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.☆172Updated 8 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆87Updated 7 months ago
- A toolkit for describing model features and intervening on those features to steer behavior.☆202Updated 10 months ago
- A simple unified framework for evaluating LLMs☆246Updated 5 months ago
- For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research.☆147Updated last week
- ⚖️ Awesome LLM Judges ⚖️☆127Updated 4 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.☆183Updated 6 months ago
- ☆190Updated 5 months ago
- Framework and toolkits for building and evaluating collaborative agents that can work together with humans.☆98Updated 5 months ago
- Official code repository for Sketch-of-Thought (SoT)☆127Updated 4 months ago
- ☆117Updated 4 months ago
- [ICML 2025] Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness.☆47Updated 4 months ago
- A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.☆269Updated 6 months ago
- ☆122Updated 7 months ago
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]☆183Updated 3 weeks ago
- ☆25Updated 8 months ago
- Automatic evals for LLMs☆526Updated 2 months ago
- This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,…☆51Updated 9 months ago
- ☆78Updated 8 months ago