A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research and practical guides on defining and collecting rewards to build more intelligent and aligned AI agents.
☆55 · Sep 1, 2025 · Updated 6 months ago
Alternatives and similar repositories for Awesome-Agent-RL
Users who are interested in Awesome-Agent-RL are comparing it to the libraries listed below
- UnifiedToolHub is a comprehensive project supporting LLM-based tool use, designed to unify various tool-use dataset formats and provide t… ☆19 · Jul 23, 2025 · Updated 7 months ago
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆270 · Feb 21, 2026 · Updated 2 weeks ago
- VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, … ☆21 · Sep 16, 2025 · Updated 5 months ago
- ☆38 · Oct 2, 2024 · Updated last year
- ☆35 · Jun 17, 2025 · Updated 8 months ago
- Open-source Traditional Chinese Medical Large Language Models. (A collection of open-source Chinese medical large language models) ☆49 · Oct 5, 2025 · Updated 5 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆71 · Jul 13, 2025 · Updated 7 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning ☆136 · Mar 3, 2026 · Updated last week
- Scripts & Code patches for analyzing/emulating/copying FM1208 CPU Cards (reading and copying SAK28 CPU cards, FM1208) ☆20 · Mar 7, 2025 · Updated last year
- Content Moderation using Reality.Eth with Kleros arbitration ☆12 · Feb 19, 2025 · Updated last year
- MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols ☆17 · Nov 19, 2025 · Updated 3 months ago
- Simple snippet database ☆13 · Nov 19, 2024 · Updated last year
- Starter template for a Hybrid App using a Next.js Server and React Frontend built with vite ☆18 · Feb 3, 2025 · Updated last year
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- Document intricacies of using WinDBG to aid Rust project development ☆17 · Nov 19, 2024 · Updated last year
- COSMOS is a computational tool crafted to overcome the challenges associated with integrating spatially resolved multi-omics data. This … ☆13 · Nov 12, 2024 · Updated last year
- A General Quantum Software ☆18 · Feb 24, 2026 · Updated last week
- ☆12 · Jun 21, 2025 · Updated 8 months ago
- Antimicrobial Peptide Structural Evolution Miner (AMP-SEMiner), an integrated AI framework designed for the simultaneous identification o… ☆13 · May 10, 2025 · Updated 10 months ago
- Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models. ☆10 · May 16, 2024 · Updated last year
- ☆36 · Jun 13, 2023 · Updated 2 years ago
- a Video Quality Analysis Toolkit ☆13 · May 16, 2025 · Updated 9 months ago
- Backend services for an AI-powered, privacy-first team collaboration platform. Manages secure data, AI processing, and real-time communic… ☆18 · Oct 16, 2025 · Updated 4 months ago
- FamilyTool benchmark ☆12 · Sep 10, 2025 · Updated 6 months ago
- LobotoMl is a set of scripts and tools to assess production deployments of ML services ☆10 · May 16, 2022 · Updated 3 years ago
- Cellular content mining and particle localization ☆10 · Sep 21, 2025 · Updated 5 months ago
- ☆15 · Feb 5, 2025 · Updated last year
- Information Extraction related tools and models ☆10 · Mar 16, 2023 · Updated 2 years ago
- R1V, trained with AI feedback, answers open-ended visual questions. ☆14 · Apr 12, 2025 · Updated 10 months ago
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" ☆11 · Oct 27, 2025 · Updated 4 months ago
- A proof-of-concept to demonstrate randomized execution paths and their impact on call stack signatures; ideal for EDR testing, behavior-… ☆25 · Jan 17, 2026 · Updated last month
- ☆11 · Sep 7, 2023 · Updated 2 years ago
- A Python tool to help interact with ChatGPT. ☆10 · Dec 11, 2022 · Updated 3 years ago
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL" ☆33 · Nov 1, 2025 · Updated 4 months ago
- MOSS-Speech is a true speech-to-speech large language model without text guidance. ☆126 · Feb 13, 2026 · Updated 3 weeks ago
- Unofficial Iranian hackers group disk wiper malware aka "Shamoon" in .NET 2.0 ☆13 · Dec 23, 2018 · Updated 7 years ago
- Official codebase for the NeurIPS 2023 paper: Towards Last-layer Retraining for Group Robustness with Fewer Annotations. https://arxiv.or… ☆12 · May 15, 2024 · Updated last year
- This is the implementation of paper "Learning to Ask Conversational Questions by Optimizing Levenshtein Distance". ☆10 · Jul 5, 2021 · Updated 4 years ago
- [NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality ☆19 · Oct 22, 2025 · Updated 4 months ago