tongjingqi/Awesome-Agent-RL

A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research and practical guides on defining and collecting rewards to build more intelligent and aligned AI agents.

☆59

Alternatives and similar repositories for Awesome-Agent-RL

Users who are interested in Awesome-Agent-RL are comparing it to the libraries listed below.

OpenMOSS / UnifiedToolHub
UnifiedToolHub is a comprehensive project supporting LLM-based tool use, designed to unify various tool-use dataset formats and provide t…
☆22 · Jul 23, 2025 · Updated 11 months ago
OpenMOSS / VehicleWorld
VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, …
☆24 · Sep 16, 2025 · Updated 9 months ago
tongjingqi / MathTrap
In this work, we investigate the compositionality of large language models (LLMs) in mathematical reasoning. Specifically, we construct a…
☆60 · Mar 15, 2025 · Updated last year
Jihuai-wpy / InferAligner
Inference-time alignment for harmlessness through cross-model guidance (ACL 2024). Code + MM-Harmful Bench.
☆38 · Oct 2, 2024 · Updated last year
q1sun / Tutorial-AI4SC-SC4AI
Where Scientific Computing Meets Artificial Intelligence
☆50 · Apr 30, 2026 · Updated 2 months ago
JT-Ushio / AI-Infra-Seminar
☆24 · Jul 20, 2025 · Updated 11 months ago
tongjingqi / Game-RL
Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
☆155 · Jun 19, 2026 · Updated 2 weeks ago
Callione / LLaVA-MOSS2
A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.
☆13 · Sep 19, 2024 · Updated last year
ziplab / TriSplat
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
☆325 · Jun 12, 2026 · Updated 3 weeks ago
shiqichen17 / SPA
GitHub repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"
☆35 · Nov 1, 2025 · Updated 8 months ago
Phospheneser / Phospheneser-awesome-academic-template
An open-source personal academic homepage template characterized by its user-friendly design and extensive scalability.
☆36 · Oct 6, 2025 · Updated 9 months ago
lyyf2002 / ASGEA
Code for ASGEA: Exploiting Logic Rules from Align-Subgraphs for Entity Alignment
☆12 · Feb 28, 2024 · Updated 2 years ago
ShiyuNee / Awesome-LMs-Perception-of-Their-Knowledge-Boundaries-Papers
This is a repo consisting of papers about LLMs' perception of their knowledge boundaries; Uncertainty Quantification; Honesty Alignment; …
☆25 · Nov 25, 2025 · Updated 7 months ago
dange-academic / python_igraph_tutorial
python igraph tutorial
☆11 · Nov 23, 2023 · Updated 2 years ago
GAIR-NLP / SR-Scientist
[ICLR 2026] SR-Scientist: Scientific Equation Discovery With Agentic AI
☆49 · Jan 27, 2026 · Updated 5 months ago
Hambaobao / Marathon
Marathon: A Multiple-choice Long Context Evaluation Benchmark for Large Language Models.
☆10 · May 16, 2024 · Updated 2 years ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆75 · Jul 13, 2025 · Updated 11 months ago
ljcleo / agent_sense
Benchmarking Social Intelligence of Language Agents through Interactive Scenarios
☆13 · Jan 4, 2025 · Updated last year
OliverLeeXZ / OPT-BENCH
[ACL 2026] OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces
☆125 · May 12, 2026 · Updated last month
hualoveshao / TCM_entity_recognition
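Traditional Chinese medicine term recognition. A CNN-BILSTM-CRF model is applied to 9,000 training samples and 1,000 test samples, reaching a final test accuracy of 90+%. For ease of use, the model is wrapped with Tkinter.
☆12 · Feb 21, 2020 · Updated 6 years ago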
HVision-NKU / ControlSR
☆13 · Apr 19, 2025 · Updated last year
LunarShen / FastVID
[NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models
☆36 · Nov 10, 2025 · Updated 7 months ago
chenlong-clock / RULE-Unlearn
[NeurIPS25] RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality
☆20 · Oct 22, 2025 · Updated 8 months ago
weizhangltt / dual-recommend
☆11 · Oct 31, 2021 · Updated 4 years ago
ShawnTan86 / TokenCarve
This is the open-source code for TokenCarve.
☆25 · Jan 23, 2026 · Updated 5 months ago
microsoft / tale-suite
Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.
☆30 · May 9, 2026 · Updated last month
Mashiro2000 / volunteer
Script for the 志愿汇 益动星空 activity, adapted to run on Qinglong Panel, cloud functions, or locally.
☆17 · Aug 12, 2022 · Updated 3 years ago
WOW5678 / CompNet
☆16 · Aug 27, 2019 · Updated 6 years ago
KD-TAO / DyCoke
[CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
☆112 · Nov 22, 2025 · Updated 7 months ago
Yuancheng-Xu / GenARM
Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment"
☆24 · Feb 10, 2025 · Updated last year
wenge-research / CRE-SFT
A supervised fine-tuning method for controllable reasoning length in large language models
☆11 · May 8, 2025 · Updated last year
RiddleMa / cluster_zf
Clustering of traditional Chinese medicine formulas
☆17 · Jul 18, 2018 · Updated 7 years ago
tongjingqi / AI-Can-Learn-Scientific-Taste
We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervis…
☆418 · May 13, 2026 · Updated last month
RainBowLuoCS / MMEvol
(ACL 2025) 🔥🔥🔥Code for "Empowering Multimodal Large Language Models with Evol-Instruct"
☆21 · May 15, 2025 · Updated last year
GeWu-Lab / Patch-Matters
[CVPR2025] Code Release of Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
☆25 · Jun 17, 2025 · Updated last year
X1AOX1A / Word2World
[ACL 2026 Oral] From Word to World: Can Large Language Models be Implicit Text-based World Models?
☆64 · Apr 13, 2026 · Updated 2 months ago
ai4protein / VenusX
🧬 Large-scale protein functional residue or fragment prediction benchmark. (ICLR 2026)
☆24 · Apr 10, 2026 · Updated 2 months ago
AheadOFpotato / Awesome-LRM-Mechanisms
Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures
☆34 · Jan 29, 2026 · Updated 5 months ago
ybwang119 / label_recovery
[ICLR 2024] Towards Eliminating Hard Label Constraints in Gradient Inversion Attacks
☆14 · Feb 6, 2024 · Updated 2 years ago