☆107Oct 22, 2025Updated 5 months ago
Alternatives and similar repositories for Awesome-Agentic-RL-Papers
Users that are interested in Awesome-Agentic-RL-Papers are comparing it to the libraries listed below.
- ☆24Nov 20, 2025Updated 4 months ago
- ☆62May 21, 2025Updated 10 months ago
- A proofreading tool using Google's N-gram corpus.☆12Sep 2, 2022Updated 3 years ago
- Official implementation of the paper “Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning”☆20Aug 20, 2025Updated 7 months ago
- macrogpt large-model full pretraining (1b3, 32 layers), multi-GPU deepspeed / single-GPU adafactor☆15Nov 30, 2023Updated 2 years ago
- ☆14Oct 19, 2025Updated 5 months ago
- TBD☆53Mar 13, 2026Updated last month
- ☆11Apr 12, 2024Updated 2 years ago
- Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight)☆18Dec 5, 2024Updated last year
- Official Repository for ICML 2024 Paper "OT-CLIP: Understanding and Generalizing CLIP via Optimal Transport"☆23Dec 4, 2025Updated 4 months ago
- ☆33Jun 18, 2025Updated 9 months ago
- 🎉 TrustJudge is accepted to ICLR 2026!☆46Sep 27, 2025Updated 6 months ago
- [NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward☆36Sep 19, 2025Updated 6 months ago
- ☆34Oct 24, 2025Updated 5 months ago
- Collection of awesome Continual Test-Time Adaptation methods☆24Jun 4, 2024Updated last year
- ☆14Apr 6, 2025Updated last year
- Code of paper: "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"☆18Apr 3, 2026Updated last week
- sequential learning in orthogonal subspaces☆14Nov 20, 2020Updated 5 years ago
- nodeppt-template-default☆12Jan 31, 2019Updated 7 years ago
- KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding☆66Apr 5, 2026Updated last week
- Continuous descriptor-based control for deep audio synthesis☆23Aug 4, 2023Updated 2 years ago
- ☆17Nov 10, 2012Updated 13 years ago
- Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures☆33Jan 29, 2026Updated 2 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- ☆32Oct 22, 2025Updated 5 months ago
- Official Implementation of "Simulating Environments with Reasoning Models for Agent Training"☆62Feb 18, 2026Updated last month
- Explanation Optimization☆13Oct 16, 2020Updated 5 years ago
- ACL 2026☆26Nov 19, 2025Updated 4 months ago
- [R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"☆127Oct 27, 2025Updated 5 months ago
- An Efficient Dataset Condensation Plugin and Its Application to Continual Learning. NeurIPS, 2023.☆12Nov 29, 2023Updated 2 years ago
- [ICML 2024 spotlight] This repository contains the implementation details for the paper "Locally Estimated Global Perturbations are Bette…☆25Jul 29, 2024Updated last year
- An implementation of DenoiseNet https://arxiv.org/pdf/1701.01687.pdf☆10Nov 22, 2022Updated 3 years ago
- Implementation for FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients☆27Aug 21, 2024Updated last year
- A curated list of autonomous research systems and tools.☆101Apr 3, 2026Updated last week
- This Python script helps you detect which object is moving.☆12Nov 28, 2018Updated 7 years ago
- SGQuant: Squeezing the Last Bit on Graph Neural Networks with Specialized Quantization☆11Aug 12, 2020Updated 5 years ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning"☆17Jun 9, 2025Updated 10 months ago
- [arXiv 2025] Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey☆44Nov 6, 2025Updated 5 months ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated last year