☆107Oct 22, 2025Updated 7 months ago
Alternatives and similar repositories for Awesome-Agentic-RL-Papers
Users who are interested in Awesome-Agentic-RL-Papers are comparing it to the libraries listed below.
- Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.☆53Mar 8, 2026Updated 2 months ago
- Fast instruction tuning with Llama2☆11Apr 8, 2024Updated 2 years ago
- An open-ended, self-improving AI system that evolves its own source code using a local LLM. Built for autonomy, reflection, and code evol…☆24Jan 24, 2026Updated 4 months ago
- A proofreading tool using Google's N-gram corpus.☆12Sep 2, 2022Updated 3 years ago
- Official PyTorch implementation for the ICML 2023 paper "Out-of-Distribution Generalization of Federated Learning via Implicit Invariant …☆14Oct 31, 2023Updated 2 years ago
- macrogpt: full pre-training of a large model (1b3, 32 layers), multi-GPU deepspeed / single-GPU adafactor☆15Nov 30, 2023Updated 2 years ago
- Official repository for CoTran: An LLM-based code translator for whole-program translation, fine-tuned using feedback from compiler and s…☆15Nov 6, 2024Updated last year
- STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models☆48Apr 23, 2026Updated last month
- Synthesizes efficient Z3 strategies tailored to your problem set! Repo for the IJCAI'24 paper: Layered and Staged Monte Carlo Tree Search…☆25Updated this week
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1☆68May 8, 2026Updated 2 weeks ago
- Towards Meta-Pruning via Optimal Transport, ICLR 2024 (Spotlight)☆18Dec 5, 2024Updated last year
- 🎉 TrustJudge is accepted to ICLR 2026!☆46Sep 27, 2025Updated 7 months ago
- [NeurIPS 2025] Reasoning MLLM, Share-GRPO, advantage vanishing, sparse reward☆36Sep 19, 2025Updated 8 months ago
- ID card translation template☆11May 25, 2020Updated 6 years ago
- Paper: “MEMRL: SELF-EVOLVING AGENTS VIA RUNTIME REINFORCEMENT LEARNING ON EPISODIC MEMORY” Open-Source Code☆114May 2, 2026Updated 3 weeks ago
- A crawler for YouTube trends using Selenium in Python☆18Apr 16, 2016Updated 10 years ago
- ☆10Oct 20, 2023Updated 2 years ago
- Official implementation for paper "How Far Are We from Genuinely Useful Deep Research Agents?"☆65Dec 10, 2025Updated 5 months ago
- A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning.☆20Aug 23, 2025Updated 9 months ago
- nodeppt-template-default☆12Jan 31, 2019Updated 7 years ago
- Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures☆33Jan 29, 2026Updated 3 months ago
- Focused Papers, Delivered Simply :)☆55Dec 25, 2025Updated 5 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- Adaptive gradient sparsification for efficient federated learning: an online learning approach☆18Oct 31, 2020Updated 5 years ago
- Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (CVPR24)☆11Jun 16, 2024Updated last year
- Explanation Optimization☆13Oct 16, 2020Updated 5 years ago
- [ML4H'25] MedVLThinker: Simple Baselines for Multimodal Medical Reasoning☆57Dec 21, 2025Updated 5 months ago
- ☆36Oct 22, 2025Updated 7 months ago
- [R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"☆129Oct 27, 2025Updated 6 months ago
- An Efficient Dataset Condensation Plugin and Its Application to Continual Learning. NeurIPS, 2023.☆12Nov 29, 2023Updated 2 years ago
- Code for the paper "Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding"☆20Aug 30, 2024Updated last year
- [ICML 2024 spotlight] This repository contains the implementation details for the paper "Locally Estimated Global Perturbations are Bette…☆25Jul 29, 2024Updated last year
- ACL 2026☆27Nov 19, 2025Updated 6 months ago
- An implementation of DenoiseNet https://arxiv.org/pdf/1701.01687.pdf☆10Nov 22, 2022Updated 3 years ago
- Implementation for FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients☆28Aug 21, 2024Updated last year
- A PyTorch re-implementation of the paper "Towards Deep Learning Models Resistant to Adversarial Attacks"☆21May 21, 2019Updated 7 years ago
- Contextual Vision Transformers for Robust Representation Learning☆14Oct 19, 2023Updated 2 years ago
- [ICLR 2024] "Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality" by Xuxi Chen*, Yu Yang*, Zhangyang Wang, Baha…☆15May 18, 2024Updated 2 years ago
- ☆27Mar 17, 2025Updated last year