[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆141Dec 17, 2025Updated 2 months ago
Alternatives and similar repositories for APR
Users who are interested in APR are comparing it to the repositories listed below
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration☆62Feb 21, 2025Updated last year
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆26Oct 14, 2025Updated 4 months ago
- ☆32Oct 13, 2025Updated 4 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆50Feb 4, 2026Updated 3 weeks ago
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]☆632Jul 29, 2025Updated 7 months ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆64Oct 2, 2025Updated 4 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆262May 5, 2025Updated 9 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization☆51Jul 15, 2025Updated 7 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆31Updated this week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- [arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies☆59Feb 6, 2026Updated 3 weeks ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025)☆30Jun 14, 2024Updated last year
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference☆372Jul 10, 2025Updated 7 months ago
- Official Implementation for the paper "Integrative Decoding: Improving Factuality via Implicit Self-consistency"☆32Apr 12, 2025Updated 10 months ago
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models☆45Sep 19, 2025Updated 5 months ago
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 10 months ago
- Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings)☆16Jul 2, 2024Updated last year
- Code for the paper "Self-Detoxifying Language Models via Toxification Reversal" (EMNLP 2023)☆18Oct 17, 2023Updated 2 years ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)☆64Feb 19, 2026Updated last week
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 4 months ago
- Official implementation of TBA for async LLM post-training.☆29Nov 5, 2025Updated 3 months ago
- [NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example☆411Nov 21, 2025Updated 3 months ago
- SkyRL: A Modular Full-stack RL Library for LLMs☆1,628Updated this week
- Preference Learning for LLaVA☆59Nov 9, 2024Updated last year
- ☆19Mar 10, 2025Updated 11 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"☆23Mar 18, 2025Updated 11 months ago
- ☆37Jan 26, 2024Updated 2 years ago
- [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity☆71Jul 5, 2025Updated 7 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Sep 26, 2024Updated last year
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning☆405Dec 15, 2024Updated last year
- [ICLR 2026] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆46Aug 16, 2025Updated 6 months ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.☆2,512Feb 18, 2026Updated last week
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆86Oct 26, 2025Updated 4 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆68Dec 8, 2025Updated 2 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 6 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆221Nov 27, 2025Updated 3 months ago
- Indexing framework designed for the automated creation of structured knowledge bases in Azure AI Search☆14Jun 18, 2025Updated 8 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆81Dec 25, 2025Updated 2 months ago