AQ-MedAI / MrlXView external linksLinks
MrlX: A Multi-Agent Reinforcement Learning Framework
☆190Jan 19, 2026Updated 3 weeks ago
Alternatives and similar repositories for MrlX
Users that are interested in MrlX are comparing it to the libraries listed below
Sorting:
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding☆34Jan 16, 2026Updated 3 weeks ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System.☆19Oct 14, 2024Updated last year
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated 3 weeks ago
- The code for paper: Hierarchical Document Refinement for Long-context Retrieval-augmented Generation [ACL2025 Oral]☆41Aug 25, 2025Updated 5 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence.☆28Dec 30, 2025Updated last month
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks☆49Sep 4, 2025Updated 5 months ago
- Learning to Skip the Middle Layers of Transformers☆17Aug 7, 2025Updated 6 months ago
- ☆23Jul 29, 2025Updated 6 months ago
- ☆15Jan 12, 2026Updated last month
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated 2 weeks ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197)☆39Sep 8, 2025Updated 5 months ago
- ☆32Jan 25, 2026Updated 2 weeks ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆12Jan 26, 2025Updated last year
- ☆17Aug 17, 2024Updated last year
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs☆35Jan 18, 2026Updated 3 weeks ago
- Official implementation of the paper: "A deeper look at depth pruning of LLMs"☆15Jul 24, 2024Updated last year
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 4 months ago
- Plancraft is a minecraft environment and agent suite to test planning capabilities in LLMs☆26Nov 7, 2025Updated 3 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.☆24Oct 7, 2025Updated 4 months ago
- A lightweight, reproducible toolkit for LLM-based query reformulation.☆29Jan 3, 2026Updated last month
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆29Sep 19, 2025Updated 4 months ago
- ☆77Nov 6, 2025Updated 3 months ago
- 6th Place Solution for the Google - Isolated Sign Language Recognition Kaggle Competition☆13May 4, 2023Updated 2 years ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C…☆17May 27, 2024Updated last year
- ☆26Jun 5, 2025Updated 8 months ago
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025.☆16Oct 20, 2025Updated 3 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Oct 27, 2024Updated last year
- Wave: Python Domain-Specific Language for High Performance Machine Learning☆44Updated this week
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Oct 1, 2024Updated last year
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.☆45Dec 17, 2025Updated last month
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆21Dec 22, 2025Updated last month
- ☆51Apr 30, 2025Updated 9 months ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use☆28Nov 4, 2025Updated 3 months ago
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"☆30Jan 12, 2026Updated last month
- Control LLM☆22Apr 6, 2025Updated 10 months ago
- ☆38Oct 12, 2025Updated 4 months ago
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac…☆34Jan 23, 2026Updated 3 weeks ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning☆70Jul 13, 2025Updated 7 months ago