MrlX: A Multi-Agent Reinforcement Learning Framework
☆193 Jan 19, 2026 Updated last month
Alternatives and similar repositories for MrlX
Users that are interested in MrlX are comparing it to the libraries listed below
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding ☆35 Jan 16, 2026 Updated last month
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Dec 19, 2024 Updated last year
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System. ☆19 Oct 14, 2024 Updated last year
- OmniGAIA: Towards Native Omni-Modal AI Agents ☆46 Feb 28, 2026 Updated last week
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co… ☆13 Jan 16, 2026 Updated last month
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆86 May 21, 2025 Updated 9 months ago
- The code for paper: Hierarchical Document Refinement for Long-context Retrieval-augmented Generation [ACL2025 Oral] ☆42 Aug 25, 2025 Updated 6 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 Dec 30, 2025 Updated 2 months ago
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆50 Sep 4, 2025 Updated 6 months ago
- An official PyTorch implementation of "Certifiably Robust Graph Contrastive Learning" (NeurIPS 2023) ☆11 Jan 22, 2024 Updated 2 years ago
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ☆17 Jan 24, 2026 Updated last month
- ☆11 Mar 13, 2023 Updated 2 years ago
- Learning to Skip the Middle Layers of Transformers ☆17 Aug 7, 2025 Updated 7 months ago
- ☆26 Jul 29, 2025 Updated 7 months ago
- ☆14 Dec 18, 2024 Updated last year
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197) ☆39 Sep 8, 2025 Updated 5 months ago
- Dream-VL and Dream-VLA, a diffusion VLM and a diffusion VLA. ☆108 Jan 14, 2026 Updated last month
- ☆34 Jan 25, 2026 Updated last month
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 Jan 26, 2025 Updated last year
- ☆17 Aug 17, 2024 Updated last year
- Plancraft is a minecraft environment and agent suite to test planning capabilities in LLMs ☆26 Nov 7, 2025 Updated 4 months ago
- Official implementation of the paper: "A deeper look at depth pruning of LLMs" ☆15 Jul 24, 2024 Updated last year
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆25 Oct 7, 2025 Updated 5 months ago
- A lightweight, reproducible toolkit for LLM-based query reformulation. ☆29 Jan 3, 2026 Updated 2 months ago
- MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs ☆39 Feb 19, 2026 Updated 2 weeks ago
- More reliable Video Understanding Evaluation ☆14 Sep 23, 2025 Updated 5 months ago
- ☆77 Nov 6, 2025 Updated 4 months ago
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025. ☆16 Oct 20, 2025 Updated 4 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation ☆16 Oct 27, 2024 Updated last year
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 Feb 26, 2026 Updated last week
- Extending context length of visual language models ☆12 Dec 18, 2024 Updated last year
- ☆28 Jun 5, 2025 Updated 9 months ago
- The benchmark and datasets of the ICML 2024 paper "VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual C… ☆17 May 27, 2024 Updated last year
- 6th Place Solution for the Google - Isolated Sign Language Recognition Kaggle Competition ☆13 May 4, 2023 Updated 2 years ago
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools. ☆46 Dec 17, 2025 Updated 2 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆21 Dec 22, 2025 Updated 2 months ago
- Code, benchmark and environment for "OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows" ☆38 Nov 10, 2025 Updated 3 months ago
- ☆13 May 26, 2022 Updated 3 years ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models" ☆18 Oct 1, 2024 Updated last year