The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
☆88 · Jan 16, 2026 · Updated 2 months ago
Alternatives and similar repositories for Mantis
Users who are interested in Mantis are comparing it to the libraries listed below.
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding ☆36 · Jan 16, 2026 · Updated 2 months ago
- ☆39 · May 20, 2025 · Updated 10 months ago
- The repository provides code for EgoMAN model and dataset creation scripts. ☆28 · Dec 31, 2025 · Updated 2 months ago
- Towards Generalizable Robotic Manipulation in Dynamic Environments ☆34 · Updated this week
- Code for orthogonal neural operator ☆18 · Oct 15, 2023 · Updated 2 years ago
- Official Implementation of Paper: WMPO: World Model-based Policy Optimization for Vision-Language-Action Models ☆184 · Jan 4, 2026 · Updated 2 months ago
- EgoDex: Learning Dexterous Manipulation from Large-Scale Egocentric Video ☆178 · Aug 20, 2025 · Updated 7 months ago
- [Official] [IROS 2024] A goal-oriented planning to lift VLN performance for Closed-Loop Navigation: Simple, Yet Effective ☆28 · Apr 4, 2024 · Updated last year
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 · Feb 26, 2026 · Updated 3 weeks ago
- GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting (NOSSDAV 2025) ☆29 · Aug 15, 2025 · Updated 7 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆32 · Oct 30, 2025 · Updated 4 months ago
- [ICCV25] TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers ☆41 · Jul 23, 2025 · Updated 8 months ago
- [ICLR 2026] InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation ☆108 · Jan 27, 2026 · Updated last month
- [AAAI 2026] Official implementation of paper "UrbanNav: Learning Language-Guided Embodied Urban Navigation from Web-Scale Human Trajector… ☆46 · Jan 30, 2026 · Updated last month
- ☆43 · Jan 30, 2026 · Updated last month
- ☆35 · Nov 17, 2025 · Updated 4 months ago
- Code for IROS 2024 paper "AutoNeRF: Training Implicit Scene Representations with Autonomous Agents" ☆17 · Oct 24, 2024 · Updated last year
- Test Realtime FIR/IIR Filter using FMAC (Filter Math ACCcelerator). The FMAC unit is built around a fixed point multiplier and accumulato… ☆13 · Nov 10, 2021 · Updated 4 years ago
- ☆14 · Oct 11, 2023 · Updated 2 years ago
- This repository contains the code and data for the paper "Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents wit… ☆58 · Mar 14, 2026 · Updated last week
- Hands-On Image Processing with Python, Second Edition, Published by Packt ☆27 · Updated this week
- 🚗🗣️📡🗾🏁 A framework for navigation tasks that can build the 3D scene graph in real-time and utilize large language model (LLM) to gui… ☆24 · Oct 14, 2024 · Updated last year
- Test-time Scaling for VAR models ☆31 · Sep 19, 2025 · Updated 6 months ago
- ☆35 · Jan 25, 2026 · Updated last month
- ☆20 · May 6, 2022 · Updated 3 years ago
- Adding confidence to the SPIN mesh. ☆13 · Jun 25, 2023 · Updated 2 years ago
- ☆31 · Sep 12, 2025 · Updated 6 months ago
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated 2 months ago
- A MCP Task Server ☆11 · Mar 7, 2025 · Updated last year
- More reliable Video Understanding Evaluation ☆14 · Sep 23, 2025 · Updated 5 months ago
- ☆33 · Jul 15, 2025 · Updated 8 months ago
- Official implementation of Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (Ne… ☆54 · Dec 20, 2024 · Updated last year
- ☆10 · Feb 3, 2026 · Updated last month
- ☆12 · Jun 11, 2025 · Updated 9 months ago
- Source code for [ECCV2024]O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation ☆23 · Mar 23, 2025 · Updated last year
- The collections of MOE (Mixture Of Expert) papers, code and tools, etc. ☆12 · Mar 15, 2024 · Updated 2 years ago
- H-RDT: Human Manipulation Enhanced Bimanual Robotic Manipulation ☆127 · Dec 21, 2025 · Updated 3 months ago
- [AAMAS 2026] Don’t Blind Your VLA: Aligning Visual Representations for OOD Generalization. https://blind-vla-paper.github.io ☆61 · Jan 25, 2026 · Updated last month
- Official code of "HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud … ☆25 · Oct 31, 2025 · Updated 4 months ago