Official code of "RoboOmni: Proactive Robot Manipulation in Omni-modal Context"
☆82Nov 17, 2025Updated 3 months ago
Alternatives and similar repositories for RoboOmni
Users that are interested in RoboOmni are comparing it to the libraries listed below
Sorting:
- ☆12Jun 11, 2025Updated 8 months ago
- multi-agent crafter for cooperative tasks☆13Aug 2, 2025Updated 7 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated 11 months ago
- 用Kinova Gen3实机实现Rekep☆11Mar 18, 2025Updated 11 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆21Dec 22, 2025Updated 2 months ago
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".☆18Apr 25, 2025Updated 10 months ago
- Personal Experiment around ReKep☆17Feb 3, 2025Updated last year
- Manipulate-Anything: Automating Real-World Robots using Vision-Language Models [CoRL 2024]☆54Apr 3, 2025Updated 11 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- ☆33Jun 3, 2025Updated 9 months ago
- Vocabulary Parallelism☆25Mar 10, 2025Updated 11 months ago
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆29Jul 24, 2025Updated 7 months ago
- The official implementation of Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight☆82Jan 16, 2026Updated last month
- [CVPRW 2024] LaPA: Latent Prompt Assist Model For Medical Visual Question Answering☆25Apr 24, 2025Updated 10 months ago
- [NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs☆23Oct 15, 2024Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"☆59Jan 5, 2026Updated last month
- a fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.☆37Apr 7, 2025Updated 10 months ago
- Delta-CoMe can achieve near loss-less 1-bit compressin which has been accepted by NeurIPS 2024☆58Nov 16, 2024Updated last year
- Source code of paper: A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models. (ICML 2025)☆36Apr 2, 2025Updated 11 months ago
- ☆40Jul 15, 2025Updated 7 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Feb 24, 2026Updated last week
- OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆55Feb 1, 2026Updated last month
- OpenS2S : Advancing Fully Open-Source End-to-End Empathetic Large Speech Language Model☆112Jul 17, 2025Updated 7 months ago
- InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy☆377Feb 11, 2026Updated 2 weeks ago
- ☆67Dec 7, 2025Updated 2 months ago
- StereoVLA is powered by stereo vision and supports flexible deployment with high tolerance to camera pose variations.☆52Jan 12, 2026Updated last month
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding"☆57Jan 23, 2026Updated last month
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- ☆18Jun 10, 2025Updated 8 months ago
- [NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning☆74Dec 14, 2025Updated 2 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆41Jan 26, 2026Updated last month
- [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges☆83Feb 27, 2025Updated last year
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆30Feb 24, 2026Updated last week
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- 哈尔滨工业大学2023春季学期编译系统课程实验、习题、课件以及期末复习材料☆11Jul 30, 2023Updated 2 years ago
- This is the official code of the paper: Facial Micro-motion-aware Mixup for Micro-expression Recognition☆13Apr 7, 2024Updated last year
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆30Oct 2, 2025Updated 5 months ago