[NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
☆124Dec 3, 2025Updated 6 months ago
Alternatives and similar repositories for Omni-R1
Users that are interested in Omni-R1 are comparing it to the libraries listed below.
- [ICML2026] ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO☆81Apr 30, 2026Updated 2 months ago
- [NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples☆67Oct 29, 2024Updated last year
- [ICLR 2025 Spotlight] Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions☆45Mar 10, 2025Updated last year
- [3DV 2026] Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting☆162Dec 9, 2025Updated 6 months ago
- [NeurIPS 2025 Spotlight] A Generalist Diffusion Model for Vision Perception☆316Sep 21, 2025Updated 9 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆49May 7, 2026Updated last month
- [ICLR 2024] Official PyTorch/Diffusers implementation of "Object-aware Inversion and Reassembly for Image Editing"☆87Aug 23, 2024Updated last year
- [ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation☆92Sep 29, 2025Updated 9 months ago
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models☆57May 20, 2026Updated last month
- [2026 AAAI] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation☆20Nov 8, 2025Updated 7 months ago
- Image Tokenizer Needs Post-Training☆24Oct 4, 2025Updated 8 months ago
- [ICLR 2024 Spotlight] The official repo for the paper "De novo Protein Design using Geometric Vector Field Networks".☆31Aug 23, 2024Updated last year
- SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting☆57Jul 21, 2025Updated 11 months ago
- Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts"☆22Jun 28, 2025Updated last year
- [ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation".☆498Jan 9, 2025Updated last year
- Yet another Zhejiang University project reports template written in Typst☆121Nov 4, 2025Updated 7 months ago
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (ICML 2026)☆43Updated this week
- Official implementation of EgoThinker at NIPS 2025☆29Nov 25, 2025Updated 7 months ago
- NoughtQ's notebook☆126Jun 23, 2026Updated last week
- Matting by Generation☆36Aug 4, 2024Updated last year
- [CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice☆87Feb 27, 2026Updated 4 months ago
- Reasoning in Space via Grounding in the World (ICLR 2025)☆55Nov 3, 2025Updated 7 months ago
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆41Jun 14, 2025Updated last year
- Empowering Data Driven insights through hands-on projects, SQL challenges and practical tools.☆24May 30, 2026Updated last month
- [ICCV2023] 🧊FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models☆131Aug 23, 2024Updated last year
- MCP server providing tools to create Ms Office documents like presentations, emails, spreadsheets and word docs (pptx, docx, eml, xlsx)☆30Jun 22, 2026Updated last week
- ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding☆11Aug 12, 2022Updated 3 years ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆29Jan 23, 2024Updated 2 years ago
- The official repo of the paper titled DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction.☆23May 25, 2026Updated last month
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]☆190Jun 5, 2025Updated last year
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆93Apr 20, 2026Updated 2 months ago
- Structured Video Comprehension of Real-World Shorts☆238Sep 21, 2025Updated 9 months ago