[NeurIPS 2025] Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
☆120 · Dec 3, 2025 · Updated 4 months ago
Alternatives and similar repositories for Omni-R1
Users that are interested in Omni-R1 are comparing it to the libraries listed below.
- ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO ☆80 · Mar 25, 2026 · Updated last month
- [NeurIPS'24] A Simple Image Segmentation Framework via In-Context Examples ☆66 · Oct 29, 2024 · Updated last year
- [ICLR 2025 Spotlight] Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions ☆43 · Mar 10, 2025 · Updated last year
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆47 · Updated this week
- [ICML 2024] Floating Anchor Diffusion Model for Multi-motif Scaffolding ☆34 · Aug 23, 2024 · Updated last year
- ☆13 · May 17, 2025 · Updated 11 months ago
- [ICLR 2024] Official PyTorch/Diffusers implementation of "Object-aware Inversion and Reassembly for Image Editing" ☆87 · Aug 23, 2024 · Updated last year
- ☆12 · Mar 22, 2025 · Updated last year
- ☆18 · Apr 4, 2025 · Updated last year
- ☆15 · May 30, 2024 · Updated last year
- ☆14 · Apr 25, 2025 · Updated last year
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models ☆52 · Mar 20, 2026 · Updated last month
- [2026 AAAI] Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation ☆20 · Nov 8, 2025 · Updated 5 months ago
- Image Tokenizer Needs Post-Training ☆24 · Oct 4, 2025 · Updated 6 months ago
- [ICLR 2024 Spotlight] The official repo for the paper "De novo Protein Design using Geometric Vector Field Networks". ☆31 · Aug 23, 2024 · Updated last year
- SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting ☆57 · Jul 21, 2025 · Updated 9 months ago
- Official implementation for "Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts" ☆22 · Jun 28, 2025 · Updated 10 months ago
- Official codes for the paper "GARDO: Reinforcing Diffusion Models without Reward Hacking" ☆57 · Feb 2, 2026 · Updated 2 months ago
- [ICLR'25] Official PyTorch implementation of "Framer: Interactive Frame Interpolation". ☆501 · Jan 9, 2025 · Updated last year
- The official implementation of "PixelThink: Towards Efficient Chain-of-Pixel Reasoning" (arXiv 2025) ☆41 · May 30, 2025 · Updated 11 months ago
- [ICLR2025] GenPercept: Diffusion Models Trained with Large Data Are Transferable Visual Models ☆221 · Jan 24, 2025 · Updated last year
- Official implementation of EgoThinker at NIPS 2025 ☆26 · Nov 25, 2025 · Updated 5 months ago
- NoughtQ's notebook ☆117 · Updated this week
- [ICLR'25] MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences ☆323 · Aug 10, 2024 · Updated last year
- [ICCV 2025] Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation ☆92 · Sep 29, 2025 · Updated 7 months ago
- Reasoning in Space via Grounding in the World (ICLR 2025) ☆52 · Nov 3, 2025 · Updated 5 months ago
- EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning [🔥The Exploration of R1 for General Audio-Vi… ☆78 · May 18, 2025 · Updated 11 months ago
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models ☆41 · Jun 14, 2025 · Updated 10 months ago
- Empowering Data Driven insights through hands-on projects, SQL challenges and practical tools. ☆24 · Mar 7, 2026 · Updated last month
- MCP server providing tools to create Ms Office documents like presentations, emails, spreadsheets and word docs (pptx, docx, eml, xlsx) ☆25 · Apr 11, 2026 · Updated 2 weeks ago
- ACM MM 2022 - PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding ☆11 · Aug 12, 2022 · Updated 3 years ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones". ☆29 · Jan 23, 2024 · Updated 2 years ago
- The official repo of the paper titled DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction. ☆23 · Apr 10, 2026 · Updated 3 weeks ago
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models ☆74 · Apr 20, 2026 · Updated last week
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆187 · Jun 5, 2025 · Updated 10 months ago
- ☆21 · Feb 29, 2024 · Updated 2 years ago
- Structured Video Comprehension of Real-World Shorts ☆237 · Sep 21, 2025 · Updated 7 months ago
- [CVPR 2024] Official PyTorch implementation of FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition ☆176 · Sep 1, 2025 · Updated 8 months ago
- [EMNLP 2024 Industry track] MERLIN : Multimodal Embedding Refinement via LLM-based Iterative Navigation for Text-Video Retrieval-Rerank P… ☆14 · Mar 4, 2025 · Updated last year