"Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"
☆62Jan 28, 2026Updated 2 months ago
Alternatives and similar repositories for Omni-R1
Users who are interested in Omni-R1 are comparing it to the libraries listed below.
- ☆12Mar 11, 2025Updated last year
- TBD☆53Mar 13, 2026Updated 3 weeks ago
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment☆36Mar 24, 2026Updated 2 weeks ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆76Dec 3, 2025Updated 4 months ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models☆28Mar 18, 2026Updated 3 weeks ago
- The demo page for ALMTokenizer☆59Apr 14, 2025Updated 11 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation☆23Jan 12, 2025Updated last year
- The evaluation code for A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5☆53Jan 18, 2026Updated 2 months ago
- Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs.☆57Mar 12, 2026Updated 3 weeks ago
- [CVPR 2026] Official repo for "EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation"☆53Mar 13, 2026Updated 3 weeks ago
- Open Ended Medical Reinforcement Learning☆42Mar 15, 2026Updated 3 weeks ago
- Whisper Speech Quality Assessment (WhiSQA)☆16Updated this week
- [CVPR 2026] Official code of "EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding"☆65Mar 7, 2026Updated last month
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams☆76Mar 15, 2026Updated 3 weeks ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective".☆64Nov 5, 2025Updated 5 months ago
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently☆37Feb 4, 2026Updated 2 months ago
- ComfyUI version of WithAnyone☆24Dec 18, 2025Updated 3 months ago
- small audio language model for reasoning☆85Dec 4, 2025Updated 4 months ago
- ☆40Mar 23, 2026Updated 2 weeks ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT☆132Jan 30, 2026Updated 2 months ago
- "AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation"☆42Jan 27, 2026Updated 2 months ago
- Spatial Aptitude Training for Multimodal Language Models☆27Feb 8, 2026Updated 2 months ago
- The official repository of paper "Evaluating MLLMs with Multimodal Multi-image Reasoning Benchmark"☆20Jun 20, 2025Updated 9 months ago
- ☆30Feb 24, 2026Updated last month
- AnyEnhance-based Baseline for the CCF-AATC 2025 Challenge Track 1☆50Dec 27, 2025Updated 3 months ago
- Collection of scripts from mHuBERT-147.☆34Nov 19, 2024Updated last year
- ☆21Aug 9, 2024Updated last year
- ☆11Oct 31, 2024Updated last year
- SPAgent, a foundation agent for understanding, reasoning over, and operating within the physical and spatial world.☆165Updated this week
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- Official code of SenSE.☆77Oct 30, 2025Updated 5 months ago
- Code for CVPR 2024 Oral "Neural Lineage"☆17Jun 18, 2024Updated last year
- RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models☆36Mar 29, 2026Updated last week
- Open, royalty free, lyrics2song / song generation data collection / cleaning pipeline.☆17May 9, 2025Updated 11 months ago
- Green-VLA: Staged Vision-Language-Action Model for Generalist Robots☆113Mar 5, 2026Updated last month
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration☆65Feb 21, 2025Updated last year
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".☆26Nov 4, 2023Updated 2 years ago
- Official repository for “Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space”☆18Jan 27, 2026Updated 2 months ago
- Testing sets for semanticVAD☆20Feb 18, 2025Updated last year