ModalityDance/Omni-R1

[ACL 2026 Findings] "Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"

☆63

Alternatives and similar repositories for Omni-R1

Users interested in Omni-R1 are comparing it to the repositories listed below.

ModalityDance / AR-Omni
"AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation"
☆43 · May 26, 2026 · Updated last month
ModalityDance / MRM
[SIGIR 2026] "One Adapts to Any: Meta Reward Modeling for Personalized LLM Alignment"
☆15 · Apr 21, 2026 · Updated 3 months ago
ModalityDance / LatentTTS
"Parallel Test-Time Scaling for Latent Reasoning Models"
☆22 · Apr 12, 2026 · Updated 3 months ago
ModalityDance / Awesome-Agent-as-a-Judge
"A Survey on Agent-as-a-Judge"
☆138 · May 11, 2026 · Updated 2 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
yangdongchao / ALMTokenizer2
The open source code of ALMTokenizer2: Towards Low bit-rate and Semantic-rich Audio Tokenizer with Flow-based Scalar Diffusion Transforme…
☆45 · Sep 5, 2025 · Updated 10 months ago
yangdongchao / Omni-AutoThink
Adaptive Multimodal Reasoning via Reinforcement Learning
☆23 · Jan 11, 2026 · Updated 6 months ago
NOVAglow646 / Monet
[CVPR 2026] Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language"
☆207 · Mar 19, 2026 · Updated 4 months ago
ShareLab-SII / CaTok
[CVPR-26] Official repository of "CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization"
☆19 · Mar 9, 2026 · Updated 4 months ago
liyongqi67 / LTRGR
☆21 · Aug 9, 2024 · Updated last year
DualityRL / multi-attempt
☆19 · Mar 10, 2025 · Updated last year
snap-research / VIMI
☆13 · Jul 10, 2024 · Updated 2 years ago
channel-io / ch-tts-llasa-rl-grpo
☆50 · Apr 20, 2026 · Updated 3 months ago
VincentLeebang / lvr
Official codebase for the paper Latent Visual Reasoning
☆169 · Oct 22, 2025 · Updated 8 months ago
zsgvivo / VideoZoomer
☆34 · Feb 12, 2026 · Updated 5 months ago
Lux0926 / ASPRM
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
☆10 · Mar 2, 2025 · Updated last year
lavendery / AudioComposer
☆27 · Sep 10, 2025 · Updated 10 months ago
lonzi / mrflow_dpo
☆22 · Jan 3, 2026 · Updated 6 months ago
hemingkx / SWIFT
[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
☆70 · Feb 21, 2025 · Updated last year
bebr2 / RACE
Code for RACE.
☆15 · Nov 12, 2025 · Updated 8 months ago
SKYLENAGE-AI / DeepVision-103K
Codebase for DeepVision-103K
☆22 · Feb 21, 2026 · Updated 5 months ago
thuml / Reasoning-Visual-World
Official repository for "Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models", https://arxiv.org/abs/2601.1983…
☆100 · Mar 9, 2026 · Updated 4 months ago
hemingkx / Whisper
[ACL 2026] Enabling Efficient Reasoning in LLMs via Black-box Persuasive Prompting
☆22 · Jan 9, 2026 · Updated 6 months ago
hwanyu112 / Latent-Sketchpad
☆73 · Feb 1, 2026 · Updated 5 months ago
pengzhendong / audiolab
A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)
☆39 · Mar 31, 2026 · Updated 3 months ago
liyongqi67 / GCoQA
☆18 · Jun 24, 2025 · Updated last year
ituvisionlab / EdVAE
Official PyTorch implementation of "EdVAE: Mitigating Codebook Collapse with Evidential Discrete Variational Autoencoders"
☆14 · Sep 20, 2024 · Updated last year
yangdongchao / ALMTokenizer
The demo page for ALMTokenizer
☆59 · Apr 14, 2025 · Updated last year
lavendery / UUG
☆21 · Sep 14, 2025 · Updated 10 months ago
sen-ye / R3
[ICLR26] Understanding VS. Generation: Navigating Optimization Dilemma in Multimodal Models
☆25 · May 6, 2026 · Updated 2 months ago
ThinkMorph / ThinkMorph
[ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"
☆190 · May 1, 2026 · Updated 2 months ago
hemingkx / SpecDec
Codes for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)
☆47 · Dec 9, 2023 · Updated 2 years ago
JianyuanZhong / StableDRL
☆15 · Updated this week
flamed-tts / Flamed-TTS
This repository implement a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in …
☆57 · Aug 9, 2025 · Updated 11 months ago
InternLM / EndoCoT
[ECCV 2026] An official implementation of "EndoCoT". Scaling endogenous Chain-of-Thought (CoT) reasoning in diffusion models for complex …
☆43 · Jun 26, 2026 · Updated 3 weeks ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
AbrahamSanders / codec-bpe
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76 · Dec 3, 2025 · Updated 7 months ago
shiwk24 / MathCanvas
This is the official repository for the paper "MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning"
☆79 · Apr 14, 2026 · Updated 3 months ago
yoongi43 / VRVQ
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11 · Apr 10, 2025 · Updated last year