"Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning"
☆51 · Jan 28, 2026 · Updated last month
Alternatives and similar repositories for Omni-R1
Users who are interested in Omni-R1 compare it to the repositories listed below.
- Whisper Speech Quality Assessment (WhiSQA) ☆16 · Oct 14, 2025 · Updated 4 months ago
- ☆13 · Mar 11, 2025 · Updated 11 months ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆77 · Dec 3, 2025 · Updated 2 months ago
- The evaluation code for A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5 ☆50 · Jan 18, 2026 · Updated last month
- The demo page for ALMTokenizer ☆59 · Apr 14, 2025 · Updated 10 months ago
- Residual Context Diffusion (RCD): Repurposing discarded signals as structured priors for high-performance reasoning in dLLMs. ☆54 · Feb 11, 2026 · Updated 2 weeks ago
- TBD ☆40 · Feb 3, 2026 · Updated 3 weeks ago
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac… ☆34 · Jan 23, 2026 · Updated last month
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Jan 12, 2025 · Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ☆61 · Jan 28, 2026 · Updated last month
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model ☆22 · Aug 5, 2024 · Updated last year
- Official code of SenSE. ☆74 · Oct 30, 2025 · Updated 3 months ago
- A neural speech codec based on discrete WavLM representations ☆24 · Aug 28, 2024 · Updated last year
- AnyEnhance-based Baseline for the CCF-AATC 2025 Challenge Track 1 ☆44 · Dec 27, 2025 · Updated 2 months ago
- Collection of scripts from mHuBERT-147. ☆32 · Nov 19, 2024 · Updated last year
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding". ☆26 · Nov 4, 2023 · Updated 2 years ago
- small audio language model for reasoning ☆86 · Dec 4, 2025 · Updated 2 months ago
- This repository implements a novel zero-shot TTS framework, named Flamed-TTS, focusing on the efficient generation and dynamic pacing in … ☆57 · Aug 9, 2025 · Updated 6 months ago
- code for COLING paper "A Hybrid Model of Classification and Generation for Spatial Relation Extraction" ☆10 · Oct 20, 2022 · Updated 3 years ago
- Multi-step AI agents powered by Gemini 2.0 and the LangGraph framework. These agents orchestrate complex workflows and enhance their reas… ☆10 · Dec 19, 2024 · Updated last year
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆63 · Nov 5, 2025 · Updated 3 months ago
- SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis ☆147 · Jan 1, 2025 · Updated last year
- Software to enable data-rich collaboration from high-resolution display walls to your laptop ☆16 · Feb 19, 2026 · Updated last week
- Martingale posterior neural networks for fast sequential decision making @ NeurIPS 2025 ☆23 · Nov 13, 2025 · Updated 3 months ago
- AI-native knowledge kernel for human/agent collaboration. Use it as a Knowledge Base, Wiki, Annotator, Research Tool, or Agentic Memory. ☆29 · Updated this week
- SPAgent, a spatial intelligence agent designed to operate in the physical and spatial world. ☆127 · Updated this week
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆45 · Feb 9, 2026 · Updated 2 weeks ago
- ☆40 · Apr 2, 2025 · Updated 10 months ago
- A virtual musical instrument built using Google MediaPipe. ☆12 · Oct 10, 2022 · Updated 3 years ago
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆38 · Feb 17, 2025 · Updated last year
- The official implementation of COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence. ☆28 · Dec 30, 2025 · Updated last month
- ☆24 · Dec 19, 2025 · Updated 2 months ago
- Benchmark evaluating ocean forecasting systems against reference datasets and observations. ☆24 · Feb 20, 2026 · Updated last week
- ☆11 · Oct 31, 2024 · Updated last year
- ☆14 · Nov 19, 2024 · Updated last year
- ☆13 · Jul 3, 2024 · Updated last year
- Fast, free, easy, and object-agnostic video anonymization ☆11 · Dec 12, 2020 · Updated 5 years ago
- ☆13 · Oct 21, 2024 · Updated last year
- ☆31 · Feb 3, 2026 · Updated 3 weeks ago