This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆40Apr 28, 2026Updated last week
Alternatives and similar repositories for Daily-Omni
Users who are interested in Daily-Omni are comparing it to the repositories listed below.
- The Source Code for OmniVideoBench @ICLR 2026☆72Feb 12, 2026Updated 2 months ago
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆47Updated this week
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection☆13Apr 12, 2024Updated 2 years ago
- https://avocado-captioner.github.io/☆34Oct 16, 2025Updated 6 months ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"☆24Nov 1, 2025Updated 6 months ago
- Official code repository of Shuffle-R1☆25Feb 23, 2026Updated 2 months ago
- ☆29Sep 4, 2025Updated 8 months ago
- ☆19Jan 26, 2025Updated last year
- [ICCV 2025] Official PyTorch Code for "Describe, Adapt and Combine: Empowering CLIP Encoders for Open-set 3D Object Retrieval"☆17Aug 23, 2025Updated 8 months ago
- Nanjing University Computer Network Lab, Fall 2022☆22Jul 16, 2023Updated 2 years ago
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi…☆22Jun 12, 2025Updated 10 months ago
- ☆18Jul 25, 2025Updated 9 months ago
- Chinese Grammatical Error Diagnosis☆11Oct 26, 2021Updated 4 years ago
- Omni Model Benchmark with high quality and diversity, which reveals the Compositional Law. We’re now focused on Chinese scenarios — and a…☆78Jan 12, 2026Updated 3 months ago
- NJU Computer Network Lab☆12Jul 2, 2021Updated 4 years ago
- [ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"☆41Feb 24, 2026Updated 2 months ago
- Nanjing University (NJU) Computer Networks Lab☆11Jun 21, 2021Updated 4 years ago
- Python MusicXML parser to load mxml files as a pianoroll representation. The pianoroll i☆24May 13, 2022Updated 3 years ago
- OpenAI compatible API servers for the Qwen3 TTS models☆82Mar 6, 2026Updated 2 months ago
- UniVid: The Open-Source Unified Video Model☆32Oct 13, 2025Updated 6 months ago
- 🔥🔥[NeurIPS2025]Exploring and mitigating semantic hallucinations in scene text perception and reasoning☆28Dec 11, 2025Updated 4 months ago
- Code for Neural Volume Reconstruction for Coherent Synthetic Aperture Sonar in SIGGRAPH 2023☆23Oct 28, 2023Updated 2 years ago
- Adaptive Multimodal Reasoning via Reinforcement Learning☆23Jan 11, 2026Updated 3 months ago
- Imagen-mini for girl image generation☆12Nov 19, 2022Updated 3 years ago
- Collection of papers about video-audio understanding☆25Dec 26, 2025Updated 4 months ago
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- [ICML2025] Official code for "Reinforced Lifelong Editing for Language Models"☆21Feb 23, 2025Updated last year
- ☆17Aug 29, 2024Updated last year
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆24Oct 22, 2025Updated 6 months ago
- [Neurips'24 and 25] "Acoustic Volume Rendering for Neural Impulse Response Fields" and "Resounding Acoustic Fields With Reciprocity"☆79Dec 18, 2025Updated 4 months ago
- misuka: A differentiable room acoustic renderer☆38Apr 29, 2026Updated last week
- ☆13Apr 13, 2026Updated 3 weeks ago
- ☆28Jul 23, 2025Updated 9 months ago
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…☆11Oct 9, 2024Updated last year
- Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos☆70Sep 5, 2025Updated 8 months ago
- Transformer: PyTorch Implementation of "Attention Is All You Need"☆15Dec 13, 2023Updated 2 years ago
- Machine Learning Playground: mainly covers machine learning fundamentals, deep learning practice, and industrial applications.☆15Nov 14, 2022Updated 3 years ago
- Wenet speech to text for react native☆10Nov 1, 2022Updated 3 years ago
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…☆14Feb 15, 2023Updated 3 years ago