Lliar-liar / Daily-Omni
This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities
☆36 Jul 4, 2025 Updated 7 months ago
Alternatives and similar repositories for Daily-Omni
Users who are interested in Daily-Omni are comparing it to the repositories listed below:
- The Source Code for OmniVideoBench ☆57 Updated this week
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆38 Jan 26, 2026 Updated 3 weeks ago
- Implementation for "StyleGAN-Canvas: Augmenting StyleGAN3 for Real-Time Human-AI Co-Creation" ☆11 May 24, 2023 Updated 2 years ago
- SemEval2026 Task 3 DimABSA ☆26 Feb 10, 2026 Updated last week
- Virtual character locomotion system. See the article “Motion Graphs”, Lucas Kovar, 2002 ☆12 Mar 1, 2012 Updated 13 years ago
- https://avocado-captioner.github.io/ ☆29 Oct 16, 2025 Updated 4 months ago
- ☆39 Jan 16, 2026 Updated last month
- MiniLM (BERT) embeddings from scratch ☆17 Aug 14, 2025 Updated 6 months ago
- Wenet speech to text for react native ☆10 Nov 1, 2022 Updated 3 years ago
- YOURLS plugin that allows you to change the default behaviour of YOURLS to send 302 redirects instead of 301. ☆12 Nov 22, 2021 Updated 4 years ago
- Live media content delivery network based on the WebRTC protocol. ☆13 Jun 12, 2025 Updated 8 months ago
- Website for release of TellMeWhy dataset for why question answering ☆14 Nov 11, 2022 Updated 3 years ago
- [COLM 2024] LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models ☆14 Jan 4, 2025 Updated last year
- [ICML 2025 Oral] This is the official repository of the paper "What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensi… ☆20 Jun 12, 2025 Updated 8 months ago
- Web app for makeup transfer using Stable Diffusion ☆10 Sep 11, 2023 Updated 2 years ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning" ☆20 Nov 1, 2025 Updated 3 months ago
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec" ☆13 Jan 27, 2025 Updated last year
- ☆13 Jan 8, 2021 Updated 5 years ago
- PyTorch tutorial for M1 students. This repository includes code for building an Encoder-Decoder model and a classification model. ☆12 Jun 1, 2022 Updated 3 years ago
- Real-time melgan based on cpu !!! ☆13 Dec 3, 2019 Updated 6 years ago
- ☆11 May 7, 2022 Updated 3 years ago
- ☆12 Jan 24, 2024 Updated 2 years ago
- [WIP] Python port/rewrite of pbrt, the physically based renderer by Matt Pharr and Greg Humphreys ☆13 May 19, 2013 Updated 12 years ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆17 Apr 2, 2025 Updated 10 months ago
- [ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos ☆24 Aug 8, 2025 Updated 6 months ago
- Data and models for Misinfo Reaction Frames paper. ☆14 Jun 9, 2024 Updated last year
- Source code and data for the EDM 2022 paper ☆12 May 16, 2022 Updated 3 years ago
- Chainer implementation of StackGAN ☆13 Mar 28, 2018 Updated 7 years ago
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc… ☆16 Feb 22, 2025 Updated 11 months ago
- ECIR 2024: Sparse lexical representation for image-text retrieval ☆12 Jul 8, 2024 Updated last year
- LMM for VQA, tcsvt version ☆11 Jul 19, 2024 Updated last year
- A cage-based deformation for meshes in 2D. ☆14 Sep 8, 2018 Updated 7 years ago
- A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling ☆15 Dec 5, 2023 Updated 2 years ago
- Create Persona dataset from reddit en movie category comment ☆11 Aug 6, 2021 Updated 4 years ago
- Collection of papers about video-audio understanding ☆22 Dec 26, 2025 Updated last month
- Imagen-mini for girl image generation ☆12 Nov 19, 2022 Updated 3 years ago
- ☆28 Sep 4, 2025 Updated 5 months ago
- Code for paper "Conversational Product Search Based on Negative Feedback" ☆12 Jun 26, 2020 Updated 5 years ago
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?" ☆15 Jan 25, 2024 Updated 2 years ago