The Source Code for OmniVideoBench @ICLR 2026
☆69Feb 12, 2026Updated last month
Alternatives and similar repositories for OmniVideoBench
Users that are interested in OmniVideoBench are comparing it to the libraries listed below.
- https://avocado-captioner.github.io/☆32Oct 16, 2025Updated 5 months ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆39Updated this week
- [CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding☆45Mar 16, 2026Updated last week
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs☆44Mar 6, 2026Updated 3 weeks ago
- OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆66Mar 20, 2026Updated last week
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models☆48Mar 20, 2026Updated last week
- [NeurIPS 2025] The official repository of "Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tun…☆40Feb 20, 2025Updated last year
- [NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent☆42Nov 30, 2025Updated 3 months ago
- ☆20Apr 23, 2024Updated last year
- A project for tri-modal LLM benchmarking and instruction tuning.☆56Mar 27, 2025Updated last year
- ☆13Jun 2, 2022Updated 3 years ago
- ☆27Mar 10, 2026Updated 2 weeks ago
- Official repository for the ICCV2023 paper SAFE: Sensitivity-Aware Features for Out-of-Distribution Object Detection☆13Jul 28, 2024Updated last year
- (ICLR 2025) AgentRefine: Enhancing Agent Generalization through Refinement Tuning☆19Nov 22, 2025Updated 4 months ago
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025.☆16Oct 20, 2025Updated 5 months ago
- SODA: Story Oriented Dense Video Captioning Evaluation Framework☆14May 3, 2024Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆21Dec 22, 2025Updated 3 months ago
- ☆40Jan 16, 2026Updated 2 months ago
- ☆18Apr 4, 2025Updated 11 months ago
- F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr…☆35Jul 3, 2025Updated 8 months ago
- video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is d…☆171Feb 23, 2026Updated last month
- build vgg16 with pytorch 0.4.0 for classification of CIFAR datasets☆10Mar 31, 2019Updated 6 years ago
- Official code for DeepSound-V1☆13May 14, 2025Updated 10 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆146Dec 26, 2024Updated last year
- Score-aligned loudness, beat, and expressive markings data for 2000 Chopin Mazurka recordings☆14Jul 6, 2023Updated 2 years ago
- [Nature Computational Science] ⚡️ PSE/PSRN: Fast and efficient symbolic expression discovery through paralleliz…☆21Feb 3, 2026Updated last month
- ☆41Mar 21, 2026Updated last week
- NJU Computer Network Lab☆12Jul 2, 2021Updated 4 years ago
- Nanjing University (NJU) Computer Networks Lab☆11Jun 21, 2021Updated 4 years ago
- ☆33May 27, 2025Updated 10 months ago
- The official code of our paper “RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation”☆27Aug 19, 2025Updated 7 months ago
- [MICCAI 2025] GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images☆15Mar 12, 2026Updated 2 weeks ago
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models☆24Jan 1, 2026Updated 2 months ago
- UMB: Understanding Model Behavior for Open-World Object Detection (NeurIPS 2024)☆11May 26, 2024Updated last year
- 🔥🔥 [NeurIPS 2025] Exploring and mitigating semantic hallucinations in scene text perception and reasoning☆27Dec 11, 2025Updated 3 months ago
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…☆10Feb 9, 2025Updated last year
- ☆41Dec 16, 2025Updated 3 months ago
- [Nature Communications] 🤖💡 LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal C…☆23Mar 8, 2026Updated 3 weeks ago
- code for downloading videos from HowTo100M dataset☆17May 13, 2021Updated 4 years ago