VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
☆58 · Mar 24, 2026 · Updated last month
Alternatives and similar repositories for VideoDetective
Users that are interested in VideoDetective are comparing it to the libraries listed below.
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models ☆24 · Apr 18, 2026 · Updated 2 weeks ago
- [EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs ☆13 · Dec 28, 2024 · Updated last year
- Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments" ☆15 · Aug 30, 2021 · Updated 4 years ago
- ✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi… ☆78 · Apr 28, 2025 · Updated last year
- ThalamusDB: semantic query processing on multimodal data ☆115 · Aug 27, 2025 · Updated 8 months ago
- TTRV: Test-Time Reinforcement Learning for Vision–Language Models (CVPR 2026) ☆39 · Mar 8, 2026 · Updated last month
- ☆14 · Feb 26, 2024 · Updated 2 years ago
- Verify MAPPO in task 'simple_spread_v3' ☆15 · Aug 10, 2024 · Updated last year
- ☆15 · Jul 10, 2019 · Updated 6 years ago
- Benchmarking Semantic Query Processing Engines ☆53 · Updated this week
- [CVPR 2026] UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models ☆37 · Feb 21, 2026 · Updated 2 months ago
- ☆33 · Feb 12, 2026 · Updated 2 months ago
- Sparrow: Data-Efficient Video-LLM with Text-to-Image Augmentation ☆32 · Mar 28, 2025 · Updated last year
- OpenAI's Gym Car-Racing-V0 environment was tackled and, subsequently, solved using a variety of Reinforcement Learning methods including … ☆22 · Aug 7, 2022 · Updated 3 years ago
- [NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression ☆52 · Mar 13, 2026 · Updated last month
- ☆15 · Aug 12, 2022 · Updated 3 years ago
- ☆54 · Jan 5, 2026 · Updated 4 months ago
- [ACM Multimedia 2025] "Multi-Agent System for Comprehensive Soccer Understanding" ☆74 · Oct 31, 2025 · Updated 6 months ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model ☆16 · Apr 7, 2026 · Updated 3 weeks ago
- [CVPR 2026] FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning ☆52 · Mar 26, 2026 · Updated last month
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms ☆25 · Dec 21, 2025 · Updated 4 months ago
- Hefei University of Technology computer science hardware comprehensive design, simplified version: single-cycle MIPS CPU ☆37 · Feb 16, 2023 · Updated 3 years ago
- Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024. ☆16 · Oct 25, 2024 · Updated last year
- A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning ☆37 · Mar 12, 2026 · Updated last month
- EAT-NAS: Elastic Architecture Transfer for Accelerating Large-scale Neural Architecture Search ☆25 · Apr 10, 2019 · Updated 7 years ago
- A Multi-Agent Approach Integrating Socratic Guidance for Automated Prompt Optimization ☆18 · Dec 15, 2025 · Updated 4 months ago
- ∞-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation ☆21 · Feb 14, 2025 · Updated last year
- ☆24 · Aug 9, 2025 · Updated 8 months ago
- Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion ☆128 · Mar 12, 2026 · Updated last month
- [ECCV 2024] Official code repository of paper titled "Efficient 3D-Aware Facial Image Editing Via Attribute-Specific Prompt Learning" ☆10 · Aug 2, 2024 · Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows ☆20 · Nov 4, 2025 · Updated 6 months ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24] ☆10 · Jul 22, 2024 · Updated last year
- [arXiv'25] LiCoMemory: Lightweight and Cognitive Agentic Memory for Efficient Long-Term Reasoning ☆42 · Jan 6, 2026 · Updated 3 months ago
- Multigranularity Contrastive cross-modal collaborative Generation (MCG) model for Video QA ☆12 · Dec 13, 2023 · Updated 2 years ago
- [ICLR 2026] Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks ☆30 · Feb 5, 2026 · Updated 3 months ago
- The source code of Mem-Gallery: Benchmarking Multimodal Long-Term Conversational Memory for MLLM Agents. ☆52 · Jan 31, 2026 · Updated 3 months ago
- [CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding ☆59 · Mar 16, 2026 · Updated last month
- InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem ☆21 · Apr 7, 2026 · Updated 3 weeks ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆22 · Sep 24, 2025 · Updated 7 months ago