Code of LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
☆31Nov 24, 2025Updated 4 months ago
Alternatives and similar repositories for LVAgent
Users that are interested in LVAgent are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos☆12Jun 11, 2024Updated last year
- [NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM☆24Feb 10, 2026Updated 2 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆27Jan 14, 2026Updated 3 months ago
- [NeurIPS2023] Neural-Logic Human-Object Interaction Detection☆14Aug 24, 2024Updated last year
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H…☆16May 21, 2024Updated last year
- Bare Metal GPUs on DigitalOcean Gradient AI • AdPurpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
- [NeurIPS'25] ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding☆52Sep 21, 2025Updated 6 months ago
- ☆38May 28, 2025Updated 10 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models☆53Jun 12, 2025Updated 10 months ago
- This is the official repo for Contrastive Vision-Language Alignment Makes Efficient Instruction Learner.☆20Dec 1, 2023Updated 2 years ago
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"☆29Apr 16, 2024Updated 2 years ago
- official code for unigame☆19Nov 26, 2025Updated 4 months ago
- ☆22Oct 21, 2024Updated last year
- Entity-Aware and Motion-Aware Transformers for Language-driven Action Localization(IJCAI-22)☆12Oct 11, 2022Updated 3 years ago
- ☆12Jun 19, 2024Updated last year
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Official Implementation of SnAG (CVPR 2024)☆59Apr 26, 2025Updated 11 months ago
- ☆23Aug 20, 2024Updated last year
- Open Set Video HOI detection from Action-centric Chain-of-Look Prompting, ICCV2023☆12Oct 3, 2023Updated 2 years ago
- Med-DANet Series (ECCV 2022 & WACV 2024)☆13Jan 2, 2024Updated 2 years ago
- [CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning☆73Mar 25, 2026Updated 3 weeks ago
- [NeurIPS 25] The official implementation of SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning☆27Sep 21, 2025Updated 6 months ago
- [AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding☆124Nov 12, 2025Updated 5 months ago
- SkillX: Automatically Constructing Skill Knowledge Bases for Agents☆56Apr 13, 2026Updated last week
- [ACL'26 Findings] Official code for "BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search"☆23Apr 13, 2026Updated last week
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆11Jun 27, 2023Updated 2 years ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models☆30Mar 18, 2026Updated last month
- ☆10Dec 3, 2024Updated last year
- ☆16Sep 11, 2025Updated 7 months ago
- Sparking "Thinking with Videos" via Reinforcement Learning☆156Oct 30, 2025Updated 5 months ago
- [ICCV 2025] The official pytorch implement of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".☆22Oct 28, 2025Updated 5 months ago
- Offical implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"☆32Apr 20, 2025Updated last year
- ☆53Feb 9, 2026Updated 2 months ago
- ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25)☆21Apr 2, 2025Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- The official implementation of Hard Negative Sampling via Large Language Models for Recommendation.☆11Jan 17, 2026Updated 3 months ago
- ☆13Jul 3, 2024Updated last year
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆125Updated this week
- ☆18May 7, 2025Updated 11 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆42Jan 27, 2026Updated 2 months ago
- ☆44Jan 4, 2026Updated 3 months ago
- This repo holds the official code for the paper "FreMIM: Fourier Transform Meets Masked Image Modeling for Medical Image Segmentation".☆24Jan 2, 2024Updated 2 years ago