[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
☆63 · Jun 25, 2025 · Updated 9 months ago
Alternatives and similar repositories for StreamMind
Users who are interested in StreamMind are comparing it to the libraries listed below.
- VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024) ☆657 · Nov 26, 2025 · Updated 4 months ago
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR 2025 ☆102 · Mar 14, 2025 · Updated last year
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction ☆170 · Mar 23, 2025 · Updated last year
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant ☆27 · Dec 2, 2025 · Updated 4 months ago
- [CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? ☆130 · Jul 24, 2025 · Updated 8 months ago
- ☆18 · Aug 7, 2025 · Updated 8 months ago
- Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact… ☆44 · Feb 5, 2025 · Updated last year
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online ☆95 · Oct 7, 2025 · Updated 6 months ago
- This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams" ☆273 · Oct 15, 2025 · Updated 5 months ago
- StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding ☆152 · May 16, 2025 · Updated 10 months ago
- ☆51 · Feb 25, 2026 · Updated last month
- PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation [NeurIPS 2025] ☆18 · Oct 11, 2025 · Updated 6 months ago
- The implementation of RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization ☆21 · May 26, 2025 · Updated 10 months ago
- Code Repository for Research Article Titled - "Omnidirectional Video Super-Resolution using Deep Learning" ☆14 · Apr 16, 2023 · Updated 2 years ago
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" ☆67 · Updated this week
- ☆56 · Updated this week
- [CVPR 2025] EgoLife: Towards Egocentric Life Assistant ☆409 · Mar 19, 2025 · Updated last year
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision" ☆29 · Apr 16, 2024 · Updated last year
- [ICCV 2025] VLM4D: Towards Spatiotemporal Awareness in Vision Language Models ☆43 · Nov 20, 2025 · Updated 4 months ago
- (ICCV 2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆47 · Jul 1, 2025 · Updated 9 months ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆17 · Feb 13, 2025 · Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆44 · Mar 11, 2025 · Updated last year
- Official Implementation of Video-MA2MBA ☆12 · Dec 3, 2024 · Updated last year
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model' ☆33 · Nov 7, 2023 · Updated 2 years ago
- [IEEE VR'22] SPAA: Stealthy Projector-based Adversarial Attacks on Deep Image Classifiers ☆12 · Jun 21, 2025 · Updated 9 months ago
- A Comprehensive and Versatile Open-Source Federated Learning Framework ☆33 · Apr 3, 2023 · Updated 3 years ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆87 · Nov 27, 2025 · Updated 4 months ago
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement [ACL 2026 Findings]" ☆53 · Updated this week
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models" ☆46 · Apr 3, 2025 · Updated last year
- ☆19 · Oct 28, 2025 · Updated 5 months ago
- Disentangled Pre-training for Human-Object Interaction Detection ☆27 · Sep 17, 2025 · Updated 6 months ago
- 2025 Shenzhen University Office Area Campus Network New Version Login Script ☆11 · Jan 17, 2025 · Updated last year
- GFGE ☆15 · Sep 7, 2022 · Updated 3 years ago
- Depth-Guided Dynamic Neural Radiance Field using RGB-D data ☆16 · Apr 4, 2023 · Updated 3 years ago
- ☆12 · Jun 26, 2024 · Updated last year
- ☆49 · Jun 26, 2025 · Updated 9 months ago
- The ReprGesture entry to the GENEA Challenge 2022 (IMCI 2022) ☆16 · Nov 8, 2022 · Updated 3 years ago
- Evaluation metrics and submission file creation scripts for the Action Recognition challenge ☆15 · Feb 9, 2026 · Updated 2 months ago
- A Population Based Reinforcement Learning Library based on PyTorch ☆27 · Mar 5, 2023 · Updated 3 years ago