[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
☆64Jun 25, 2025Updated 10 months ago
Alternatives and similar repositories for StreamMind
Users who are interested in StreamMind are comparing it to the repositories listed below.
- VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)☆651Nov 26, 2025Updated 5 months ago
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025☆106Mar 14, 2025Updated last year
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction☆172Mar 23, 2025Updated last year
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant☆26Dec 2, 2025Updated 4 months ago
- [CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?☆138Jul 24, 2025Updated 9 months ago
- Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…☆44Feb 5, 2025Updated last year
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online☆94Oct 7, 2025Updated 6 months ago
- LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)☆441Oct 29, 2025Updated 6 months ago
- [GCPR 2023] UGainS: Uncertainty Guided Anomaly Instance Segmentation☆16Jul 31, 2024Updated last year
- StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding☆156May 16, 2025Updated 11 months ago
- [NeurIPS'2025] Official repository for "LiveStar: Live Streaming Assistant for Real-World Online Video Understanding"☆115Nov 25, 2025Updated 5 months ago
- ☆52Feb 25, 2026Updated 2 months ago
- The implementation of RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization☆21May 26, 2025Updated 11 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated last year
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]☆73Apr 15, 2026Updated 2 weeks ago
- [ICLR 2024] DMBP: Diffusion Model-Based Predictor for Robust Offline Reinforcement Learning against State Observations Perturbations.☆17May 24, 2024Updated last year
- [CVPR 2025] EgoLife: Towards Egocentric Life Assistant☆419Mar 19, 2025Updated last year
- Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"☆29Apr 16, 2024Updated 2 years ago
- (ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos"☆49Jul 1, 2025Updated 10 months ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model☆17Feb 13, 2025Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆44Mar 11, 2025Updated last year
- Repository of Streaming LLMs☆51Apr 16, 2026Updated 2 weeks ago
- Official Implementation of Video-MA2MBA☆12Dec 3, 2024Updated last year
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning (NeurIPS 2023)☆28Sep 24, 2023Updated 2 years ago
- A Comprehensive and Versatile Open-Source Federated Learning Framework☆33Apr 3, 2023Updated 3 years ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model☆89Nov 27, 2025Updated 5 months ago
- ☆72Apr 22, 2026Updated last week
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement [ACL 2026 Findings]"☆52Apr 7, 2026Updated 3 weeks ago
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆48Apr 3, 2025Updated last year
- ☆19Oct 28, 2025Updated 6 months ago
- GFGE☆15Sep 7, 2022Updated 3 years ago
- ☆12Apr 12, 2026Updated 2 weeks ago
- ☆50Jun 26, 2025Updated 10 months ago
- The ReprGesture entry to the GENEA Challenge 2022 (IMCI 2022)☆16Nov 8, 2022Updated 3 years ago
- A Population Based Reinforcement Learning Library based on PyTorch☆27Mar 5, 2023Updated 3 years ago
- ☆26Jun 20, 2024Updated last year
- TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models☆40Nov 10, 2024Updated last year
- Open-source strong baseline for domain generalization re-ID. We will update the strong baseline and CFD method~☆10Nov 30, 2021Updated 4 years ago
- ☆10Jan 5, 2020Updated 6 years ago