A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for streaming video.
☆180 · Jun 10, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for Awesome-VLM-Streaming-Video
Users who are interested in Awesome-VLM-Streaming-Video are comparing it to the repositories listed below.
- [ICLR 2026] FOCUS: Efficient Keyframe Selection for Long Video Understanding ☆72 · Apr 23, 2026 · Updated 2 months ago
- [ICLR 2025 Spotlight] Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model ☆16 · Apr 23, 2025 · Updated last year
- [CVPR 2026 Highlight] VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding ☆122 · Apr 17, 2026 · Updated 2 months ago
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" ☆50 · Oct 9, 2025 · Updated 8 months ago
- [Nature Computational Science] ⚡️ PSE/PSRN: Fast and efficient symbolic expression discovery through paralleliz… ☆22 · May 17, 2026 · Updated last month
- [Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling ☆120 · Jun 9, 2026 · Updated 2 weeks ago
- PromptRose 🌹 is your AI prompt companion, blooming at your fingertips. ☆22 · Sep 1, 2025 · Updated 9 months ago
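- Course materials from the UCAS (University of Chinese Academy of Sciences) Yanqi Lake campus for 2024~2025, including Reinforcement Learning, Intelligent Computing Systems, Pattern Recognition, Matrix Analysis and Applications, Principles and Algorithms of Artificial Intelligence, and Natural Language Processing ☆50 · Sep 22, 2025 · Updated 9 months ago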
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation ☆19 · Nov 28, 2022 · Updated 3 years ago
- [CVPR2022] Official Implementation of the paper 'Learning Where to Learn in Cross-View Self-Supervised Learning' ☆29 · Oct 12, 2022 · Updated 3 years ago
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆26 · Feb 2, 2025 · Updated last year
- [CVPR2025] "AniMo: Species-Aware Model for Text-Driven Animal Motion Generation" ☆47 · Oct 8, 2025 · Updated 8 months ago
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos ☆38 · May 27, 2025 · Updated last year
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ☆30 · Oct 19, 2025 · Updated 8 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ☆37 · Sep 10, 2025 · Updated 9 months ago
- Official code repository for the paper "A Large-scale AI-generated Image Inpainting Benchmark" ☆16 · Jan 13, 2026 · Updated 5 months ago
- [NeurIPS 2025] Deep Memory Backtracking for Long Video Understanding ☆68 · Feb 10, 2026 · Updated 4 months ago
- Code for "CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects", NeurIPS 2025 ☆90 · Mar 25, 2026 · Updated 3 months ago
- [CVPR'25] Attention IoU: Examining Biases in CelebA using Attention Maps ☆13 · Mar 26, 2025 · Updated last year
- Gifts for landscape photographers. Help the photographer seeking for meteors in the photo sequence. ☆13 · Jun 21, 2022 · Updated 4 years ago
- Official repository for "Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space" ☆18 · Jan 27, 2026 · Updated 5 months ago
- Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection ☆31 · Mar 19, 2025 · Updated last year
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding ☆23 · Mar 2, 2025 · Updated last year
- A collection of resources and papers on Large Language Models in autonomous driving ☆27 · Oct 30, 2023 · Updated 2 years ago
- PyTorch implementation of the TPAMI paper "HiGCIN: Hierarchical Graph-based Cross Inference Network for Group Activity Recognition" ☆18 · Dec 10, 2020 · Updated 5 years ago
- SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis ☆36 · Jun 13, 2025 · Updated last year
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆107 · Mar 15, 2026 · Updated 3 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning ☆38 · Jan 14, 2026 · Updated 5 months ago
- [ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence" ☆43 · Apr 19, 2026 · Updated 2 months ago
- Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights ☆32 · Jan 9, 2026 · Updated 5 months ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference ☆10 · Dec 15, 2024 · Updated last year