A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for streaming video.
☆158 · May 12, 2026 · Updated last week
Alternatives and similar repositories for Awesome-VLM-Streaming-Video
Users that are interested in Awesome-VLM-Streaming-Video are comparing it to the libraries listed below.
- [ICLR 2026] FOCUS: Efficient Keyframe Selection for Long Video Understanding ☆70 · Apr 23, 2026 · Updated 3 weeks ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆23 · Aug 1, 2025 · Updated 9 months ago
- [CVPR 2026 Highlight] VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding ☆118 · Apr 17, 2026 · Updated last month
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" ☆46 · Oct 9, 2025 · Updated 7 months ago
- [Nature Computational Science] ⚡ PSE/PSRN: Fast and efficient symbolic expression discovery through paralleliz… ☆22 · Feb 3, 2026 · Updated 3 months ago
- [Roadmap] Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling ☆104 · Updated this week
- Awesome papers for affective computing with llm and mllm ☆24 · Nov 26, 2025 · Updated 5 months ago
- [Nature Communications] LiveIdeaBench: Evaluating LLMs' Scientific Creativity and Idea Generation with Minimal C… ☆26 · Apr 21, 2026 · Updated 3 weeks ago
- ☆12 · Apr 29, 2024 · Updated 2 years ago
- Visual Speech Recognition ☆20 · Dec 24, 2024 · Updated last year
- ☆22 · May 10, 2026 · Updated last week
- LineArt, a framework that transfers complex appearance onto detailed design drawings, facilitating design and artistic creation. ☆15 · Oct 2, 2025 · Updated 7 months ago
- ☆18 · Apr 4, 2025 · Updated last year
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation ☆19 · Nov 28, 2022 · Updated 3 years ago
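- UCAS (University of Chinese Academy of Sciences) Yanqi Lake campus course materials for 2024~2025, covering reinforcement learning, intelligent computing systems, pattern recognition, matrix analysis and applications, principles and algorithms of artificial intelligence, and natural language processing ☆41 · Sep 22, 2025 · Updated 7 months ago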
- [AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos ☆36 · May 27, 2025 · Updated 11 months ago
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. ☆28 · Oct 19, 2025 · Updated 7 months ago
- [ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics ☆37 · Sep 10, 2025 · Updated 8 months ago
- ☆11 · Oct 4, 2023 · Updated 2 years ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs" ☆11 · Dec 30, 2024 · Updated last year
- Official code repository for the paper A Large-scale AI-generated Image Inpainting Benchmark ☆16 · Jan 13, 2026 · Updated 4 months ago
- Official training code for MUG-V 10B video generation model. Built on Megatron-LM (v0.14.0) with production-ready distributed training fo… ☆20 · Oct 20, 2025 · Updated 6 months ago
- [NeurIPS 2025] Deep Memory Backtracking for Long Video Understanding ☆67 · Feb 10, 2026 · Updated 3 months ago
- ☆54 · Apr 7, 2026 · Updated last month
- Code for "CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects", NeurIPS 2025 ☆89 · Mar 25, 2026 · Updated last month
- [CVPR'25] Attention IoU: Examining Biases in CelebA using Attention Maps ☆13 · Mar 26, 2025 · Updated last year
- code for downloading videos from HowTo100M dataset ☆18 · May 13, 2021 · Updated 5 years ago
- ☆20 · Jun 10, 2025 · Updated 11 months ago
- Gifts for landscape photographers. Help the photographer seeking for meteors in the photo sequence. ☆13 · Jun 21, 2022 · Updated 3 years ago
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆99 · Mar 15, 2026 · Updated 2 months ago
- ☆13 · Apr 23, 2025 · Updated last year
- [ACL 2026] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark ☆25 · Apr 13, 2026 · Updated last month
- Official repository for "Reasoning in the Dark: Interleaved Vision-Text Reasoning in Latent Space" ☆18 · Jan 27, 2026 · Updated 3 months ago
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding ☆23 · Mar 2, 2025 · Updated last year
- ☆35 · May 29, 2025 · Updated 11 months ago
- Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection ☆31 · Mar 19, 2025 · Updated last year
- A collection of resources and papers on Large Language Models in autonomous driving ☆27 · Oct 30, 2023 · Updated 2 years ago
- Pytorch implementation of the TPAMI paper of "HiGCIN: Hierarchical Graph-based Cross Inference Network for Group Activity Recognition" ☆18 · Dec 10, 2020 · Updated 5 years ago
- ☆70 · Sep 3, 2025 · Updated 8 months ago