[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆127 Nov 4, 2025 Updated 7 months ago
Alternatives and similar repositories for StreamForest
Users that are interested in StreamForest are comparing it to the libraries listed below.
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant ☆29 Dec 2, 2025 Updated 6 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning ☆38 Jan 14, 2026 Updated 5 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆18 Apr 2, 2025 Updated last year
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos ☆11 Jun 11, 2024 Updated 2 years ago
- [Preprint] Self-Adversarial One Step Generation via Condition Shifting ☆55 Apr 15, 2026 Updated 2 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆23 Apr 10, 2026 Updated 2 months ago
- [CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression ☆68 Jun 8, 2026 Updated 2 weeks ago
- Residual vector quantization for KV cache compression in large language model ☆12 Oct 22, 2024 Updated last year
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision ☆12 Sep 17, 2023 Updated 2 years ago
- World Modeling by Forecasting Vision Foundation Model Features ☆44 Mar 20, 2026 Updated 3 months ago
- Open-Vocabulary Panoptic Segmentation ☆27 Jun 15, 2025 Updated last year
- TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models ☆17 Jan 2, 2025 Updated last year
- Code of the Grounded MUIE model, REAMO ☆10 Dec 3, 2024 Updated last year
- [CVPR 2026] Official Implementation of "Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models". ☆24 Jun 1, 2026 Updated 3 weeks ago
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online ☆96 Oct 7, 2025 Updated 8 months ago
- Code Repository for Research Article Titled - "Omnidirectional Video Super-Resolution using Deep Learning" ☆13 Apr 16, 2023 Updated 3 years ago
- Joint A-SNN: Joint Training of Artificial and Spiking Neural Networks via Self-Distillation and Weight Factorization ☆11 Aug 1, 2023 Updated 2 years ago
- [ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval ☆120 Nov 4, 2025 Updated 7 months ago
- Real Spike: Learning Real-valued Spikes for Spiking Neural Networks ☆11 Jul 12, 2022 Updated 3 years ago
- [ACL 2025] The official pytorch implement of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection". ☆26 May 26, 2025 Updated last year
- ☆37 Feb 7, 2026 Updated 4 months ago
- ☆17 Dec 13, 2023 Updated 2 years ago
- [NeurIPS 2025] Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior ☆79 Feb 20, 2026 Updated 4 months ago
- 🔥🔥🔥 [Awesome] Latest Papers, Codes & Datasets on Streaming / Online Video Understanding — Building Always-on, Real-time Video AI 🤖 ☆319 Updated this week
- [ICLR 2025] Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate ☆24 Apr 22, 2025 Updated last year
- [NeurIPS'25] ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding ☆53 Sep 21, 2025 Updated 9 months ago
- [Neurocomputing] EmoVerse: Enhancing Multimodal Large Language Models for Affective Computing via Multitask Learning ☆19 Jul 6, 2025 Updated 11 months ago
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning ☆57 Apr 1, 2025 Updated last year
- ☆18 Dec 2, 2024 Updated last year
- [ECCV 2024] RGBD GS-ICP SLAM ☆14 Nov 5, 2024 Updated last year
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023) ☆22 Mar 23, 2026 Updated 3 months ago
- This repository presents the source code for the paper "MILLION: Mastering Long-Context LLM Inference Via Outlier-Immunized KV Product Qu… ☆25 Apr 2, 2025 Updated last year
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams ☆107 Mar 15, 2026 Updated 3 months ago
- [CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models ☆21 Apr 30, 2025 Updated last year
- ☆18 Nov 30, 2025 Updated 6 months ago
- [ICCV 2025 Highlight] "Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis" ☆28 May 31, 2026 Updated 3 weeks ago
- Official Repo for PosSAM: Panoptic Open-vocabulary Segment Anything ☆71 Apr 7, 2024 Updated 2 years ago
- ☆19 Oct 22, 2023 Updated 2 years ago
- RLHF for Video Diffusion Models ☆26 Jul 30, 2025 Updated 10 months ago