[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆67Jun 4, 2026Updated this week
Alternatives and similar repositories for STC
Users that are interested in STC are comparing it to the libraries listed below.
- [ICCV 2025] Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs☆60Feb 2, 2026Updated 4 months ago
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆91Dec 24, 2025Updated 5 months ago
- [EMNLP 2025 Main] Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models☆124May 14, 2026Updated 3 weeks ago
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation☆37Jun 12, 2025Updated 11 months ago
- ☆14Jun 16, 2023Updated 2 years ago
- Unofficial Scalable-Softmax Is Superior for Attention☆20May 30, 2025Updated last year
- 🔥🔥🔥 [Awesome] Latest Papers, Codes & Datasets on Streaming / Online Video Understanding — Building Always-on, Real-time Video AI 🤖☆278Jun 2, 2026Updated last week
- Source code of the paper: Overlapped Trajectory-Enhanced Visual Tracking☆11Sep 3, 2024Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆18Apr 2, 2025Updated last year
- Split a PDF into color and black-and-white parts for easier printing☆11Mar 9, 2025Updated last year
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…☆10Feb 9, 2025Updated last year
- ☆45Jan 1, 2026Updated 5 months ago
- [CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models☆87Apr 20, 2026Updated last month
- ☆171Apr 27, 2026Updated last month
- A4-Agent: An Agentic Framework for Zero-Shot Affordance Reasoning☆41Mar 12, 2026Updated 2 months ago
- [AAAI2025] Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark☆30Apr 4, 2026Updated 2 months ago
- Robust Tracking via Mamba-based Context-aware Token Learning (AAAI 2025)☆16Nov 6, 2025Updated 7 months ago
- ☆12Feb 13, 2025Updated last year
- [ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models☆71May 15, 2025Updated last year
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆35Jan 14, 2026Updated 4 months ago
- YAICON 3rd project page - 4D Gaussian for Head Reconstruction☆11Dec 22, 2023Updated 2 years ago
- ☆13Jan 7, 2025Updated last year
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction☆178Mar 23, 2025Updated last year
- Survey of small language models☆18Jul 23, 2024Updated last year
- [NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models☆36Nov 10, 2025Updated 6 months ago
- PyTorch code for the paper "Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization"☆22Jan 7, 2023Updated 3 years ago
- Official PyTorch Implementation of "Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching"☆32Mar 1, 2026Updated 3 months ago
- An unofficial implementation using Pytorch for "Anomaly Clustering: Grouping Images into Coherent Clusters of Anomaly Types". Improve the…☆18Nov 17, 2023Updated 2 years ago
- A token pruning method that accelerates ViTs for various tasks while maintaining high performance.☆28Jul 21, 2025Updated 10 months ago
- Code of paper 'Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training'☆22Jun 10, 2025Updated 11 months ago
- ☆17Apr 15, 2025Updated last year
- [ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers☆20Apr 16, 2024Updated 2 years ago
- Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]☆86May 8, 2026Updated last month
- [ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval☆119Nov 4, 2025Updated 7 months ago
- [ICLR 2026] Official code repository for "⚡️VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration"☆49Feb 24, 2026Updated 3 months ago
- Source code of the paper "Attention as Relation: Learning Supervised Multi-head Self-Attention for Relation Extraction, IJCAI 2020."☆18Nov 16, 2020Updated 5 years ago
- [ICCV 2025] This repo is the official implementation of "Music Grounding by Short Video"☆27Sep 9, 2025Updated 9 months ago
- [ICME 2025 Oral] The official implementation of TSTMotion in pytorch☆19May 31, 2025Updated last year
- FreeVA: Offline MLLM as Training-Free Video Assistant☆69Jun 9, 2024Updated 2 years ago