ncTimTang/AKS

[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding

☆228

Alternatives and similar repositories for AKS

Users who are interested in AKS are comparing it to the libraries listed below.

sming256 / BOLT
[CVPR2025] BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding
☆55 · Feb 5, 2026 · Updated 5 months ago
NUS-HPC-AI-Lab / FOCUS
[ICLR 2026] FOCUS: Efficient Keyframe Selection for Long Video Understanding
☆74 · Apr 23, 2026 · Updated 2 months ago
qiujihao19 / LongVideo-R1
[CVPR 2026] LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
☆50 · Jul 7, 2026 · Updated 2 weeks ago
Jialuo-Li / DIG
[CVPR 2026] Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
☆21 · Feb 21, 2026 · Updated 5 months ago
hmxiong / StreamChat
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111 · Mar 14, 2025 · Updated last year
MAC-AutoML / WFS-SB
[CVPR 2026] Wavelet-based Frame Selection by Detecting Semantic Boundary for Long Video Understanding
☆31 · Apr 12, 2026 · Updated 3 months ago
yunzhuzhang0918 / flexselect
The official repository for paper "FlexSelect: Flexible Token Selection for Efficient Long Video Understanding".
☆31 · Sep 19, 2025 · Updated 10 months ago
mll-lab-nu / TStar
TStar is a unified temporal search framework for long-form video question answering
☆97 · Mar 23, 2026 · Updated 4 months ago
xiaomi-research / q-frame
[ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"
☆81 · Oct 25, 2025 · Updated 8 months ago
yaolinli / GenS
[ACL 2025 Findings] GenS: Generative Frame Sampler for Long Video Understanding
☆22 · Aug 21, 2025 · Updated 11 months ago
mingrui-wu / OSI-Bench
Official repo of From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs
☆24 · Jun 23, 2026 · Updated last month
fansunqi / AKeyS
Agentic Keyframe Search for Video Question Answering
☆18 · Jun 30, 2026 · Updated 3 weeks ago
wgcyeo / WorldMM
[CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
☆96 · Jun 18, 2026 · Updated last month
64327069 / LVAgent
Code of LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
☆39 · Nov 24, 2025 · Updated 7 months ago
ShareLab-SII / FluxMem
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
☆73 · Mar 16, 2026 · Updated 4 months ago
lcqysl / FrameThinker
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50 · Oct 9, 2025 · Updated 9 months ago
Hui-design / TSPO
[AAAI 2026] ✨ TSPO: Temporal Sampling Policy Optimization for Long-form Video Language Understanding
☆131 · Nov 12, 2025 · Updated 8 months ago
NVlabs / VideoITG
[CVPR 2026 Highlight] VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding
☆126 · Apr 17, 2026 · Updated 3 months ago
MCG-NJU / StreamForest
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆132 · Nov 4, 2025 · Updated 8 months ago
EliSpectre / MM-Mem
[ACL-26 (main)] From Verbatim to Gist Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video A…
☆39 · Apr 19, 2026 · Updated 3 months ago
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆38 · May 27, 2025 · Updated last year
qiujihao19 / Artemis
[NeurIPS 2024] Artemis: Towards Referential Understanding in Complex Videos
☆27 · Apr 8, 2025 · Updated last year
Fanziyang-v / FlashVID
[ICLR 2026 Oral] FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
☆112 · Apr 30, 2026 · Updated 2 months ago
jylins / videoseek
[CVPR 2026] VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
☆64 · Mar 23, 2026 · Updated 4 months ago
cokeshao / HoliTom
[NeurIPS 2025] HoliTom: Holistic Token Merging for Fast Video Large Language Models
☆84 · Oct 10, 2025 · Updated 9 months ago
steven-ccq / ViLAMP
[ICML 2025] Official repository for paper "Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation"
☆194 · Sep 23, 2025 · Updated 10 months ago
lwpyh / CoS_codes
CoS: Chain-of-Shot Prompting for Long Video Understanding
☆53 · Feb 13, 2025 · Updated last year
VectorSpaceLab / Video-XL
🔥🔥First-ever hour scale video understanding models
☆626 · Jul 14, 2025 · Updated last year
LunarShen / FastVID
[NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models
☆37 · Nov 10, 2025 · Updated 8 months ago
KD-TAO / OmniZip
[CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
☆99 · Apr 20, 2026 · Updated 3 months ago
yunlong10 / Awesome-Video-LMM-Post-Training
🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training
☆296 · Mar 3, 2026 · Updated 4 months ago
Leon1207 / Video-RAG-master
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi…
☆446 · Jun 26, 2026 · Updated 3 weeks ago
marinero4972 / Open-o3-Video
[ICML 2026] Official implementation of "Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence"
☆157 · May 1, 2026 · Updated 2 months ago
tulerfeng / Video-R1
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
☆881 · Dec 14, 2025 · Updated 7 months ago
lern-to-write / STC
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆70 · Jun 8, 2026 · Updated last month
FeiElysia / Tempo
Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding (ECCV 2026)
☆76 · Jun 29, 2026 · Updated 3 weeks ago
RUC-NLPIR / VideoDeepResearch
☆155 · Nov 17, 2025 · Updated 8 months ago
HYUNJS / STTM
[ICCV 2025] Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
☆61 · Feb 2, 2026 · Updated 5 months ago
mit-han-lab / streaming-vlm
StreamingVLM: Real-Time Understanding for Infinite Video Streams
☆1,046 · Oct 15, 2025 · Updated 9 months ago