fansunqi/AKeyS

Agentic Keyframe Search for Video Question Answering

☆18

Alternatives and similar repositories for AKeyS

Users interested in AKeyS are comparing it to the repositories listed below.

mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
daniel-cores / tvbench
TVBench: Redesigning Video-Language Evaluation
☆15 · Jun 9, 2025 · Updated last year
Upper9527 / DrVideo
Code of "DrVideo: Document Retrieval Based Long Video Understanding"
☆98 · Aug 11, 2025 · Updated 11 months ago
agentic-learning-ai-lab / lifelong-memory
Code for LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
☆33 · Oct 27, 2025 · Updated 8 months ago
gls0425 / LinVT
LinVT: Empower Your Image-level Large Language Model to Understand Videos
☆84 · Dec 30, 2024 · Updated last year
WissingChen / CRA-GQA
The official implementation of "Cross-modal Causal Relation Alignment for Video Question Grounding" (CVPR 2025 Highlight)
☆52 · Apr 27, 2025 · Updated last year
aimagelab / PMA-Net
[ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.
☆19 · Jun 7, 2024 · Updated 2 years ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
jongwoopark7978 / LVNet
[Main Conference @ EACL'26] [Workshop @ NeurIPS'24] 🎞️ LVNet.
☆44 · Feb 10, 2026 · Updated 5 months ago
wxh1996 / VideoAgent
☆150 · Apr 16, 2025 · Updated last year
d223302 / A-Closer-Look-To-LLM-Evaluation
Code for EMNLP 2023 findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation"
☆19 · Oct 9, 2023 · Updated 2 years ago
xiaoqian-shen / Vgent
[NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent
☆48 · Nov 30, 2025 · Updated 7 months ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
fansunqi / VideoTool
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23 · May 18, 2026 · Updated 2 months ago
Lkydong2020 / SEAL_WTAL
[AAAI-25] Code for SEAL
☆15 · Sep 25, 2025 · Updated 9 months ago
JasonCodeMaker / CTVR
☆16 · Jun 2, 2025 · Updated last year
FishAndWasabi / Real-LOD
Official implementation of "Re-Aligning Language to Visual Objects with an Agentic Workflow"
☆34 · Apr 20, 2025 · Updated last year
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆165 · Jun 23, 2025 · Updated last year
Haiyang0226 / Symphony
Code for the CVPR'26 paper Symphony
☆17 · Apr 7, 2026 · Updated 3 months ago
mlvlab / DeepVideoR1
[NeurIPS25] Official Implementation (PyTorch) of "DeepVideo-R1"
☆36 · Feb 22, 2026 · Updated 4 months ago
snumprlab / isr-dpo
Official Implementation of ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO (AAAI'25)
☆23 · Nov 25, 2025 · Updated 7 months ago
NJU-LINK / IF-VidCap
The Source Code for IF-VidCap @ICLR 2026
☆19 · Oct 22, 2025 · Updated 9 months ago
yahooo-m / VOS-Solution
ECCV 2024 STMA & CVPR 2024 1st MOSE & 1st VOT Challenge & 1st LSVOS v6
☆12 · Oct 16, 2024 · Updated last year
sejong-rcv / PVLR
[ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
☆13 · Oct 8, 2024 · Updated last year
joslefaure / HERMES
[ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics
☆37 · Sep 10, 2025 · Updated 10 months ago
assembly-101 / assembly101-action-recognition
Code and models for the Action Recognition benchmark of Assembly101
☆14 · Mar 26, 2023 · Updated 3 years ago
wangsen99 / LMEE
(CVPR 26) Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration
☆35 · Mar 8, 2026 · Updated 4 months ago
KD-TAO / OmniAgent
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
☆22 · Apr 9, 2026 · Updated 3 months ago
TIGER-AI-Lab / QuickVideo
Quick Long Video Understanding [TMLR2025]
☆79 · Oct 27, 2025 · Updated 8 months ago
ncTimTang / AKS
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆228 · Dec 19, 2025 · Updated 7 months ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
zuucan / NeedleInAHaystack-PLUS
To assess the longtext capabilities more comprehensively, we propose Needle-in-a-Haystack PLUS, which shifts the focus from simple fact r…
☆13 · Mar 4, 2024 · Updated 2 years ago
kkahatapitiya / LangRepo
Code for our ACL 2025 paper "Language Repository for Long Video Understanding"
☆36 · Jun 17, 2024 · Updated 2 years ago
aurooj / SHG-VQA
Learning Situation Hyper-Graphs for Video Question Answering
☆23 · Feb 16, 2024 · Updated 2 years ago
64327069 / LVAgent
Code of LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
☆39 · Nov 24, 2025 · Updated 7 months ago
dengandong / GroundMoRe
☆18 · May 18, 2026 · Updated 2 months ago
boreng0817 / IFCap
[EMNLP 2024] IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning
☆15 · May 13, 2025 · Updated last year
AdaCheng / VidEgoThink
The official code and data for paper "VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI"
☆18 · Mar 25, 2025 · Updated last year
qijimrc / ROBUST
☆13 · Oct 19, 2023 · Updated 2 years ago