Zplusdragon / VideoLucy

[NeurIPS2025] Deep Memory Backtracking for Long Video Understanding

☆24

Alternatives and similar repositories for VideoLucy

Users who are interested in VideoLucy are comparing it to the repositories listed below

guanw-pku / OED
Official implementation of paper "OED: Towards One-stage End-to-End Dynamic Scene Graph Generation".
☆23Updated last year
rkzheng99 / ViLLa
Video Reasoning Segmentation
☆27Updated 11 months ago
shilinyan99 / CrossLMM
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
☆24Updated 4 months ago
cilinyan / VISA
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆193Updated last year
showlab / VideoLISA
[NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos
☆139Updated 10 months ago
franciszzj / OpenPSG
[ECCV 2024] OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
☆49Updated 9 months ago
Lzq5 / UniTime
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
☆30Updated 2 weeks ago
cilinyan / ReVOS-api
[ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model
☆19Updated last year
zhousheng97 / EgoTextVQA
[CVPR'25] 🌟🌟 EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
☆39Updated 4 months ago
baoxiaoyi / CoReS
Code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"
☆18Updated 7 months ago
congvvc / InstructSeg
[ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"
☆48Updated 8 months ago
RobertLuo1 / NeurIPS2023_SOC
[NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation
☆33Updated last year
clownrat6 / OpenVIS
[AAAI 2025] Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use.
☆24Updated 10 months ago
Becomebright / ReKV
Official PyTorch Code of ReKV (ICLR'25)
☆62Updated 7 months ago
OpenGVLab / TimeSuite
[ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
☆46Updated 6 months ago
ncTimTang / AKS
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
☆120Updated 2 months ago
Show-han / Zeroshot_REC
Official code for Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions (CVPR 2024)
☆26Updated last year
zhang9302002 / ThinkingWithVideos
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆54Updated 2 weeks ago
hwjiang1510 / VQLoC
(NeurIPS 2023) Open-set visual object query search & localization in long-form videos
☆25Updated last year
mbzuai-oryx / VideoGLaMM
[CVPR 2025 🔥] A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
☆89Updated 6 months ago
minghangz / TFVTG
☆40Updated last year
Jayce1kk / SpaceVLLM
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability
☆14Updated 5 months ago
franciszzj / VLPrompt
[IJCV 2025] VLPrompt-PSG: Vision-Language Prompting for Panoptic Scene Graph Generation
☆27Updated last year
GLUS-video / GLUS
[CVPR 2025] Official PyTorch Implementation of GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmenta…
☆56Updated 4 months ago
Haochen-Wang409 / ross3d
[ICCV'25] Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
☆59Updated 3 months ago
Tanveer81 / ReVisionLLM
This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
☆32Updated 4 months ago
lovelyqian / ObjectRelator
Official repo for ICCV25 Highlight Paper: "ObjectRelator: Enabling Cross-View Object Relation Understanding in Ego-Centric and Exo-Centric…
☆51Updated 3 weeks ago
Visual-AI / PruneVid
[ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models
☆55Updated 5 months ago
hlchen23 / VERIFIED
Official repository of NeurIPS D&B Track 2024 paper "VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understan…
☆37Updated 9 months ago
yongliang-wu / NumPro
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
☆124Updated 3 weeks ago