google-deepmind / neptune

☆57

Alternatives and similar repositories for neptune

Users who are interested in neptune are comparing it to the libraries listed below.

orrzohar / Video-STaR
[ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
☆62Updated 10 months ago
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆61Updated 8 months ago
CeeZh / LLoVi
Official implementation for "A Simple LLM Framework for Long-Range Video Question-Answering"
☆96Updated 6 months ago
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆111Updated 2 months ago
kkahatapitiya / LangRepo
Language Repository for Long Video Understanding
☆31Updated 11 months ago
bigai-nlco / VideoLLaMB
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
☆67Updated 2 months ago
md-mohaiminul / BIMBA
☆10Updated last month
zeyofu / BLINK_Benchmark
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or…
☆124Updated 10 months ago
patrick-tssn / VideoHallucer
VideoHallucer, the first comprehensive benchmark for hallucination detection in large video-language models (LVLMs)
☆30Updated last month
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆111Updated last month
EvolvingLMMs-Lab / VideoMMMU
☆44Updated last month
AtsuMiyai / UPD
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models
☆77Updated this week
longvideobench / LongVideoBench
[NeurIPS '24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆97Updated 9 months ago
twelvelabs-io / video-embeddings-evaluation-framework
PyTorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings
☆25Updated 8 months ago
zyayoung / Awesome-Video-LLMs
Explore VLM-Eval, a framework for evaluating Video Large Language Models, enhancing your video analysis with cutting-edge AI technology.
☆34Updated last year
ruili33 / TPO
☆32Updated 3 months ago
facebookresearch / HierVL
[CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings
☆46Updated last year
chili-lab / SPORTU
[ICLR2025] SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models
☆14Updated 2 months ago
egoschema / EgoSchema
☆90Updated 4 months ago
OpenGVLab / TPO
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
☆50Updated 4 months ago
amazon-science / QA-ViT
☆65Updated 10 months ago
WHB139426 / Grounded-Video-LLM
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆107Updated last month
yellow-binary-tree / MMDuet
Official implementation of the paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆31Updated 3 months ago
Hon-Wong / VoRA
[Fully open] [Encoder-free MLLM] Vision as LoRA
☆161Updated last month
mlvlab / vid-TLDR
Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".
☆47Updated last year
McGill-NLP / AURORA
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
☆28Updated 4 months ago
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆24Updated last month
DCDmllm / Momentor
☆71Updated 5 months ago
Yui010206 / CREMA
[ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
☆45Updated 3 months ago
V-STaR-Bench / V-STaR
Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
☆22Updated last month