PolyU-ChenLab/ETBench

👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)

☆74

Alternatives and similar repositories for ETBench

Users who are interested in ETBench are comparing it to the repositories listed below.

hshjerry / VideoEspresso
[CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
☆140 · Jul 28, 2025 · Updated 11 months ago
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆46 · Mar 11, 2025 · Updated last year
xing0047 / cca-llava
[NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention
☆67 · Aug 30, 2025 · Updated 10 months ago
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated last year
DCDmllm / Momentor
☆81 · Nov 24, 2024 · Updated last year
facebookresearch / stepdiff
Data release for Step Differences in Instructional Video (CVPR24)
☆15 · Jun 19, 2024 · Updated 2 years ago
huangb23 / VTimeLLM
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
☆295 · Jun 13, 2024 · Updated 2 years ago
xing0047 / rewrite
[NeurIPS 2023] Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
☆21 · Jan 3, 2024 · Updated 2 years ago
Vinoground / Vinoground
☆13 · Apr 13, 2026 · Updated 3 months ago
lscpku / VITATECS
☆18 · Jul 10, 2024 · Updated 2 years ago
yingsen1 / UniMD
UniMD: Towards Unifying Moment retrieval and temporal action Detection
☆57 · Jul 5, 2024 · Updated 2 years ago
niejiahao1998 / MMRel
☆31 · Nov 17, 2024 · Updated last year
TimeMarker-LLM / TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆107 · Nov 28, 2024 · Updated last year
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆133 · Apr 4, 2025 · Updated last year
appletea233 / LLaVA-ST
[CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
☆84 · Jul 4, 2025 · Updated last year
TencentARC / TimeLens
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆158 · Apr 27, 2026 · Updated 2 months ago
yeliudev / VideoMind
🧠 VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning (ICLR 2026)
☆346 · Feb 8, 2026 · Updated 5 months ago
yellow-binary-tree / MMDuet
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆44 · Feb 5, 2025 · Updated last year
ReXTime / ReXTime
☆18 · Jan 26, 2026 · Updated 5 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 4 months ago
sudo-Boris / mr-Blip
Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs"
☆95 · Mar 9, 2025 · Updated last year
weihao1115 / MMLU-ProX
[EMNLP 2025 Main] The official repo of MMLU-ProX benchmark.
☆29 · Aug 26, 2025 · Updated 10 months ago
solicucu / D3G
☆15 · Oct 30, 2023 · Updated 2 years ago
WHB139426 / Grounded-Video-LLM
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆148 · Aug 21, 2025 · Updated 11 months ago
lntzm / MESM
The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)
☆32 · Mar 29, 2024 · Updated 2 years ago
mbzuai-oryx / VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
☆293 · Aug 5, 2025 · Updated 11 months ago
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆63 · Sep 13, 2024 · Updated last year
Hon-Wong / ByteVideoLLM
[ICCV 2025] Dynamic-VLM
☆28 · Dec 16, 2024 · Updated last year
Andy-Cheng / TEMPURA
TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u…
☆27 · Jun 4, 2025 · Updated last year
longvideobench / LongVideoBench
[Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆133 · Jul 27, 2024 · Updated last year
gyxxyg / VTG-LLM
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
☆130 · Dec 10, 2024 · Updated last year
Vision-CAIR / LongVU
[ICML 2025] Official PyTorch implementation of LongVU
☆429 · May 8, 2025 · Updated last year
bytedance / F-16
F-16 is a powerful video large language model (LLM) that perceives high-frame-rate videos, which is developed by the Department of Electr…
☆40 · Jul 3, 2025 · Updated last year
yytzsy / GTP
Code for the paper: "Sentence Specified Dynamic Video Thumbnail Generation"
☆33 · Aug 8, 2019 · Updated 6 years ago
mu-cai / TemporalBench
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
☆40 · Nov 10, 2024 · Updated last year
ytaek-oh / vl_compo
☆10 · Jul 5, 2024 · Updated 2 years ago
Wild-Cooperation-Hub / Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
☆76 · Mar 18, 2025 · Updated last year
www-Ye / Time-R1
R1-like Video-LLM for Temporal Grounding
☆138 · Jun 20, 2025 · Updated last year
SHI-Labs / Slow-Fast-Video-Multimodal-LLM
☆29 · Apr 8, 2025 · Updated last year