GXYM/VCapsBench

VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation

☆20

Alternatives and similar repositories for VCapsBench

Users who are interested in VCapsBench are comparing it to the repositories listed below.

VidCapBench / VidCapBench
☆13 · May 17, 2025 · Updated last year
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
leoisufa / ICVE
[Preprint 2025] ICVE: In-Context Learning with Unpaired Clips for Instruction-based Video Editing
☆25 · Jun 2, 2026 · Updated last month
robo-alex / DreamDance
DreamDance: Personalized Text-to-video Generation by Combining Text-to-Image Synthesis and Motion Transfer
☆14 · Dec 16, 2022 · Updated 3 years ago
Espere-1119-Song / Video-MMLU
A Massive Multi-Discipline Lecture Understanding Benchmark
☆34 · Apr 20, 2026 · Updated 3 months ago
Jyxarthur / shot-by-shot
[ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H…
☆24 · May 16, 2026 · Updated 2 months ago
magcil / movie_shot_classification_dataset
A dataset with classified film shots
☆11 · Aug 8, 2022 · Updated 3 years ago
llyx97 / video_reason_bench
[ICLR 2026] "VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?", Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, L…
☆41 · Jan 30, 2026 · Updated 5 months ago
wenhaochai / aurora
[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
☆147 · Jun 4, 2025 · Updated last year
Video-MAC / VideoMAC
Official code for CVPR2024 “VideoMAC: Video Masked Autoencoders Meet ConvNets”
☆16 · May 12, 2026 · Updated 2 months ago
cg1177 / Recursive-Multimodal-Agent
☆19 · Jul 1, 2026 · Updated 2 weeks ago
TIGER-AI-Lab / VideoScore2
Automatic Metric for Evaluating Generated Videos
☆48 · Dec 8, 2025 · Updated 7 months ago
google-deepmind / wyd-benchmark
☆28 · Mar 3, 2025 · Updated last year
wenhaochai / STEVE
[ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment
☆41 · Dec 27, 2023 · Updated 2 years ago
showlab / videogui
[NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos
☆53 · Feb 22, 2026 · Updated 4 months ago
wenhaochai / claude-plugins
Personal Claude Code plugin marketplace
☆16 · Jul 4, 2026 · Updated 2 weeks ago
Ming-er / MGA-CLAP
Official implementation of MGA-CLAP (ACM MM 2024)
☆29 · Oct 25, 2024 · Updated last year
wenhaochai / PoseDA
[ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation
☆24 · Aug 26, 2023 · Updated 2 years ago
NJU-PCALab / MotionSight
[ICLR 2026] MotionSight's official code implementation.
☆48 · Apr 24, 2026 · Updated 2 months ago
limbo0000 / mtm
Official implementation of MTM
☆21 · Aug 30, 2023 · Updated 2 years ago
wangyuchi369 / RICO
Official implementation of the paper: [EMNLP 2025] RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruct…
☆21 · Dec 9, 2025 · Updated 7 months ago
TIGER-AI-Lab / VISTA
The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]
☆20 · Feb 27, 2025 · Updated last year
daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18 · Aug 27, 2025 · Updated 10 months ago
AVoCaDO-Captioner / AVoCaDO
https://avocado-captioner.github.io/
☆37 · Oct 16, 2025 · Updated 9 months ago
TerminologyHub / termhub-in-5-minutes
Developer project for getting basic API integrations working in under 5 minutes
☆11 · May 22, 2026 · Updated last month
zlab-princeton / UEval
UEval: A Benchmark for Unified Multimodal Generation
☆24 · Apr 20, 2026 · Updated 3 months ago
CeeZh / SILVR
Official Implementation for "SiLVR: A Simple Language-based Video Reasoning Framework"
☆19 · Jan 18, 2026 · Updated 6 months ago
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆15 · Oct 31, 2025 · Updated 8 months ago
adobe-research / vaw_dataset
This repository provides data for the VAW dataset as described in the CVPR 2021 paper titled "Learning to Predict Visual Attributes in th…
☆72 · Jul 22, 2022 · Updated 3 years ago
mutonix / Vript
☆160 · Jan 16, 2025 · Updated last year
hithqd / UniM-OV3D
☆21 · Apr 17, 2025 · Updated last year
VisionXLab / FIRM-Reward
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation
☆40 · Mar 13, 2026 · Updated 4 months ago
sail-sg / Video-Next-Event-Prediction
☆28 · Aug 9, 2025 · Updated 11 months ago
mira-space / MiraData
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
☆528 · Sep 2, 2024 · Updated last year
ZiyiZhang27 / sdpo
[IEEE TPAMI] Code for the paper "Aligning Few-Step Diffusion Models with Dense Reward Difference Learning"
☆22 · Feb 25, 2026 · Updated 4 months ago
Cooperx521 / ScaleCap
(ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'
☆60 · Jan 26, 2026 · Updated 5 months ago
ali-vilab / ChatDiT
☆53 · Dec 20, 2024 · Updated last year
silent-commit / CLEAR
CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal
☆20 · May 25, 2026 · Updated last month