VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation
☆17Jun 2, 2025Updated 10 months ago
Alternatives and similar repositories for VCapsBench
Users that are interested in VCapsBench are comparing it to the libraries listed below.
- Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]"☆16Aug 27, 2025Updated 7 months ago
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework"☆19Jan 18, 2026Updated 2 months ago
- Video-Language Alignment via Spatio–Temporal Graph Transformer; ArXiv: https://arxiv.org/abs/2407.11677☆14Jul 24, 2024Updated last year
- [ICLR 2026] "VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?", Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, L…☆37Jan 30, 2026Updated 2 months ago
- [ICLR 2026] MotionSight's official code implementation.☆47Feb 13, 2026Updated last month
- [NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos☆51Feb 22, 2026Updated last month
- official implementation of MGA-CLAP (ACM MM 2024)☆30Oct 25, 2024Updated last year
- Graph Neural Networks Paper List of 2019 Conferences☆20Jul 25, 2019Updated 6 years ago
- TensorFlow implementation of Focal Loss for Dense Object Detection. https://arxiv.org/abs/1708.02002☆20Jun 13, 2019Updated 6 years ago
- Developer project for getting basic API integrations working in under 5 minutes☆11Jan 30, 2026Updated 2 months ago
- [CVPR 2025] GPS as a Control Signal for Image Generation☆25Mar 18, 2025Updated last year
- A Massive Multi-Discipline Lecture Understanding Benchmark☆34Nov 1, 2025Updated 5 months ago
- ECCV 2026 paper template☆41Jan 23, 2026Updated 2 months ago
- Explaining audio differences using language☆16Feb 11, 2025Updated last year
- ☆11Mar 23, 2026Updated 2 weeks ago
- Repository for "Training Audio Captioning Models without Audio"☆10Sep 26, 2023Updated 2 years ago
- Official repo for ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models☆28Mar 24, 2025Updated last year
- [IJCAI 2024] CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment☆18Jul 16, 2024Updated last year
- ☆10Sep 25, 2024Updated last year
- DreamGaussian with 2D-GS☆12Oct 10, 2024Updated last year
- [ECCV 2024] STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment☆41Dec 27, 2023Updated 2 years ago
- [ECCV 2024] Official code repository of paper titled "Efficient 3D-Aware Facial Image Editing Via Attribute-Specific Prompt Learning"☆10Aug 2, 2024Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows☆19Nov 4, 2025Updated 5 months ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.☆21Sep 24, 2025Updated 6 months ago
- ☆11Dec 28, 2023Updated 2 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆15Jun 23, 2024Updated last year
- [ICLR 2026] Many-for-Many: Unify the Training of Multiple Video and Image Generation and Manipulation Tasks☆30Feb 5, 2026Updated 2 months ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- This is a Uyghur language translator that supports speech-to-text in Uyghur language, machine translation to Uyghur language text, and te…☆14Nov 28, 2024Updated last year
- Different feature matching algorithms implemented in PyTorch!☆15Sep 11, 2019Updated 6 years ago
- TensorFlow implementation of OHEM loss, supporting the sigmoid or softmax entropy loss☆30Aug 23, 2019Updated 6 years ago
- [ACM MM 2023] PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation☆12Aug 28, 2023Updated 2 years ago
- Official code for CVPR2024 “VideoMAC: Video Masked Autoencoders Meet ConvNets”☆12Mar 4, 2024Updated 2 years ago
- [CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding☆49Mar 16, 2026Updated 3 weeks ago
- Deploy ChatGLM on Modelz☆16Mar 20, 2023Updated 3 years ago
- TensorFlow implementation of Global Reasoning unit (GloRe) from Graph-Based Global Reasoning Networks. GCN Network Block☆27Jun 15, 2019Updated 6 years ago
- ☆10Aug 13, 2021Updated 4 years ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning☆143Aug 21, 2025Updated 7 months ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral)☆47Apr 10, 2025Updated 11 months ago