What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
☆26May 16, 2025Updated 10 months ago
Alternatives and similar repositories for CAPability
Users that are interested in CAPability are comparing it to the libraries listed below
- ☆21Jun 15, 2023Updated 2 years ago
- [IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition☆10Aug 10, 2025Updated 7 months ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆32Mar 29, 2024Updated last year
- An Arena-style Automated Evaluation Benchmark for Detailed Captioning☆58Jun 1, 2025Updated 9 months ago
- The official code of Linguistic More: Taking a Further Step toward Efficient and Accurate Scene Text Recognition (IJCAI2023)☆27Sep 3, 2023Updated 2 years ago
- CaDiCaL + neural glue variable predictions☆10Oct 21, 2020Updated 5 years ago
- Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval --ICCV2023 Oral☆91Nov 2, 2023Updated 2 years ago
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆12Mar 6, 2025Updated last year
- LLaVA-Next for STVG☆18Dec 5, 2025Updated 3 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 11 months ago
- Paper Reading of IMCC groups.☆17Oct 22, 2025Updated 5 months ago
- SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability☆16May 8, 2025Updated 10 months ago
- ☆14Feb 3, 2018Updated 8 years ago
- Official implementation of EgoThinker at NeurIPS 2025☆25Nov 25, 2025Updated 3 months ago
- Official implementation of "Graph Signal Diffusion Model for Collaborative Filtering" (SIGIR 2024)☆17May 31, 2024Updated last year
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning"☆35Feb 2, 2024Updated 2 years ago
- Luogu API documentation☆14Nov 15, 2025Updated 4 months ago
- An official implementation for MS-DETR in ACL'23☆17Jun 3, 2023Updated 2 years ago
- ICLR 2022 (Spotlight): Continual Learning With Filter Atom Swapping☆16Jul 5, 2023Updated 2 years ago
- [NeurIPS25] Official Implementation (PyTorch) of "DeepVideo-R1"☆31Feb 22, 2026Updated last month
- This is an official implementation of our NeurIPS 2022 paper "Bridging the Gap Between Vision Transformers and Convolutional Neural Netwo…☆63Aug 20, 2025Updated 7 months ago
- MomentDiff: Generative Video Moment Retrieval from Random to Real--NeurIPS 2023☆80Nov 2, 2023Updated 2 years ago
- Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"☆127Oct 2, 2025Updated 5 months ago
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆22Jul 3, 2024Updated last year
- ☆14Oct 14, 2019Updated 6 years ago
- Functional Regularisation for Continual Learning with Gaussian Processes☆15Oct 24, 2020Updated 5 years ago
- ☆48Feb 7, 2025Updated last year
- A Formal Verification Framework for Chisel☆19Apr 9, 2024Updated last year
- [ICCV 2025] Boosting MLLM Reasoning with Text-Debiased Hint-GRPO☆47Jul 1, 2025Updated 8 months ago
- ☆20Dec 8, 2024Updated last year
- ☆13Jun 26, 2023Updated 2 years ago
- Implementation of the paper "Meta-Learning by Adjusting Priors Based on Extended PAC-Bayes Theory", Ron Amit and Ron Meir, ICML 2018☆22Oct 30, 2019Updated 6 years ago
- ☆19Jun 10, 2025Updated 9 months ago
- A simple script to create a virtual camera and route deepfakelive's output stream to it using Python and OpenCV☆17Jan 2, 2023Updated 3 years ago
- ☆11Nov 12, 2018Updated 7 years ago
- [NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…☆269Nov 5, 2025Updated 4 months ago
- A PasteBin based on Python Flask and running on Docker☆18Aug 8, 2022Updated 3 years ago
- The tampered text detection dataset☆22Aug 23, 2023Updated 2 years ago
- stable-diffusion-webui extension that bypasses to lsmith☆12Apr 28, 2023Updated 2 years ago