MCG-NJU/CaReBench

A Fine-grained Benchmark for Video Captioning and Retrieval

☆30

Alternatives and similar repositories for CaReBench

Users who are interested in CaReBench are comparing it to the repositories listed below.

MCG-NJU / RGE
Reasoning Guided Embeddings: Leveraging MLLM Reasoning for Improved Multimodal Retrieval
☆15 · Nov 29, 2025 · Updated 7 months ago
HELLORPG / CV-Framework
A simple computer vision framework, mainly based on PyTorch, including distributed training, logging, and so on.
☆12 · Dec 2, 2023 · Updated 2 years ago
MCG-NJU / VideoEval
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model
☆15 · Jul 31, 2025 · Updated 11 months ago
MCG-NJU / FreeRet
[ICML2026] FreeRet: MLLMs as Training-Free Retrievers
☆22 · May 25, 2026 · Updated 2 months ago
MCG-NJU / NeuralSolver
[ICML 2025] Differentiable Solver Search for Fast Diffusion Sampling
☆21 · Jul 7, 2025 · Updated last year
MCG-NJU / p-MoD
[ICCV 2025] p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay
☆44 · Jun 26, 2025 · Updated last year
geekifan / zero-academic-page
Zero Academic Homepage is a clean, modern and responsive theme for academic personal websites.
☆40 · Jun 6, 2025 · Updated last year
MCG-NJU / TimeLens2
TimeLens2: Generalist Video Temporal Grounding with Multimodal LLMs
☆44 · Updated this week
MCG-NJU / SPLAM
[ECCV 2024 Oral] SPLAM: Accelerating Image Generation with Sub-path Linear Approximation Model
☆24 · Nov 1, 2024 · Updated last year
MCG-NJU / APP-Net
[TIP] APP-Net: Auxiliary-point-based Push and Pull Operations for Efficient Point Cloud Recognition
☆13 · May 15, 2023 · Updated 3 years ago
Video-R1 / Awesome-Multimodal-Reasoning
Collections of Papers and Projects for Multimodal Reasoning.
☆108 · Apr 25, 2025 · Updated last year
FAVOR-Bench / FAVOR-Bench
Accepted By The 39th Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track
☆25 · Nov 17, 2025 · Updated 8 months ago
TencentARC / TimeLens
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆162 · Updated this week
MCG-NJU / ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
☆23 · Jul 29, 2024 · Updated last year
TechNomad-ds / LoVR-benchmark
[WWW'2026 Oral] LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts
☆16 · May 19, 2026 · Updated 2 months ago
x4Cx58x54 / vistal
A visualization tool for temporal action localization (detection/segmentation).
☆13 · Mar 30, 2023 · Updated 3 years ago
MCG-NJU / AMD
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
☆18 · Jan 11, 2026 · Updated 6 months ago
mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
leexinhao / ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
☆20 · Jul 29, 2024 · Updated last year
OpenGVLab / VideoChat-R1
[NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning
☆268 · Oct 18, 2025 · Updated 9 months ago
shuheikurita / RefEgo
☆13 · Jul 20, 2024 · Updated 2 years ago
XiaobuLv0626 / NJU-AMLReviewNote
Nanjing University Advanced Machine Learning Review
☆31 · Jun 11, 2025 · Updated last year
MCG-NJU / JoMoLD
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
☆27 · Jul 15, 2022 · Updated 4 years ago
MCG-NJU / VideoChat-Online
[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online
☆97 · Oct 7, 2025 · Updated 9 months ago
SaraGhazanfari / CoF
Chain-of-Frames [CVPR 2026]
☆40 · Jul 2, 2025 · Updated last year
MCG-NJU / MeMOTR
[ICCV 2023] MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking
☆237 · Oct 15, 2025 · Updated 9 months ago
OpenGVLab / TimeSuite
[ICLR 2025] TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
☆74 · Apr 7, 2025 · Updated last year
dawnyc / ROMTrack
[ICCV 2023] Robust Object Modeling for Visual Tracking, Official Implementation
☆48 · Jan 5, 2025 · Updated last year
mlvlab / DeepVideoR1
[NeurIPS25] Official Implementation (Pytorch) of "DeepVideo-R1"
☆36 · Feb 22, 2026 · Updated 5 months ago
zihuixue / ProgCaptioner
Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)
☆26 · Jul 16, 2025 · Updated last year
fmthoker / SEVERE-BENCHMARK
☆26 · Aug 31, 2023 · Updated 2 years ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
linhuixiao / HiVG
[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.
☆65 · Nov 10, 2025 · Updated 8 months ago
TencentARC / ARC-Chapter
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
☆44 · Nov 19, 2025 · Updated 8 months ago
wlin-at / MAXI
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
☆31 · Sep 5, 2023 · Updated 2 years ago
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated last year
XuYi-fei / Coder-s-platform
编程汇 (Coding Hub), a technology-sharing platform
☆14 · Mar 7, 2026 · Updated 4 months ago
MCG-NJU / PDPP
[CVPR 2023 Highlight] PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
☆34 · Aug 30, 2023 · Updated 2 years ago
MCG-NJU / HATReID-MOT
[ECCV 2026] History-Aware Transformation of ReID Features for Multiple Object Tracking
☆36 · Updated this week