Video Benchmark Suite: Rapid Evaluation of Video Foundation Models
☆15Jan 10, 2025Updated last year
Alternatives and similar repositories for Video_Benchmark_Suite
Users who are interested in Video_Benchmark_Suite are comparing it to the libraries listed below
- [ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]☆15Jul 15, 2025Updated 7 months ago
- [EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval"☆20Sep 12, 2025Updated 5 months ago
- ☆14Jan 26, 2025Updated last year
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆28Aug 15, 2025Updated 6 months ago
- The official repo for the DanQing dataset.☆29Jan 16, 2026Updated last month
- [ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"☆103Dec 8, 2025Updated 2 months ago
- ☆21Nov 17, 2025Updated 3 months ago
- Reproduction of the llama-omni training code☆74Jan 23, 2025Updated last year
- [CVPRW 2024] LaPA: Latent Prompt Assist Model For Medical Visual Question Answering☆25Apr 24, 2025Updated 10 months ago
- SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation (AAAI24)☆25Jul 2, 2024Updated last year
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆45Apr 3, 2025Updated 10 months ago
- Adaface with PartialFC☆28Apr 12, 2024Updated last year
- Margin-based Vision Transformer☆66Nov 28, 2025Updated 3 months ago
- Just prepare config file and start training your metric learning model with ease☆16Apr 2, 2024Updated last year
- [WACV 2024] Complex Organ Mask Guided Radiology Report Generation☆43Nov 10, 2025Updated 3 months ago
- Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding (CVPR 2025 Oral)☆36Nov 28, 2025Updated 3 months ago
- Paddle feature learning tutorial☆11Apr 24, 2019Updated 6 years ago
- Large-Scale Visual Representation Model☆704Dec 8, 2025Updated 2 months ago
- A personal self-driving car project for the purpose of hands-on with OpenCV and deep learning.☆10Aug 6, 2020Updated 5 years ago
- Experimental Docker file for building a replicable VisualSFM environment☆10Jan 16, 2015Updated 11 years ago
- unity sdk for rendering, tracking, input, interaction, mixed reality, platform services☆17Dec 18, 2025Updated 2 months ago
- UIE (Universal Information Extraction) inference with ncnn☆15Sep 22, 2024Updated last year
- Software for image-based building modeling.☆15Nov 26, 2014Updated 11 years ago
- Stoner Python module repository☆20Dec 21, 2025Updated 2 months ago
- ⚔️ [ICLR 2026] Official code of "Search Arena: Analyzing Search-Augmented LLMs".☆49Updated this week
- A C++ implementation of an affine-invariant feature sampling technique used in ASIFT☆11Aug 16, 2019Updated 6 years ago
- ☆12Aug 10, 2022Updated 3 years ago
- MTCNN face detector implemented with OpenCV dnn☆13Mar 31, 2019Updated 6 years ago
- Recording models☆12Sep 19, 2023Updated 2 years ago
- A simple PyTorch implementation of a diffusion model☆13Mar 20, 2023Updated 2 years ago
- RPC remote-call framework built on Netty☆12Jan 22, 2019Updated 7 years ago
- Tools for the Generation and Visualization of Large-scale Three-dimensional Reconstructions from Image Data.☆18May 8, 2018Updated 7 years ago
- Reproduce ResNet-v3(Aggregated Residual Transformations for Deep Neural Network)☆13Sep 6, 2017Updated 8 years ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆21Oct 8, 2024Updated last year
- ☆27Updated this week
- Includes drivers for Amscope camera and simple UI for time lapse imagery☆17Jun 27, 2022Updated 3 years ago
- A tiny project to test the effectiveness of video QA through RAG techniques and multimodal LLMs☆15Jun 2, 2024Updated last year
- [EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner☆153Dec 14, 2025Updated 2 months ago
- C++ version of yolov7-mask with ncnn☆59Aug 20, 2022Updated 3 years ago