anxiangsir/Video_Benchmark_Suite

Video Benchmark Suite: Rapid Evaluation of Video Foundation Models

☆17

Alternatives and similar repositories for Video_Benchmark_Suite

Users who are interested in Video_Benchmark_Suite are comparing it to the repositories listed below.

anxiangsir / V-SWIFT
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
☆30 · Feb 5, 2025 · Updated last year
xiaoxing2001 / DeGLA
[ACM MM25] Official PyTorch implementation of "Decoupled Global-Local Alignment for Improving Compositional Understanding"
☆16 · Jul 15, 2025 · Updated last year
deepglint / Victor
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
☆29 · Aug 15, 2025 · Updated 11 months ago
deepglint / DanQing
The official repo for the DanQing dataset.
☆36 · Mar 25, 2026 · Updated 3 months ago
Multimodal-Representation-Learning-MRL / GA-DMS
[EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval"
☆25 · Mar 30, 2026 · Updated 3 months ago
deepglint / UniME
[ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"
☆105 · Dec 8, 2025 · Updated 7 months ago
nttstar / inswapper-512-live
☆14 · Jan 26, 2025 · Updated last year
wntg / LLaMA-Omni
Reproduction of the LLaMA-Omni training code
☆72 · Jan 23, 2025 · Updated last year
deepglint / RealSyn
[ACM MM2025] The official repository for the RealSyn dataset
☆39 · Dec 14, 2025 · Updated 7 months ago
deepglint / UniDoc-RL
UniDoc-RL: Unified Document Understanding with Reinforcement Learning
☆16 · May 21, 2026 · Updated 2 months ago
mvp-ai-lab / mvp-engine
MVP Engine - The Next-Generation Framework for Multimodal Model Training with Agents
☆24 · Updated this week
Barrett-python / SFC
SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation (AAAI24)
☆25 · Jul 2, 2024 · Updated 2 years ago
deepglint / unicom
Large-Scale Visual Representation Model
☆702 · Dec 8, 2025 · Updated 7 months ago
chenpk00 / IS2024_stream_decoder_only_asr
☆16 · Mar 12, 2024 · Updated 2 years ago
VisionXLab / ProCLIP
Official PyTorch implementation of ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder
☆25 · Dec 4, 2025 · Updated 7 months ago
logan-0623 / PG-SAM
Efficient Semantic Fine-grained Prior Generation and Refinement Decoder Based on SAM for Improved Multi-organ Segmentation
☆22 · Mar 26, 2025 · Updated last year
mk-minchul / sapiensid
☆26 · Nov 17, 2025 · Updated 8 months ago
NNDam / adaface-partialfc
Adaface with PartialFC
☆27 · Apr 12, 2024 · Updated 2 years ago
haochen-MBZUAI / MSWAL-
MSWAL
☆15 · Nov 7, 2025 · Updated 8 months ago
OpenGVLab / Docopilot
[CVPR 2025] Docopilot: Improving Multimodal Models for Document-Level Understanding
☆37 · Jul 22, 2025 · Updated 11 months ago
wonjune-kang / llm-speech-summarization
Prompting Large Language Models with Audio for General-Purpose Speech Summarization
☆20 · May 14, 2025 · Updated last year
leoluopy / pytorch_arcface_cosface_partialFC
Chinese-annotated version of the PyTorch ArcFace, CosFace, PartialFC, and mixed-precision training code from insightface, intended to help readers understand it quickly and apply it to their own projects
☆38 · Apr 21, 2021 · Updated 5 years ago
EdVince / model_zoo
Recording models
☆12 · Sep 19, 2023 · Updated 2 years ago
FeiGeChuanShu / yolov7-mask-ncnn
C++ version of yolov7-mask with ncnn
☆57 · Aug 20, 2022 · Updated 3 years ago
RocketFlash / easy_metric_learning
Just prepare a config file and start training your metric learning model with ease
☆16 · May 20, 2026 · Updated 2 months ago
ShareLab-SII / CoMP-MM
Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"
☆48 · Apr 3, 2025 · Updated last year
NVlabs / AL-SSL
☆18 · Mar 19, 2023 · Updated 3 years ago
Oneflow-Inc / oneflow_face
☆12 · Aug 10, 2022 · Updated 3 years ago
stillbreeze / 3D-reconstruction-Using-SfM-and-Stereo-Matching
SfMEdu System from Princeton for Dense 3D Reconstruction
☆11 · Dec 11, 2019 · Updated 6 years ago
strands-project / v4r_ros_wrappers
ROS wrappers for the V4R library
☆10 · Oct 3, 2017 · Updated 8 years ago
chentyjpm / ncnn_llm-mcp-sdimggen
A small ncnn-based Stable Diffusion inference tool that gives ncnn-llm an "image generation" capability (invoked as an MCP tool or backend executable).
☆18 · Dec 19, 2025 · Updated 7 months ago
yysu-888 / yoloe_onnxruntime
YOLOE model inference with ONNX Runtime
☆26 · Mar 26, 2025 · Updated last year
EvolvingLMMs-Lab / SkillOpt-Lite
SkillOpt-Lite and HarnessOpt: Optimize your skill or harness with one line of vibe
☆116 · Jul 10, 2026 · Updated last week
cpuimage / fftw3
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and…
☆17 · Sep 6, 2018 · Updated 7 years ago
Open-Model-Initiative / imagegen-speedrun
We bring the spirit of nanogpt-speedrun into the omni-modal world
☆15 · Jan 31, 2026 · Updated 5 months ago
bytedance / LVFace
🔥 [ICCV 2025 Highlight] Official open-source repo for LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recogniti…
☆122 · Aug 21, 2025 · Updated 11 months ago
GaryGuTC / COMG_model
[WACV 2024] Complex Organ Mask Guided Radiology Report Generation
☆43 · Nov 10, 2025 · Updated 8 months ago
deepglint / RWKV-CLIP
[EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner
☆151 · Dec 14, 2025 · Updated 7 months ago
SuDIS-ZJU / Data-Quality-for-Vision-Language-Models
☆35 · Nov 18, 2025 · Updated 8 months ago