yunlong10/VidComposition

[CVPR 2025] VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

☆30

Alternatives and similar repositories for VidComposition

Users who are interested in VidComposition are comparing it to the repositories listed below.

hanghuacs / MMComposition
☆17 · Jun 20, 2025 · Updated last year
jing-bi / awesome-M.LLM-reasoning
☆20 · May 11, 2025 · Updated last year
yunlong10 / Awesome-AI4Animation
[ICCVW 2025] This repository includes latest papers, projects and datasets on GenAI for Cel-Animation. Accepted by ICCV 2025 AISTORY Wor…
☆206 · Jan 13, 2026 · Updated 6 months ago
yunlong10 / Awesome-Video-LMM-Post-Training
🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training
☆296 · Mar 3, 2026 · Updated 4 months ago
WikiChao / FreSca
[CVPR 2025 GMCV] Test-Time Frequency Scaling: Instant Frequency Control for Any Diffusion Model
☆55 · May 31, 2025 · Updated last year
yunlong10 / CAT-V
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆68 · Jan 27, 2026 · Updated 5 months ago
ZehuaKcrissLi / GTR-Voice
☆16 · Nov 11, 2024 · Updated last year
LuoXiaoHeics / Continual-Tune
☆10 · Feb 6, 2025 · Updated last year
hanghuacs / V2Xum-LLM
☆27 · Jan 4, 2025 · Updated last year
adobe-research / llava-score
☆11 · Oct 2, 2024 · Updated last year
dawitmureja / AVE
This is the official repository for our ECCV 2022 paper titled, "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assis…
☆53 · Nov 28, 2022 · Updated 3 years ago
ljang0 / videowebarena
☆14 · Dec 25, 2024 · Updated last year
PardoAlejo / MovieCuts
Learning to cut end-to-end pretrained modules
☆38 · Apr 17, 2025 · Updated last year
mlvlab / VidChain
Official Implementation (PyTorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…
☆25 · Jan 26, 2025 · Updated last year
SilentView / LVD-2M
[NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"
☆79 · Oct 15, 2024 · Updated last year
ShujinWu-0814 / ALOE
Public code repo for COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction"
☆41 · Apr 3, 2025 · Updated last year
linzhiqiu / CLIP-FlanT5
Training code for CLIP-FlanT5
☆31 · Jul 29, 2024 · Updated last year
google-deepmind / geckonum_benchmark_t2i
GeckoNum Benchmark for T2I Model Eval.
☆15 · Dec 5, 2024 · Updated last year
VidCapBench / VidCapBench
☆13 · May 17, 2025 · Updated last year
orrzohar / Video-STaR
[ICLR 2025] Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
☆72 · Jul 10, 2024 · Updated 2 years ago
amitakamath / vl_text_encoders_are_bottlenecks
Code and datasets for "Text encoders are performance bottlenecks in contrastive vision-language models". Coming soon!
☆11 · May 24, 2023 · Updated 3 years ago
llyx97 / TempCompass
[ACL 2024 Findings] "TempCompass: Do Video LLMs Really Understand Videos?", Yuanxin Liu, Shicheng Li, Yi Liu, Yuxiang Wang, Shuhuai Ren, …
☆133 · Apr 4, 2025 · Updated last year
tsuruoka-lab / AMI-Meeting-Parallel-Corpus
AMI Meeting Parallel Corpus
☆13 · Dec 11, 2020 · Updated 5 years ago
kwanyun / LeGO_code
[CVPR2024] LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example
☆13 · Jun 3, 2024 · Updated 2 years ago
grignarder / high-quality-blendshape-generation
☆19 · Jul 8, 2024 · Updated 2 years ago
mutonix / Vript
☆161 · Jan 16, 2025 · Updated last year
Songluchuan / Tri2plane
[ECCV 2024] The repository for 'Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid'
☆141 · May 4, 2025 · Updated last year
LgQu / TIGeR
Code for paper: Unified Text-to-Image Generation and Retrieval
☆16 · Jul 19, 2026 · Updated last week
showlab / GUI-Narrator
Repository of GUI Action Narrator
☆13 · Apr 8, 2025 · Updated last year
Sanyuan-Chen / CSS_with_EETransformer
Code for the ICASSP-2021 paper: Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer
☆12 · Sep 2, 2021 · Updated 4 years ago
RUC-NLPIR / VideoDeepResearch
☆155 · Nov 17, 2025 · Updated 8 months ago
zjr2000 / LLMVA-GEBC
Winner solution to Generic Event Boundary Captioning task in LOVEU Challenge (CVPR 2023 workshop)
☆29 · Jan 1, 2024 · Updated 2 years ago
ChocoWu / Any2Caption
This is the project for 'Any2Caption', Interpreting Any Condition to Caption for Controllable Video Generation
☆49 · Apr 3, 2025 · Updated last year
dhg-wei / TOPA
(NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment
☆29 · Sep 27, 2024 · Updated last year
Baiqi-Li / NaturalBench
🚀 [NeurIPS24] Make Vision Matter in Visual-Question-Answering (VQA)! Introducing NaturalBench, a vision-centric VQA benchmark (NeurIPS'2…
☆90 · Jun 24, 2025 · Updated last year
Norman-Ou / InstantID-with-FouriScale
Combines InstantID🔥 and FouriScale to generate high-resolution images!
☆11 · Apr 3, 2024 · Updated 2 years ago
microsoft / CollabLLM
☆38 · May 12, 2026 · Updated 2 months ago
three-bee / 3d_head_stylization
[ICCV 2025] Identity Preserving 3D Head Stylization with Multiview Score Distillation
☆16 · Jul 15, 2026 · Updated last week
StelaBou / Diffusion-Act
☆25 · Sep 5, 2025 · Updated 10 months ago