hanghuacs/MMComposition

☆17

Alternatives and similar repositories for MMComposition

Users that are interested in MMComposition are comparing it to the libraries listed below.

yunlong10 / AVicuna
[AAAI 2025] Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding
☆34 · Mar 21, 2025 · Updated last year
yunlong10 / VidComposition
[CVPR 2025] VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?
☆30 · May 10, 2025 · Updated last year
hanghuacs / FineCaption
☆39 · Jun 20, 2025 · Updated last year
yunlong10 / Awesome-Video-LMM-Post-Training
🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training
☆296 · Mar 3, 2026 · Updated 4 months ago
HowieHwong / Agentic-Guardian
[ICLR'26] Building a Foundational Guardrail for General Agentic Systems via Synthetic Data
☆48 · Oct 26, 2025 · Updated 8 months ago
yunlong10 / Video-R4
Reinforcing Text-Rich Video Reasoning with Visual Rumination
☆28 · Jun 5, 2026 · Updated last month
yunlong10 / CAT-V
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆68 · Jan 27, 2026 · Updated 5 months ago
ytaek-oh / awesome-vl-compositionality
Awesome Vision-Language Compositionality, a comprehensive curation of research papers in literature.
☆40 · Feb 13, 2025 · Updated last year
yeates / Aurora
Aurora: Unified Video Editing with a Tool-Using Agent
☆58 · Jun 16, 2026 · Updated last month
jing-bi / awesome-M.LLM-reasoning
☆20 · May 11, 2025 · Updated last year
McGill-NLP / AURORA
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
☆35 · Jun 30, 2025 · Updated last year
zihuixue / ProgCaptioner
Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)
☆26 · Jul 16, 2025 · Updated last year
FatemehShiri / Spatial-MM
☆12 · Jan 10, 2025 · Updated last year
WikiChao / DAVIS
[🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound …
☆33 · Mar 30, 2026 · Updated 3 months ago
Job-Bench / job-bench-eval
Official eval scripts for JobBench
☆29 · Updated this week
WikiChao / FreSca
[CVPR 2025 GMCV] Test-Time Frequency Scaling: Instant Frequency Control for Any Diffusion Model
☆55 · May 31, 2025 · Updated last year
BlairStanek / gpt-statutes
Probe how GPT-n performs on statutory reasoning
☆10 · Sep 17, 2024 · Updated last year
oarriaga / bayesian-inverse-graphics
Bayesian Inverse Graphics for Few-Shot Concept Learning
☆12 · Mar 16, 2025 · Updated last year
qwang666 / RoomTex-
[ECCV24] Official code for RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
☆32 · Sep 3, 2024 · Updated last year
tegg89 / Deep-blogs
A curated list of self-taught materials including research blogs
☆16 · Dec 12, 2016 · Updated 9 years ago
mishajw / repeng
Experiments with representation engineering
☆14 · Feb 28, 2024 · Updated 2 years ago
Sanyuan-Chen / CSS_with_EETransformer
Code for the ICASSP-2021 paper: Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer
☆12 · Sep 2, 2021 · Updated 4 years ago
tamangmilan / llama3
Building Llama 3 from scratch using PyTorch
☆13 · Sep 1, 2024 · Updated last year
BakeLab / Visual-Aesthetic-Benchmark
☆32 · May 15, 2026 · Updated 2 months ago
TencentARC / ARC-Chapter
Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
☆44 · Nov 19, 2025 · Updated 8 months ago
hltcoe / rank-k
Repository for the listwise reranker Rank-K
☆16 · May 23, 2025 · Updated last year
Norman-Ou / InstantID-with-FouriScale
Combines InstantID🔥 and FouriScale to generate high-resolution images!
☆11 · Apr 3, 2024 · Updated 2 years ago
SaraGhazanfari / CoF
Chain-of-Frames [CVPR 2026]
☆40 · Jul 2, 2025 · Updated last year
kivancgunduz / expiration-date-detection
An API that detects the expiration date from a product package's picture, based on deep learning algorithms
☆11 · Jun 4, 2022 · Updated 4 years ago
heliossun / LaCoT
[NeurIPS 2025] Official code for paper: Latent Chain-of-Thought for Visual Reasoning
☆36 · Oct 16, 2025 · Updated 9 months ago
claudia-viaro / Wdss-UCLdss_research
☆12 · Aug 31, 2022 · Updated 3 years ago
ljang0 / videowebarena
☆14 · Dec 25, 2024 · Updated last year
OpenGVLab / VKnowU
[ECCV 2026] VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs
☆15 · Feb 3, 2026 · Updated 5 months ago
DwanZhang-AI / SePPO
Code for "SePPO: Semi-Policy Preference Optimization for Diffusion Alignment."
☆18 · Oct 7, 2024 · Updated last year
skhemlani / mReasoner
mReasoner is a unified computational implementation of the model theory of thinking and reasoning
☆16 · Aug 17, 2023 · Updated 2 years ago
intervention-training / int
☆16 · Feb 4, 2026 · Updated 5 months ago
Fangjun-Li / SpatialLM-StepGame
Codes and data for AAAI-24 paper "Advancing Spatial Reasoning in Large Language Models: An In-depth Evaluation and Enhancement Using the …
☆14 · Apr 23, 2024 · Updated 2 years ago
MinglangQiao / MVVA-Database
Database of "Learning to Predict Salient Faces: A Novel Visual-Audio Saliency Model", ECCV 2020
☆13 · May 2, 2022 · Updated 4 years ago
coolbay / Re2TAL
Repository for the CVPR23 paper Re^2TAL
☆13 · Nov 21, 2025 · Updated 8 months ago