[Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
☆33 · Feb 6, 2025 · Updated last year
Alternatives and similar repositories for MMTrail
Users that are interested in MMTrail are comparing it to the libraries listed below.
- ☆36 · Feb 6, 2025 · Updated last year
- [Arxiv 2025] Official code for T-REX: Mixture-of-Rank-One-Experts with semantic-aware Intuition for Multi-task Large Language Model Finet… ☆17 · May 16, 2025 · Updated 10 months ago
- [Arxiv2022] Interpreting Class Conditional GANs with Channel Awareness ☆17 · Apr 4, 2022 · Updated 3 years ago
- ☆16 · Dec 12, 2023 · Updated 2 years ago
- On Path to Multimodal Generalist: General-Level and General-Bench ☆18 · Jul 11, 2025 · Updated 8 months ago
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate… ☆13 · Dec 31, 2024 · Updated last year
- Music production for silent film clips. ☆32 · Apr 30, 2025 · Updated 10 months ago
- ☆19 · Aug 11, 2025 · Updated 7 months ago
- 🕹️ Explore cutting-edge techniques in game generation ☆66 · Mar 16, 2026 · Updated last week
- A dataset for Audio-Visual Sound Event Detection in Movies ☆26 · Jan 23, 2023 · Updated 3 years ago
- [IROS 2021] Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention" ☆35 · Aug 21, 2021 · Updated 4 years ago
- Motion-conditional image animation for video editing ☆20 · Dec 2, 2023 · Updated 2 years ago
- [NeurIPS 2025] PyTorch Implementation of "LEDiT: Your Length-Extrapolatable Diffusion Transformer without Positional Encoding" ☆25 · Oct 27, 2025 · Updated 4 months ago
- Code for "CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning" ☆33 · Mar 26, 2025 · Updated 11 months ago
- ☆16 · Sep 29, 2025 · Updated 5 months ago
- ☆26 · Dec 16, 2024 · Updated last year
- ☆14 · Oct 16, 2023 · Updated 2 years ago
- Ripperdoc is an open-source, extensible AI coding agent that runs in your terminal ☆52 · Mar 14, 2026 · Updated last week
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ☆19 · Jan 18, 2026 · Updated 2 months ago
- ☆32 · May 3, 2024 · Updated last year
- VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation ☆86 · Sep 12, 2024 · Updated last year
- The repo host the code and model of MAViL. ☆45 · Jul 24, 2023 · Updated 2 years ago
- ☆62 · Jun 15, 2025 · Updated 9 months ago
- Implementation of MathReader, Text-to-Speech for Mathematical Documents ☆27 · Sep 23, 2025 · Updated 6 months ago
- A Framework for Symbolic MUsic Graph Explanations ☆10 · Jul 30, 2025 · Updated 7 months ago
- ☆14 · Oct 7, 2021 · Updated 4 years ago
- Simple SSH workspace to connect to your running job. ☆10 · Jul 16, 2018 · Updated 7 years ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning ☆44 · Jul 2, 2025 · Updated 8 months ago
- ☆122 · Jun 7, 2025 · Updated 9 months ago
- ☆85 · Dec 4, 2022 · Updated 3 years ago
- ☆12 · Jun 1, 2024 · Updated last year
- MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning [NeurIPS 2025 Poster] ☆23 · Dec 10, 2025 · Updated 3 months ago
- ☆11 · Mar 24, 2021 · Updated 5 years ago
- Repo for "Centaur: Robust Multimodal Fusion for Human Activity Recognition" ☆10 · Jan 9, 2024 · Updated 2 years ago
- A gpu accelerated neural network Rust crate. ☆15 · Apr 17, 2023 · Updated 2 years ago
- Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS… ☆14 · Dec 9, 2021 · Updated 4 years ago
- Fast Image Restoration with Multi-bin Trainable Linear Units. ☆11 · Dec 23, 2019 · Updated 6 years ago
- This is a boilerplate project for building mobile applications using Expo, React, and Redux. It provides a solid foundation for creating … ☆12 · Apr 6, 2025 · Updated 11 months ago
- Repository for "Training Audio Captioning Models without Audio" ☆10 · Sep 26, 2023 · Updated 2 years ago