HongyangLL / M3-JEPA
[ICML 2025] Repository for M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Predictive Embedding Architecture
☆17 Updated 2 months ago
Alternatives and similar repositories for M3-JEPA
Users who are interested in M3-JEPA compare it to the libraries listed below.
- [Preprint] UCGM: Unified Continuous Generative Models ☆179 Updated 8 months ago
- [CVPR 2024 Highlight] Official PyTorch implementation of "MindBridge: A Cross-Subject Brain Decoding Framework" ☆119 Updated last year
- [ICCV2025] TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/To… ☆150 Updated 6 months ago
- Official repo for UAE ☆155 Updated last month
- The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows ☆122 Updated 5 months ago
- [CVPR 2024] ViT-Lens: Towards Omni-modal Representations ☆189 Updated 11 months ago
- [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting ☆70 Updated 3 weeks ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion ☆55 Updated 6 months ago
- Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos ☆25 Updated last year
- This repository is for The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion (ICCV2023) ☆25 Updated 2 years ago
- Towards training VQ-VAE models robustly! ☆91 Updated 6 months ago
- [ICML 2025] Implementation of Spatial Reasoning with Denoising Models ☆86 Updated 6 months ago
- [CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis ☆62 Updated 9 months ago
- Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba" ☆141 Updated last year
- GenExam: A Multidisciplinary Text-to-Image Exam ☆55 Updated last month
- [ICML 2024] Compositional Image Decomposition with Diffusion Models ☆53 Updated last year
- UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, a… ☆134 Updated 9 months ago
- ☆182 Updated 2 months ago
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" [ICCV 2025] ☆99 Updated 6 months ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆77 Updated last year
- [NeurIPS 2024] Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner" ☆72 Updated 3 months ago
- PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model ☆27 Updated last year
- [NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis ☆111 Updated 2 months ago
- ☆41 Updated 8 months ago
- Official PyTorch implementation of TokenSet. ☆127 Updated 10 months ago
- Explore how to get VQ-VAE models efficiently! ☆67 Updated 6 months ago
- [🏆 IJCV 2025 & ACCV 2024 Best Paper Honorable Mention] Official pytorch implementation of the paper "High-Quality Visually-Guided Sound … ☆27 Updated 2 months ago
- The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024 ☆127 Updated last year
- LVAS-Agent Code Base ☆22 Updated 9 months ago
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆164 Updated last year