Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.
☆181 Sep 13, 2025 Updated 6 months ago
Alternatives and similar repositories for Mixture-of-Transformers
Users that are interested in Mixture-of-Transformers are comparing it to the libraries listed below.
- DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation ☆39 Aug 3, 2025 Updated 7 months ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025) ☆15 Jan 6, 2026 Updated 2 months ago
- ☆22 Sep 16, 2025 Updated 6 months ago
- The official implementation of "Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving" ☆72 Mar 12, 2026 Updated last week
- A PyTorch Deep Learning Kit ☆12 Apr 30, 2023 Updated 2 years ago
- Official implementation of ECCV24 paper: POA ☆24 Aug 8, 2024 Updated last year
- [CVPR2025] Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think ☆23 Jul 1, 2025 Updated 8 months ago
- semantic tokenizer for speech and music ☆21 Jul 6, 2025 Updated 8 months ago
- A curated collection of prompts for Grok Imagine by xAI ☆25 Oct 19, 2025 Updated 5 months ago
- Automatically notifies you of start and completion using environment variables ☆13 Aug 4, 2023 Updated 2 years ago
- Seamless Voice Interactions with LLMs ☆12 Oct 28, 2023 Updated 2 years ago
- An open source implementation of CLIP (With TULIP Support) ☆164 May 14, 2025 Updated 10 months ago
- Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning (AISTATS 2022 Oral) ☆43 Nov 10, 2022 Updated 3 years ago
- Reference implementation of DecDTW in PyTorch (ICLR 2023) ☆24 May 29, 2023 Updated 2 years ago
- ☆66 Feb 4, 2026 Updated last month
- ImageNet-12k subset of ImageNet-21k (fall11) ☆21 Jun 13, 2023 Updated 2 years ago
- [EMNLP 2025 Findings] 3D-Aware Vision-Language Models Fine-Tuning with Geometric Distillation ☆32 Jun 12, 2025 Updated 9 months ago
- HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing ☆238 Updated this week
- Scalable data valuation using optimal transport (ICLR 2025) ☆13 Jul 15, 2025 Updated 8 months ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation ☆30 Dec 22, 2025 Updated 3 months ago
- Official Code of CVPR 2025 paper "SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters" ☆52 Jul 13, 2025 Updated 8 months ago
- ☆51 Aug 22, 2025 Updated 7 months ago
- Manage your project and team road maps in YAML ☆15 Mar 13, 2026 Updated last week
- ☆30 Jul 25, 2025 Updated 7 months ago
- MaskFlow: Discrete Flows For Flexible and Efficient Long Video Generation ☆27 Mar 4, 2025 Updated last year
- [NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs ☆23 Oct 15, 2024 Updated last year
- Dion optimizer algorithm ☆456 Jan 16, 2026 Updated 2 months ago
- ☆190 Dec 17, 2024 Updated last year
- A Python implementation of an agent swarm system that works with local LLM servers. The system allows you to create multiple agents that … ☆12 Nov 20, 2024 Updated last year
- [NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reason… ☆154 Sep 12, 2025 Updated 6 months ago
- Multimodal RewardBench ☆64 Feb 21, 2025 Updated last year
- UniFork: Exploring Modality Alignment for Unified Multimodal Understanding and Generation ☆46 Aug 26, 2025 Updated 6 months ago
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆166 Jan 31, 2025 Updated last year
- Code release for "RoboPrompt" ☆27 Sep 30, 2025 Updated 5 months ago
- ☆35 Jun 9, 2025 Updated 9 months ago
- SGAP-Net: Semantic-Guided Attentive Prototypes Network for Few-Shot Human-Object Interaction Recognition, AAAI2020. ☆14 Dec 15, 2020 Updated 5 years ago
- This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language M… ☆24 Apr 27, 2025 Updated 10 months ago
- Open-vocabulary Semantic Segmentation ☆33 Feb 16, 2024 Updated 2 years ago
- CUDA, CuDNN, NVIDIA Driver, and PyTorch Installation for Ubuntu ☆12 Feb 27, 2025 Updated last year