dvlab-research / MGM-Omni
An Open-source Omni Chatbot for Long Speech and Voice Clone
☆114 Updated last month
Alternatives and similar repositories for MGM-Omni
Users who are interested in MGM-Omni are comparing it to the repositories listed below.
- ☆78 Updated 4 months ago
- video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions, which is d… ☆67 Updated last month
- Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM. ☆467 Updated this week
- AudioStory: Generating Long-Form Narrative Audio with Large Language Models ☆274 Updated this week
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifying image understanding and generation. ☆37 Updated last year
- (AAAI 2025) MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration ☆38 Updated 4 months ago
- Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation ☆59 Updated 3 months ago
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models