MiniMax-AI / MiniMax-AI.github.io
The official GitHub Page for MiniMax
☆62 Updated 2 months ago
Alternatives and similar repositories for MiniMax-AI.github.io
Users who are interested in MiniMax-AI.github.io are comparing it to the repositories listed below
- Kyutai with an "eye" ☆235 Updated 10 months ago
- AudioStory: Generating Long-Form Narrative Audio with Large Language Models ☆301 Updated 4 months ago
- The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables" ☆71 Updated 5 months ago
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆388 Updated 2 weeks ago
- Official repository for "VideoPrism: A Foundational Visual Encoder for Video Understanding" (ICML 2024) ☆348 Updated 3 weeks ago
- OmniVinci is an omni-modal LLM for joint understanding of vision, audio, and language. ☆631 Updated 3 months ago
- ☆537 Updated 4 months ago
- A real-time streaming conversational video system that transforms text interactions into continuous, high-fidelity video responses using … ☆293 Updated last month
- ☆77 Updated 9 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆307 Updated 8 months ago
- ☆247 Updated last month
- ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation ☆114 Updated last month
- LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) ☆407 Updated 3 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆226 Updated 3 months ago
- ☆45 Updated 5 months ago
- ☆19 Updated 11 months ago
- ☆346 Updated 5 months ago
- ☆245 Updated last month
- CursorCore: Assist Programming through Aligning Anything ☆133 Updated 11 months ago
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation ☆128 Updated 2 months ago
- A highly compressive and high-quality neural audio codec for speech models. ☆246 Updated 2 weeks ago
- ☆94 Updated last year
- Find out who said what in the video. ☆129 Updated 2 weeks ago
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis. ☆45 Updated last year
- Official Code Repo for UniVA: Universal Video Agents ☆332 Updated last week
- project for skyreels-a3 ☆78 Updated 5 months ago
- Stable-DiffCoder is a family of lightweight open-source code DLLMs (diffusion large language models) comprising base and instruct models, … ☆65 Updated 2 weeks ago
- ☆147 Updated last month
- ☆147 Updated 6 months ago
- Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D… ☆37 Updated last year