camenduru / FluxMusic
Text-to-Music Generation with Rectified Flow Transformer
☆8 Updated 7 months ago
Alternatives and similar repositories for FluxMusic:
Users who are interested in FluxMusic are comparing it to the repositories listed below.
- Text-to-Music Generation with Rectified Flow Transformer ☆61 Updated 7 months ago
- ☆18 Updated 7 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆19 Updated 6 months ago
- ☆13 Updated last month
- Open TTS models, built for streaming on the edge ☆39 Updated last month
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆20 Updated 6 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated last year
- Build HTML artefacts with Ollama ☆11 Updated 4 months ago
- Gradio UI for YuE ☆39 Updated 2 weeks ago
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback… ☆10 Updated 5 months ago
- ☆14 Updated 10 months ago
- ☆22 Updated 6 months ago
- ☆39 Updated 11 months ago
- AudioLDM text to audio colab ☆19 Updated last year
- Prepare spectrograms from audio for training a Riffusion model ☆15 Updated 2 years ago
- YuE with mp3 extend, exllama and GUI ☆46 Updated 2 months ago
- ☆11 Updated last year
- ☆28 Updated last year
- Cog wrapper for collabora/WhisperSpeech ☆24 Updated last year
- An open source real-time AI inference engine for seamless scaling ☆18 Updated last week
- ☆15 Updated 3 months ago
- VoiceStar: Robust, Duration-controllable TTS that can Extrapolate ☆104 Updated 2 weeks ago
- Auto-Video maker handling many AI's ☆10 Updated last year
- Let's try and finetune the OpenAI consistency decoder to work for SDXL ☆24 Updated last year
- ☆24 Updated 10 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 Updated 6 months ago
- Proteus is an experimental platform that combines the power of Large Language Models with the Genesis physics engine ☆21 Updated 4 months ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆14 Updated last month
- ☆13 Updated last year
- Music production for silent film clips. ☆21 Updated last month