camenduru / FluxMusic
Text-to-Music Generation with Rectified Flow Transformer
☆8Updated 10 months ago
Alternatives and similar repositories for FluxMusic
Users who are interested in FluxMusic compare it to the repositories listed below
- Text-to-Music Generation with Rectified Flow Transformer☆64Updated last month
- ☆19Updated 10 months ago
- High-performance ASR tool using Faster Whisper, supporting custom models, multi-language transcription, and real-time processing feedback…☆10Updated 8 months ago
- Site for sharing MusicGen + AudioGen Prompts and Creations☆45Updated 3 months ago
- This project includes a Python script for fine-tuning a text-to-speech (TTS) model. The script utilizes custom datasets and uses CUDA for …☆13Updated 9 months ago
- (WIP) A retrain of F5-TTS on permissively-licensed data☆11Updated 3 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX☆20Updated 9 months ago
- AudioLDM text to audio colab☆19Updated last year
- Explore, Install, Innovate — in 1 Click.☆27Updated this week
- ☆39Updated last year
- ☆107Updated last year
- text-to-audio-latent-diffusion☆37Updated last year
- ☆23Updated 8 months ago
- Prepare spectrograms from audio for training a Riffusion model☆15Updated 2 years ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆90Updated last month
- BeltOut: An open source pitch-perfect voice-to-voice timbre transfer model based on ChatterboxVC☆63Updated this week
- ☆11Updated last year
- A fast MP3 decoder for python, using minimp3☆29Updated 2 years ago
- ☆27Updated last year
- A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.☆16Updated 2 weeks ago
- 1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)☆26Updated last month
- Fork of AudioLDM as a TuneFlow plugin☆42Updated 2 years ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.☆20Updated 4 months ago
- [SOTA] [92% acc] 786M-8k-44L-32H multi-instrumental music transformer with true full MIDI instruments range, efficient encoding, octo-vel…☆87Updated 6 months ago
- Gradio UI for YuE☆65Updated 3 months ago
- ☆24Updated last year
- Misc. tools/scripts that I made to use for tortoise☆21Updated 10 months ago
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io☆16Updated last year
- ☆24Updated last year
- Build HTML artefacts with Ollama☆11Updated 7 months ago