webaverse / LJSpeechTools
Tools to isolate speaker and transcribe unstructured audio clips
☆11 Updated 2 years ago
Alternatives and similar repositories for LJSpeechTools:
Users who are interested in LJSpeechTools are comparing it to the libraries listed below
- Streamlit app to visualize and edit TTS datasets ☆14 Updated 3 years ago
- ☆21 Updated last year
- Majesty Diffusion by @Dango233 and @apolinario (@multimodalart) ☆25 Updated 2 years ago
- Lyra V2 (SoundStream) running in the browser ☆19 Updated last year
- ☆20 Updated 3 years ago
- Real-time end-to-end singing voice conversion ☆19 Updated 2 months ago
- Blender Keyframe Exporter for AI Animation ☆13 Updated 2 years ago
- Finally, some decent sample sentences ☆22 Updated last year
- NeMo: a toolkit for conversational AI ☆9 Updated this week
- ☆35 Updated 2 years ago
- text-to-audio-latent-diffusion ☆37 Updated last year
- KATube is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a l… ☆22 Updated 6 months ago
- BEGANSing - Korean SVS + SVC + AudioSR ☆12 Updated 11 months ago
- ☆23 Updated last year
- Repo for storing the files I use to make animations with big-sleep, deep-daze, and VQGAN + CLIP. ☆16 Updated 3 years ago
- This contains the Flax model of min(DALL·E) and code for converting it to PyTorch ☆46 Updated 2 years ago
- The original weights of some Caffe models, ported to PyTorch. ☆11 Updated 3 years ago
- Hifi-like Vocoder implemented in PyTorch ☆13 Updated 2 years ago
- Windows compatible code for the paper "Jukebox: A Generative Model for Music" ☆13 Updated 2 years ago
- A quick test using a Stable Diffusion server and Godot 4 ☆11 Updated last year
- An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io ☆16 Updated 9 months ago
- A neural network based file sorter. Trains an autoencoder to sort images or audio based on the similarity of their encodings, or uses the… ☆29 Updated last year
- ☆8 Updated 2 years ago
- List of Podcast Feeds using iTunes API and script to download 6,000,000~ hours of English speech. ☆9 Updated last year
- Generate LoRA Data from Blender Renders and more! ☆12 Updated last year
- CLIP and PASTE: Using AI to Create Photo Collages from Text Prompts ☆29 Updated 2 years ago
- Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP ☆39 Updated 2 years ago
- Non Parallel Voice Conversion based on VITS ☆24 Updated last year
- ☆10 Updated 2 months ago
- A simple voice conversion tool ☆17 Updated 2 years ago