camenduru / seamless-expressive-hf
☆14Updated last year
Alternatives and similar repositories for seamless-expressive-hf
Users who are interested in seamless-expressive-hf are comparing it to the libraries listed below
- ☆12Updated last year
- ☆22Updated last year
- Official code of the paper: Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis.☆46Updated 8 months ago
- This is a repository that collects common audio noise reduction models, using Gradio to demonstrate the use of each model, which is very …☆37Updated 5 months ago
- Music production for silent film clips.☆22Updated 2 weeks ago
- F5-TTS inference acceleration, roughly 4x faster!☆85Updated 4 months ago
- A collection of optimized utilities for text-to-audio processing, enhancing both training and inference workflows. This repository contai…☆13Updated last month
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated 8 months ago
- ☆18Updated 3 months ago
- ☆20Updated last year
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆81Updated last year
- flow mirror models from JZX AI Labs☆45Updated 7 months ago
- ☆39Updated last year
- ☆62Updated 9 months ago
- ☆57Updated 10 months ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆98Updated 4 months ago
- This project trains an RWKV LLM for TTS generation that is compatible with other TTS engines (like fish/cosy/chattts).☆74Updated last week
- Implementation of SoundStream from the paper: "SoundStream: An End-to-End Neural Audio Codec"☆11Updated 3 months ago
- A refactor of the GPT-SOVITS project: parts of the code were rewritten, the webui was made easier to use, and API calls were added☆27Updated 5 months ago
- Open TTS models, built for streaming on the edge☆41Updated 2 months ago
- Official Code for ParrotTTS☆50Updated 7 months ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆39Updated last year
- Official implementation for FlowSep☆47Updated 4 months ago
- This repository contains code for fine-tuning the Whisper speech-to-text model.☆9Updated 2 months ago
- ☆29Updated last year
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.☆18Updated 2 months ago
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models☆15Updated last year
- ☆40Updated 3 months ago
- Unofficial implementation of DreamTalk in ComfyUI☆12Updated 9 months ago
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,…☆71Updated 7 months ago