ZET-Speech / ZET-Speech-Demo
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models (TTS)
☆10 · Mar 9, 2024 · Updated last year
Alternatives and similar repositories for ZET-Speech-Demo
Users who are interested in ZET-Speech-Demo are comparing it to the repositories listed below.
- Non Parallel Voice Conversion based on VITS ☆24 · Mar 31, 2023 · Updated 2 years ago
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024) ☆59 · Jun 20, 2024 · Updated last year
- PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised T… ☆194 · Nov 9, 2022 · Updated 3 years ago
- ☆130 · Aug 19, 2024 · Updated last year
- Manage audio and video datasets ☆33 · Feb 5, 2026 · Updated last week
- PyTorch implementation of NEUTART, a system that creates photorealistic talking avatars from an input text transcription. ☆34 · Mar 11, 2025 · Updated 11 months ago
- SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification ☆30 · Mar 24, 2023 · Updated 2 years ago
- A summarizer for Japanese articles (but ChatGPT is better) ☆10 · Aug 1, 2022 · Updated 3 years ago
- An unofficial PyTorch implementation of Mix-Phoneme-Bert ☆40 · Jul 10, 2023 · Updated 2 years ago
- ☆39 · Oct 1, 2023 · Updated 2 years ago
- Code for paper A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing ☆89 · Sep 6, 2024 · Updated last year
- ☆43 · Aug 17, 2024 · Updated last year
- VOICEVOX Engine for Unreal Engine 5 ☆13 · Nov 29, 2025 · Updated 2 months ago
- An Efficient BPE Algorithm Faster than Hugging Face Tokenizer's Implementation ☆13 · Sep 9, 2024 · Updated last year
- Official Demo Page for DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer ☆38 · Feb 17, 2025 · Updated last year
- Simple filter equalizer: lowpass, highpass, bandpass and bandstop ☆11 · Apr 7, 2020 · Updated 5 years ago
- A virtual musical instrument built using Google MediaPipe. ☆12 · Oct 10, 2022 · Updated 3 years ago
- Audio-Visual Generative Adversarial Network for Face Reenactment ☆158 · Sep 11, 2025 · Updated 5 months ago
- [ECCV 2024] Official code repository of paper titled "Efficient 3D-Aware Facial Image Editing Via Attribute-Specific Prompt Learning" ☆10 · Aug 2, 2024 · Updated last year
- ☆11 · Dec 2, 2024 · Updated last year
- Repository for AAAI'22 paper "MLink: Linking Black-Box Models for Collaborative Multi-Model Inference". ☆11 · Oct 25, 2023 · Updated 2 years ago
- HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering (CVPR'23) ☆13 · Nov 4, 2025 · Updated 3 months ago
- Diffusion-based Speech Enhancement: Demonstration of Performance and Generalization ☆11 · Dec 21, 2024 · Updated last year
- Webpage of "Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer" ☆11 · Jul 2, 2024 · Updated last year
- MIR conference deadline countdowns ☆10 · Feb 4, 2026 · Updated last week
- AutoHotKey script that uses your (probably) useless CapsLock as a Magic Fn key, available for pretty much every keyboard. ☆10 · Jun 30, 2022 · Updated 3 years ago
- ☆13 · Sep 25, 2024 · Updated last year
- This repo is for CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering. ☆14 · Mar 6, 2024 · Updated last year
- Format to store media files and annotations ☆12 · Feb 5, 2026 · Updated last week
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding" ☆11 · Apr 10, 2025 · Updated 10 months ago
- Transcripts and segmentation for the Blizzard 2013 audiobooks also known as the Lessac or Blizzard 2013 dataset. ☆45 · Nov 13, 2019 · Updated 6 years ago
- Dataset and model in the paper "SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation" ☆13 · Feb 14, 2022 · Updated 4 years ago
- Code for the paper "Modeling Information Change in Science Communication with Semantically Matched Paraphrases" from EMNLP 2022 ☆13 · Oct 20, 2022 · Updated 3 years ago
- Emofilt is a program to simulate emotional arousal with speech synthesis based on the free-for-non-commercial-use MBROLA synthesis engine… ☆14 · Mar 17, 2022 · Updated 3 years ago
- PyTorch Text GAN for lyrics generation ☆10 · Apr 13, 2019 · Updated 6 years ago
- Official implementation of INTERSPEECH 2022 Radio2Speech: High Quality Speech Recovery from Radio Frequency Signals ☆16 · Sep 19, 2025 · Updated 4 months ago
- Calculate Bhattacharyya distance based on zero-crossing rate features between different Gaussian models for speech emotion recognition. corpus… ☆11 · Oct 17, 2018 · Updated 7 years ago
- ☆12 · Jul 18, 2017 · Updated 8 years ago
- A simple Python package to stretch audio files and change their speed ☆12 · Jan 16, 2026 · Updated last month