lucidrains/NWT-pytorch

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/lucidrains/NWT-pytorch)

lucidrains / NWT-pytorch

Implementation of NWT, audio-to-video generation, in Pytorch

☆92

Alternatives and similar repositories for NWT-pytorch

Users that are interested in NWT-pytorch are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

lucidrains / token-shift-gpt
View on GitHub
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing
☆49Jan 27, 2022Updated 4 years ago
lucidrains / local-attention-flax
View on GitHub
Local Attention - Flax module for Jax
☆22May 26, 2021Updated 5 years ago
keonlee9420 / WaveGrad2
View on GitHub
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
☆68Aug 3, 2021Updated 4 years ago
cschaefer26 / StyleMelGAN
View on GitHub
☆10Apr 8, 2024Updated 2 years ago
ajinkyakulkarni14 / ERISHA
View on GitHub
ERISHA is a mulitilingual multispeaker expressive speech synthesis framework. It can transfer the expressivity to the speaker's voice for…
☆44Dec 17, 2020Updated 5 years ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
mt-upc / ZeroSwot
View on GitHub
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25Dec 12, 2024Updated last year
Edresson / SC-GlowTTS
View on GitHub
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model
☆107Sep 10, 2021Updated 4 years ago
cnlinxi / tpse_tacotron2
View on GitHub
TPSE-GST Tacotron2
☆14May 1, 2019Updated 7 years ago
speechnovateur / languagecodec_tmp
View on GitHub
Temporary anonymous version
☆22Mar 20, 2024Updated 2 years ago
zhangzjn / APB2FaceV2
View on GitHub
An improved version of APB2Face: Real-Time Audio-Guided Multi-Face Reenactment
☆84Oct 7, 2021Updated 4 years ago
revsic / torch-retriever-vc
View on GitHub
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12Jan 27, 2023Updated 3 years ago
lucidrains / mlp-gpt-jax
View on GitHub
A GPT, made only of MLPs, in Jax
☆59Jun 23, 2021Updated 5 years ago
mutiann / neural-lexicon-reader
View on GitHub
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
☆21Jul 25, 2022Updated 4 years ago
lucidrains / retrieval-augmented-ddpm
View on GitHub
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch
☆66May 5, 2022Updated 4 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
W-Wu / ERC-SLT22
View on GitHub
Code for "Distribution-based Emotion Recognition in Conversation"
☆18Feb 6, 2023Updated 3 years ago
JCBrouwer / maua-style
View on GitHub
Neural style transfer
☆21Jul 29, 2021Updated 5 years ago
lucidrains / isab-pytorch
View on GitHub
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
☆70Jun 8, 2026Updated last month
wentaozhu / speechnas
View on GitHub
SpeechNAS-Better-Trade-off-between-Latency-and-Accuracy-for-Large-Scale-Speaker-Verification
☆30Mar 24, 2023Updated 3 years ago
rhoposit / icassp2021
View on GitHub
☆15May 8, 2021Updated 5 years ago
rishikksh20 / PPSpeech
View on GitHub
PPSpeech: Phrase based Parallel End-to-End TTS System
☆35Aug 31, 2020Updated 5 years ago
light1726 / BetaVAE_VC
View on GitHub
Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"
☆43Apr 10, 2023Updated 3 years ago
y-ren16 / TiCodec
View on GitHub
☆81Aug 11, 2025Updated 11 months ago
MWM-io / nansypp
View on GitHub
Unofficial implementation of NANSY++ in Pytorch Lightning
☆50Mar 11, 2024Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
fluxions-ai / stftvae
View on GitHub
Inference for the STFT-VAE continuous audio codec (24kHz, 3.125Hz latent)
☆43Jul 12, 2026Updated 2 weeks ago
uniaudio666 / UniAudio
View on GitHub
The official source code of UniAudio
☆97Mar 29, 2024Updated 2 years ago
dr-pato / SSGD
View on GitHub
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆15Dec 22, 2022Updated 3 years ago
Wendison / FCL-taco2
View on GitHub
Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021
☆41Jul 17, 2021Updated 5 years ago
walker-hyf / ECSS
View on GitHub
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling (Accepted by AAAI'2024)
☆59Jun 20, 2024Updated 2 years ago
bfs18 / armel
View on GitHub
poorman's ar-dit tts
☆45Dec 31, 2025Updated 6 months ago
P1ping / TokAN-Legacy
View on GitHub
☆27Jun 22, 2026Updated last month
LEEYOONHYUNG / BVAE-TTS
View on GitHub
Official implementation of BVAE-TTS
☆173Sep 26, 2022Updated 3 years ago
KrishnaDN / BERTphone
View on GitHub
Implementation of the paper "BERTphone: Phonetically-aware Encoder Representations for Utterance-level Speaker and Language Recognition"
☆17Dec 10, 2020Updated 5 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
b04901014 / FG-transformer-TTS
View on GitHub
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
☆90Mar 5, 2022Updated 4 years ago
keonlee9420 / VAENAR-TTS
View on GitHub
PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.
☆74Aug 3, 2021Updated 4 years ago
vliu15 / adversarial-tts
View on GitHub
End-to-end Text-to-Speech with Generative Adversarial Networks
☆20Feb 6, 2021Updated 5 years ago
Aria-K-Alethia / laughter-synthesis
View on GitHub
Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…
☆77Jul 16, 2023Updated 3 years ago
maum-ai / sane-tts
View on GitHub
SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech
☆11Jun 30, 2023Updated 3 years ago
lucidrains / panoptic-transformer
View on GitHub
Another attempt at a long-context / efficient transformer by me
☆38Apr 11, 2022Updated 4 years ago
lucidrains / differentiable-SDF-pytorch
View on GitHub
Implementation of Differentiable Sign-Distance Function Rendering - in Pytorch
☆70May 9, 2022Updated 4 years ago