eastonYi / wav2vec
a simplified version of wav2vec(1.0, vq, 2.0) in fairseq
☆167 · Sep 21, 2020 · Updated 5 years ago
Alternatives and similar repositories for wav2vec
Users who are interested in wav2vec are comparing it to the libraries listed below.
- speech to text with self-supervised learning based on wav2vec 2.0 framework ☆379 · Nov 22, 2021 · Updated 4 years ago
- ☆25 · Mar 12, 2022 · Updated 3 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆96 · Nov 20, 2024 · Updated last year
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆146 · Aug 22, 2022 · Updated 3 years ago
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆78 · Dec 3, 2024 · Updated last year
- ☆37 · Jun 28, 2021 · Updated 4 years ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs ☆77 · Dec 3, 2025 · Updated 2 months ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆53 · Dec 6, 2022 · Updated 3 years ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆191 · Jul 12, 2024 · Updated last year
- ☆19 · Mar 22, 2024 · Updated last year
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 · Sep 4, 2023 · Updated 2 years ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆132 · Sep 25, 2023 · Updated 2 years ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec ☆115 · Jun 23, 2025 · Updated 7 months ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆142 · Apr 27, 2024 · Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using β-VAE" ☆44 · Apr 10, 2023 · Updated 2 years ago
- ☆23 · Oct 17, 2024 · Updated last year
- Official implementation of the source-filter HiFiGAN vocoder ☆268 · Jul 29, 2023 · Updated 2 years ago
- ☆14 · Aug 19, 2024 · Updated last year
- ☆16 · Nov 9, 2023 · Updated 2 years ago
- A differentiable version of SPTK ☆192 · Feb 3, 2026 · Updated last week
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models" ☆117 · Jan 26, 2024 · Updated 2 years ago
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- speech self-supervised representations ☆518 · Apr 27, 2023 · Updated 2 years ago
- This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm" ☆134 · Nov 29, 2023 · Updated 2 years ago
- ☆82 · Jan 22, 2025 · Updated last year
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆212 · Sep 19, 2024 · Updated last year
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022) ☆119 · Feb 7, 2024 · Updated 2 years ago
- This repo is text to speech with learnable audio encoder without alignment with transcript reference ☆53 · Sep 20, 2025 · Updated 4 months ago
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis ☆45 · Mar 2, 2021 · Updated 4 years ago
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz ☆24 · Jan 2, 2024 · Updated 2 years ago
- ☆38 · Apr 15, 2024 · Updated last year
- Yin pitch estimator in PyTorch ☆117 · Nov 7, 2022 · Updated 3 years ago
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition ☆219 · Jun 22, 2023 · Updated 2 years ago
- Official code for Wav2Seq ☆97 · Jul 19, 2022 · Updated 3 years ago
- TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion ☆148 · Jan 15, 2024 · Updated 2 years ago
- A sequence-to-sequence voice conversion toolkit. ☆108 · Jul 5, 2024 · Updated last year
- ☆28 · Nov 15, 2023 · Updated 2 years ago
- Train the next generation of TTS systems. ☆171 · Sep 13, 2024 · Updated last year
- vq-wav2vec inference ☆13 · Dec 13, 2021 · Updated 4 years ago