A simplified version of wav2vec (1.0, vq, 2.0) in fairseq
☆171 · Sep 21, 2020 · Updated 5 years ago
Alternatives and similar repositories for wav2vec
Users that are interested in wav2vec are comparing it to the libraries listed below.
- speech to text with self-supervised learning based on wav2vec 2.0 framework ☆379 · Nov 22, 2021 · Updated 4 years ago
- PyTorch implementation of "ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context" (INT… ☆38 · Feb 27, 2022 · Updated 4 years ago
- ☆37 · Jun 28, 2021 · Updated 4 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆97 · Nov 20, 2024 · Updated last year
- Pre-trained Wav2vec2.0 for Mandarin ☆43 · Oct 30, 2022 · Updated 3 years ago
- ☆25 · Mar 12, 2022 · Updated 4 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆53 · Dec 6, 2022 · Updated 3 years ago
- Implementation of the paper "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations" in Pytorch. ☆58 · May 19, 2023 · Updated 2 years ago
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆149 · Aug 22, 2022 · Updated 3 years ago
- This repository contains the code for the paper "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection fr… ☆12 · Dec 19, 2025 · Updated 3 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆194 · Jul 12, 2024 · Updated last year
- ☆16 · Nov 9, 2023 · Updated 2 years ago
- This repo is text to speech with learnable audio encoder without alignment with transcript reference ☆53 · Sep 20, 2025 · Updated 6 months ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit ☆2,547 · Mar 12, 2026 · Updated last month
- A library for speech data augmentation in time-domain ☆685 · Aug 30, 2021 · Updated 4 years ago
- ☆23 · Oct 17, 2024 · Updated last year
- speech self-supervised representations ☆517 · Apr 27, 2023 · Updated 2 years ago
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE" ☆44 · Apr 10, 2023 · Updated 3 years ago
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models" ☆117 · Jan 26, 2024 · Updated 2 years ago
- Official code for Wav2Seq ☆97 · Jul 19, 2022 · Updated 3 years ago
- vq-wav2vec inference ☆14 · Dec 13, 2021 · Updated 4 years ago
- A Pytorch Implementation of Transducer Model for End-to-End Speech Recognition ☆239 · May 12, 2020 · Updated 5 years ago
- AcademiCodec: An Open Source Audio Codec Model for Academic Research ☆670 · Dec 27, 2023 · Updated 2 years ago
- [ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition ☆219 · Jun 22, 2023 · Updated 2 years ago
- Official implementation of the source-filter HiFiGAN vocoder ☆271 · Jul 29, 2023 · Updated 2 years ago
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis ☆151 · Feb 11, 2023 · Updated 3 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆141 · Apr 27, 2024 · Updated last year
- This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm" ☆133 · Nov 29, 2023 · Updated 2 years ago
- SChunk-Encoder (Transformer or Conformer) for streaming E2E ASR ☆11 · Oct 21, 2022 · Updated 3 years ago
- An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion. ☆370 · Oct 12, 2021 · Updated 4 years ago
- ☆82 · Jan 22, 2025 · Updated last year
- Wav2vec 2.0 Self-Supervised Pretraining ☆60 · Feb 6, 2025 · Updated last year
- A differentiable version of SPTK ☆197 · Mar 26, 2026 · Updated 3 weeks ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 5 months ago
- Official Pytorch implementation of "Large Language Models are Strong Audio-Visual Speech Recognition Learners" [ICASSP 2025] and "Mitigat… ☆62 · Jan 18, 2026 · Updated 3 months ago
- A toolkit for any-to-any encoder-decoder voice conversion systems ☆84 · Aug 10, 2023 · Updated 2 years ago
- HiFTNet wav/audio super-resolution 16/24 kHz to 48 kHz ☆24 · Jan 2, 2024 · Updated 2 years ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆214 · Sep 19, 2024 · Updated last year