asappresearch / wav2seqView external linksLinks
Official code for Wav2Seq
☆97Jul 19, 2022Updated 3 years ago
Alternatives and similar repositories for wav2seq
Users that are interested in wav2seq are comparing it to the libraries listed below
Sorting:
- ICASSP 2023 Accepted☆189May 6, 2024Updated last year
- ☆55Jan 13, 2023Updated 3 years ago
- Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)☆121Jan 24, 2023Updated 3 years ago
- An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.☆367Oct 12, 2021Updated 4 years ago
- Torch-based tool for quantizing high-dimensional vectors using additive codebooks☆54May 25, 2022Updated 3 years ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆80Sep 27, 2023Updated 2 years ago
- ☆14Jun 12, 2015Updated 10 years ago
- LoRA-based phoneme/prosody control for LLM-based TTS with no G2P - Lightweight adapter for edit and control the target language's phoneme…☆23Aug 14, 2025Updated 5 months ago
- ☆76Oct 25, 2021Updated 4 years ago
- An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…☆415Aug 29, 2023Updated 2 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆174Jun 9, 2023Updated 2 years ago
- Dataset release for Emotional TTS in Indian Accent☆40Sep 2, 2022Updated 3 years ago
- (Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT☆40Aug 29, 2024Updated last year
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆90Jun 9, 2022Updated 3 years ago
- Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs☆77Dec 3, 2025Updated 2 months ago
- ☆64May 23, 2022Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆52Dec 6, 2022Updated 3 years ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code.☆68Jan 5, 2026Updated last month
- A toolkit for Spoken Language Understanding Evaluation (SLUE) benchmark. Refer paper https://arxiv.org/abs/2111.10367 for more details. O…☆66Feb 26, 2024Updated last year
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Mar 28, 2023Updated 2 years ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…☆63May 19, 2023Updated 2 years ago
- NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling☆37May 25, 2021Updated 4 years ago
- Tracking the progress in end-to-end speech translation☆261Oct 25, 2023Updated 2 years ago
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…☆83Jan 7, 2023Updated 3 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to…☆45May 25, 2021Updated 4 years ago
- ☆13Sep 25, 2024Updated last year
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.☆140Sep 25, 2024Updated last year
- multilingual speech aligner☆76Nov 19, 2023Updated 2 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆345May 15, 2024Updated last year
- [ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…☆32Apr 8, 2022Updated 3 years ago
- Making Espnet easier to use☆54Apr 9, 2021Updated 4 years ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc…☆77Jul 16, 2023Updated 2 years ago
- CML-TTS: A Multilingual Dataset for Speech Synthesis☆33Jul 31, 2024Updated last year
- ☆80Aug 8, 2025Updated 6 months ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration…☆328Sep 24, 2022Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly.☆16Jun 17, 2022Updated 3 years ago
- ☆11May 7, 2022Updated 3 years ago
- Official implementation of DualCycleGAN for nonparallel audio super resolution☆53Nov 1, 2022Updated 3 years ago