asappresearch/wav2seq

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/asappresearch/wav2seq)

asappresearch / wav2seq

Official code for Wav2Seq

☆97

Alternatives and similar repositories for wav2seq

Users that are interested in wav2seq are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

maum-ai / phaseaug
View on GitHub
ICASSP 2023 Accepted
☆191May 6, 2024Updated 2 years ago
neosapience / editts
View on GitHub
Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech (INTERSPEECH 2022)
☆122Jan 24, 2023Updated 3 years ago
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
AbrahamSanders / codec-bpe
View on GitHub
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆76Dec 3, 2025Updated 7 months ago
RF5 / transfusion-asr
View on GitHub
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80Sep 27, 2023Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
rishikksh20 / NU-Wave-pytorch
View on GitHub
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
☆37May 25, 2021Updated 5 years ago
tts-tutorial / icassp2022
View on GitHub
☆64May 23, 2022Updated 4 years ago
b04901014 / UUVC
View on GitHub
Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…
☆83Jan 7, 2023Updated 3 years ago
danpovey / quantization
View on GitHub
Torch-based tool for quantizing high-dimensional vectors using additive codebooks
☆54May 25, 2022Updated 4 years ago
vectominist / MiniASR
View on GitHub
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
☆53Dec 6, 2022Updated 3 years ago
facebookresearch / CPC_audio
View on GitHub
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.
☆374Oct 12, 2021Updated 4 years ago
asappresearch / slue-toolkit
View on GitHub
A toolkit for Spoken Language Understanding Evaluation (SLUE) benchmark. Refer paper https://arxiv.org/abs/2111.10367 for more details. O…
☆65Feb 26, 2024Updated 2 years ago
nervjack2 / Speech2Unit
View on GitHub
☆13Sep 25, 2024Updated last year
voidful / asrp
View on GitHub
ASR text preprocessing utility
☆21Aug 5, 2024Updated last year
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
facebookresearch / speech-resynthesis
View on GitHub
An official reimplementation of the method described in the INTERSPEECH 2021 paper - Speech Resynthesis from Discrete Disentangled Self-S…
☆416Aug 29, 2023Updated 2 years ago
skit-ai / emotion-tts-dataset
View on GitHub
Dataset release for Emotional TTS in Indian Accent
☆41Mar 25, 2026Updated 4 months ago
atosystem / SpeechCLIP
View on GitHub
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022
☆120Nov 25, 2022Updated 3 years ago
pzelasko / kaldialign
View on GitHub
Python wrappers for Kaldi Levenshtein's distance and alignment code.
☆70Jun 15, 2026Updated last month
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
eastonYi / Unsupervised-ASR
View on GitHub
unsupervised ASR (mainly phone classifier) using EODM and GAN
☆12Oct 22, 2020Updated 5 years ago
cnaigithub / SpeechDewarping
View on GitHub
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27Apr 27, 2023Updated 3 years ago
xinjli / transphone
View on GitHub
phoneme tokenizer and grapheme-to-phoneme model for 8k languages
☆174Jun 9, 2023Updated 3 years ago
lumaku / ctc-segmentation
View on GitHub
Segment an audio file and obtain utterance alignments. (Python package)
☆348May 15, 2024Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
maum-ai / maum-ai.github.io
View on GitHub
maum-ai.github.io
☆15Jun 12, 2026Updated last month
kahne / SpeechTransProgress
View on GitHub
Tracking the progress in end-to-end speech translation
☆260Oct 25, 2023Updated 2 years ago
vectominist / spin
View on GitHub
Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin…
☆65May 19, 2023Updated 3 years ago
sungnyun / ARMHuBERT
View on GitHub
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41Aug 29, 2024Updated last year
jindongwang / EasyEspnet
View on GitHub
Making Espnet easier to use
☆54Apr 9, 2021Updated 5 years ago
qiujiali / lattice-rescore
View on GitHub
☆16Jun 13, 2022Updated 4 years ago
Glaciohound / Chimera-ST
View on GitHub
A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021
☆47Feb 21, 2022Updated 4 years ago
ncsoft / avocodo
View on GitHub
Official implementation of "Avocodo: Generative Adversarial Network for Artifact-Free Vocoder" (AAAI2023)
☆154Feb 1, 2023Updated 3 years ago
asappresearch / sew
View on GitHub
☆77Oct 25, 2021Updated 4 years ago
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
freds0 / CML-TTS-Dataset
View on GitHub
CML-TTS: A Multilingual Dataset for Speech Synthesis
☆36Jul 31, 2024Updated last year
miccio-dk / NISQA
View on GitHub
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16Apr 13, 2022Updated 4 years ago
ddlBoJack / MT4SSL
View on GitHub
[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…
☆45Mar 25, 2024Updated 2 years ago
jasonppy / PromptingWhisper
View on GitHub
Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
☆151Jan 16, 2024Updated 2 years ago
AlanBaade / MAE-AST-Public
View on GitHub
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93Jun 9, 2022Updated 4 years ago
cpdu / vallt
View on GitHub
☆36Mar 14, 2025Updated last year
WelkinYang / GradTTS
View on GitHub
Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"
☆201Oct 31, 2023Updated 2 years ago