RF5/transfusion-asr

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/RF5/transfusion-asr)

RF5 / transfusion-asr

Transcribing Speech with Multinomial Diffusion, training code and models.

☆80

Alternatives and similar repositories for transfusion-asr

Users that are interested in transfusion-asr are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

RF5 / simple-asgan
View on GitHub
Training code and trained checkpoints for ASGAN.
☆62Dec 27, 2023Updated 2 years ago
RF5 / simple-autovc
View on GitHub
A simple, performant re-implementation of AutoVC
☆22Jul 6, 2023Updated 3 years ago
nc-ai / speech
View on GitHub
☆17Aug 27, 2025Updated 10 months ago
xinjli / alqalign
View on GitHub
multilingual speech aligner
☆78Nov 19, 2023Updated 2 years ago
desh2608 / kaldi-noise-vectors
View on GitHub
Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.
☆13Feb 13, 2021Updated 5 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
samsad35 / source-filter-vae
View on GitHub
[SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder
☆46Apr 18, 2023Updated 3 years ago
desh2608 / pytorch-tdnn
View on GitHub
Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training
☆41Dec 18, 2020Updated 5 years ago
MiniXC / LightningFastSpeech2
View on GitHub
☆55Jan 13, 2023Updated 3 years ago
maum-ai / phaseaug
View on GitHub
ICASSP 2023 Accepted
☆191May 6, 2024Updated 2 years ago
kamperh / linearvc
View on GitHub
Voice conversion with just linear regression.
☆37Sep 25, 2025Updated 10 months ago
cyfer0618 / kaldi-pytorch-rnnlm
View on GitHub
Enable RNNLM lattice rescoring with Pytorch [kaldi]
☆12Jun 5, 2020Updated 6 years ago
cornerfarmer / ctc_segmentation
View on GitHub
Segment a given audio into utterances using a trained end-to-end ASR model.
☆75Oct 9, 2020Updated 5 years ago
nvidia-riva / riva-asrlib-decoder
View on GitHub
Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva
☆91Feb 18, 2025Updated last year
pzelasko / kaldialign
View on GitHub
Python wrappers for Kaldi Levenshtein's distance and alignment code.
☆70Jun 15, 2026Updated last month
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Top34051 / stargan-zsvc
View on GitHub
Unofficial PyTorch Implementation of StarGAN-ZSVC
☆14Aug 5, 2021Updated 4 years ago
luomingshuang / k2-speechbrain
View on GitHub
In this repository, I try to combine k2 with speechbrain to decode well and fastly.
☆16Jun 17, 2022Updated 4 years ago
NeuroWave-ai / CUCVAE-TTS
View on GitHub
☆25Mar 12, 2022Updated 4 years ago
Takaaki-Saeki / DiscreteSpeechMetrics
View on GitHub
Reference-aware automatic speech evaluation toolkit
☆185Dec 5, 2024Updated last year
asappresearch / wav2seq
View on GitHub
Official code for Wav2Seq
☆97Jul 19, 2022Updated 4 years ago
tts-tutorial / interspeech2022
View on GitHub
☆162Sep 19, 2022Updated 3 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
View on GitHub
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15Nov 30, 2022Updated 3 years ago
atosystem / SpeechCLIP
View on GitHub
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022
☆120Nov 25, 2022Updated 3 years ago
revsic / torch-nansypp
View on GitHub
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
☆152Feb 11, 2023Updated 3 years ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
b04901014 / MQTTS
View on GitHub
☆260May 15, 2023Updated 3 years ago
revsic / torch-retriever-vc
View on GitHub
PyTorch implementation of Retriever: Learning Content-Style Representation
☆12Jan 27, 2023Updated 3 years ago
AlanBaade / SyllableLM
View on GitHub
Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models
☆63Jul 1, 2025Updated last year
keonlee9420 / DailyTalk
View on GitHub
Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023
☆260Jun 5, 2025Updated last year
yuan1615 / AdaVocoder
View on GitHub
Adaptive Vocoder for Custom Voice
☆61Sep 22, 2022Updated 3 years ago
WangHelin1997 / DuTa-VC
View on GitHub
Source code and demo for INTERPSEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion P…
☆38Dec 5, 2023Updated 2 years ago
maxrmorrison / torbi
View on GitHub
Viterbi decoding in PyTorch
☆42May 5, 2026Updated 2 months ago
yl4579 / AuxiliaryASR
View on GitHub
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
☆127Jun 16, 2022Updated 4 years ago
tts-tutorial / icassp2022
View on GitHub
☆64May 23, 2022Updated 4 years ago
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
shanguanma / Aligners
View on GitHub
HMM, CTC, RNN-Transducer, forward-backward algorithm
☆20Sep 5, 2023Updated 2 years ago
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated last year
qiujiali / lattice-rescore
View on GitHub
☆16Jun 13, 2022Updated 4 years ago
google-research / last
View on GitHub
A JAX library for building lattice-based speech transducer models
☆48Jul 2, 2026Updated 3 weeks ago
cnaigithub / SpeechDewarping
View on GitHub
Official implementation of "Unsupervised Pre-training for Data-Efficient Text-to-Speech on Low Resource Languages", ICASSP 2023
☆27Apr 27, 2023Updated 3 years ago
csukuangfj / optimized_transducer
View on GitHub
Memory efficient transducer loss computation
☆70Jun 10, 2022Updated 4 years ago
shang0712 / HierTTS
View on GitHub
☆47Apr 16, 2023Updated 3 years ago