RF5 / transfusion-asr
Transcribing Speech with Multinomial Diffusion, training code and models.
☆80 Sep 27, 2023 Updated 2 years ago
Alternatives and similar repositories for transfusion-asr
Users who are interested in transfusion-asr are comparing it to the libraries listed below.
- Training code and trained checkpoints for ASGAN. ☆62 Dec 27, 2023 Updated 2 years ago
- ☆17 Aug 27, 2025 Updated 5 months ago
- ☆55 Jan 13, 2023 Updated 3 years ago
- multilingual speech aligner ☆76 Nov 19, 2023 Updated 2 years ago
- Pypi installable TDNN and TDNN-F layers for PyTorch based acoustic model training ☆41 Dec 18, 2020 Updated 5 years ago
- [SpeechCom Journal] Learning and controlling the source-filter representation of speech with a variational autoencoder ☆45 Apr 18, 2023 Updated 2 years ago
- ICASSP 2023 Accepted ☆189 May 6, 2024 Updated last year
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆74 Oct 9, 2020 Updated 5 years ago
- NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis ☆150 Feb 11, 2023 Updated 3 years ago
- Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE ☆15 Nov 30, 2022 Updated 3 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation ☆12 Jan 27, 2023 Updated 3 years ago
- ☆25 Mar 12, 2022 Updated 3 years ago
- Standalone implementation of the CUDA-accelerated WFST Decoder available in Riva ☆91 Feb 18, 2025 Updated 11 months ago
- Text-To-Speech for NotebookLM ☆37 Jul 20, 2025 Updated 6 months ago
- In this repository, I try to combine k2 with speechbrain to decode well and quickly. ☆16 Jun 17, 2022 Updated 3 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 Jun 5, 2020 Updated 5 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 Feb 13, 2021 Updated 5 years ago
- ☆14 Aug 16, 2023 Updated 2 years ago
- Simple Kaldi recipe for forced alignment ☆11 Jul 16, 2023 Updated 2 years ago
- Reference-aware automatic speech evaluation toolkit ☆178 Dec 5, 2024 Updated last year
- A JAX library for building lattice-based speech transducer models ☆46 Jan 8, 2026 Updated last month
- Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023 ☆251 Jun 5, 2025 Updated 8 months ago
- Extract phoneme-level timestamps from speech audio. ☆116 Updated this week
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models ☆59 Jul 1, 2025 Updated 7 months ago
- ☆16 Jun 13, 2022 Updated 3 years ago
- ☆259 May 15, 2023 Updated 2 years ago
- ☆46 Apr 16, 2023 Updated 2 years ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code. ☆68 Jan 5, 2026 Updated last month
- Official code for Wav2Seq ☆97 Jul 19, 2022 Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆52 Dec 6, 2022 Updated 3 years ago
- Simple tool for speech dataset augmentation for modeling various prosodies. ☆14 Jan 14, 2021 Updated 5 years ago
- Voice conversion with just linear regression. ☆33 Sep 25, 2025 Updated 4 months ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆71 Dec 2, 2022 Updated 3 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec. ☆48 Sep 4, 2023 Updated 2 years ago
- Viterbi decoding in PyTorch ☆40 Sep 10, 2025 Updated 5 months ago
- BigVGAN with Neural Source-Filter ☆56 Sep 21, 2023 Updated 2 years ago
- Memory efficient transducer loss computation ☆69 Jun 10, 2022 Updated 3 years ago
- ☆163 Sep 19, 2022 Updated 3 years ago
- Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment) ☆124 Jun 16, 2022 Updated 3 years ago