ictnlp / MonoAttn-Transducer

Code for ICML25 Paper "Overcoming Non-monotonicity in Transducer-based Streaming Generation"

☆11

Alternatives and similar repositories for MonoAttn-Transducer

Users who are interested in MonoAttn-Transducer are comparing it to the repositories listed below.

D-Keqi / LS-Transducer-SST
☆11 Updated last year
ictnlp / BT4ST
Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".
☆12 Updated last year
openaudiolab / LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆25 Updated 11 months ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆14 Updated 7 months ago
leduckhai / MultiMed-ST
MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation
☆13 Updated 3 months ago
reppy4620 / x-vits
☆13 Updated 8 months ago
BUTSpeechFIT / OOV-recovery-in-hybrid-ASR-system
☆9 Updated 5 years ago
nethermanpro / ComSL
☆11 Updated last year
xuchennlp / S2T
The project for speech translation
☆11 Updated last year
0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37 Updated last year
mcf330 / efts2code
source code of EfficientTTS 2
☆14 Updated last year
iamanigeeit / present
☆13 Updated 10 months ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 Updated last year
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 Updated 9 months ago
TTS-Research / PEL-TTS
☆14 Updated last year
cadia-lvl / ss_asr
A semi-supervised sequence-to-sequence ASR
☆10 Updated 2 years ago
JSALT2022CodeSwitchingASR / generating-code-switched-audio
☆12 Updated 5 months ago
NKU-HLT / KNN-CTC
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆38 Updated last year
ttslr / MonTTS
☆13 Updated 3 years ago
Berkeley-Speech-Group / DysfluentWFST
DysfluentWFST
☆12 Updated last month
ex3ndr / supervoice-vocoder
Production-ready vocoder using BigVSAN
☆11 Updated last year
csalt-research / accented-codebooks-asr
☆18 Updated 10 months ago
atosystem / SSL_Interface
Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024
☆15 Updated 7 months ago
ag1988 / mel-asr
The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…
☆19 Updated 9 months ago
Sreyan88 / LipGER
Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆17 Updated last year
utter-project / mHuBERT-147-scripts
Collection of scripts from mHuBERT-147.
☆29 Updated 7 months ago
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆16 Updated 3 months ago
mushanshanshan / ESLTTS
ESLTTS dataset
☆16 Updated 5 months ago
rithiksachdev / PostASR-Correction-SLT2024
☆14 Updated 11 months ago
ShovalMessica / NAST
Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11…
☆46 Updated last year