ictnlp / BT4ST

Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".

☆12

Alternatives and similar repositories for BT4ST

Users who are interested in BT4ST are comparing it to the repositories listed below.

openaudiolab / LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆25 Updated last year
0nutation / DUB
Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings)
☆28 Updated 2 years ago
YuanGongND / llm_speech_emotion_challenge
☆22 Updated last year
TTS-Research / PEL-TTS
☆14 Updated 2 years ago
nethermanpro / ComSL
☆11 Updated 2 years ago
ictnlp / ComSpeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
☆25 Updated last year
cwang621 / blsp-emo
BLSP-Emo: Towards Empathetic Large Speech-Language Models
☆55 Updated last year
declare-lab / HyperTTS
☆38 Updated last year
FreedomIntelligence / S2S-Arena
☆17 Updated 6 months ago
kjw11 / Speaker-Aware-CTC
Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.
☆21 Updated 6 months ago
NKU-HLT / DIFFA
[AAAI 2026] DIFFA: Large Language Diffusion Models Can Listen and Understand
☆40 Updated last month
DanielLin94144 / StyleTalk
Official release of StyleTalk dataset.
☆70 Updated last year
0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37 Updated 2 years ago
NKU-HLT / KNN-CTC
[ICASSP 2024] KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
☆41 Updated last year
youngsheen / GPST
[ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer
☆66 Updated last year
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 Updated last year
auspicious3000 / ProsodyLM
ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
☆31 Updated last month
dreamtheater123 / VoxEval
GitHub repository for ACL 2025 paper: VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
☆24 Updated 6 months ago
ashi-ta / speechGLUE
SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech.
☆13 Updated 2 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆26 Updated last year
zeyuxie29 / AudioTime
☆36 Updated last year
shinhyeokoh / rwen
☆14 Updated 2 years ago
utter-project / mHuBERT-147-scripts
Collection of scripts from mHuBERT-147.
☆32 Updated last year
ictnlp / DASpeech
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
☆63 Updated last year
Rongjiehuang / awesome-speech-to-speech-translation
List of direct speech-to-speech translation papers.
☆38 Updated 2 years ago
cpii-cai / PunCantonese
A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts
☆15 Updated last year
yxduir / LLM-SRT
☆23 Updated 2 weeks ago
kuan2jiu99 / Awesome-Speech-Generation
Survey on speech generation work.
☆21 Updated 2 years ago
W-Wu / ERC-SLT22
Code for "Distribution-based Emotion Recognition in Conversation"
☆19 Updated 2 years ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17 Updated 6 months ago