Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings)
☆28Jun 28, 2023Updated 2 years ago
Alternatives and similar repositories for DUB
Users that are interested in DUB are comparing it to the libraries listed below
Sorting:
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆37Aug 29, 2023Updated 2 years ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆85Jan 9, 2024Updated 2 years ago
- ☆13Aug 23, 2024Updated last year
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆152Sep 14, 2023Updated 2 years ago
- ☆38Oct 2, 2024Updated last year
- ☆31Oct 12, 2023Updated 2 years ago
- ☆12Jul 23, 2024Updated last year
- Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".☆36Oct 25, 2023Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation☆31Sep 6, 2024Updated last year
- code for paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)☆65May 25, 2022Updated 3 years ago
- PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation☆182Jun 20, 2024Updated last year
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning☆15Apr 29, 2024Updated last year
- SiLLM is a Simultaneous Machine Translation (SiMT) Framework. It utilizes a Large Language model as the translation model and employs a t…☆18Feb 22, 2024Updated 2 years ago
- Temporary anonymous version☆22Mar 20, 2024Updated 2 years ago
- ☆25Feb 12, 2023Updated 3 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆51Sep 20, 2025Updated 6 months ago
- ☆19Nov 6, 2023Updated 2 years ago
- Tracking the progress in end-to-end speech translation☆261Oct 25, 2023Updated 2 years ago
- Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset☆178Mar 25, 2019Updated 6 years ago
- SpeechGPT Series: Speech Large Language Models☆1,407Jul 22, 2024Updated last year
- Implementation of SoundStorm built upon SpeechTokenizer.☆116Nov 2, 2023Updated 2 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆18Jul 16, 2024Updated last year
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- A chinese singing voice dataset, professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- ☆25Jun 19, 2025Updated 9 months ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆15Jun 23, 2024Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion☆36Sep 9, 2025Updated 6 months ago
- ☆14Jun 16, 2023Updated 2 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆142Apr 27, 2024Updated last year
- Implementation of Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt (NAACL'24).☆121Jan 26, 2025Updated last year
- End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions☆94Nov 6, 2023Updated 2 years ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"☆37Dec 6, 2023Updated 2 years ago
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago