Code and pretrained models for "DUB: Discrete Unit Back-translation for Speech Translation" (ACL 2023 Findings)
☆28Jun 28, 2023Updated 2 years ago
Alternatives and similar repositories for DUB
Users that are interested in DUB are comparing it to the libraries listed below
Sorting:
- ☆13Aug 23, 2024Updated last year
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆37Aug 29, 2023Updated 2 years ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- ☆31Oct 12, 2023Updated 2 years ago
- Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".☆36Oct 25, 2023Updated 2 years ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆85Jan 9, 2024Updated 2 years ago
- code for paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)☆65May 25, 2022Updated 3 years ago
- SiLLM is a Simultaneous Machine Translation (SiMT) Framework. It utilizes a Large Language model as the translation model and employs a t…☆18Feb 22, 2024Updated 2 years ago
- ☆38Oct 2, 2024Updated last year
- Temporary anonymous version☆22Mar 20, 2024Updated last year
- PyTorch implementation of Retriever: Learning Content-Style Representation☆12Jan 27, 2023Updated 3 years ago
- PyTorch Implementation of TranSpeech (ICLR'23): Textless NAR Speech-to-Speech Translation with Bilateral Perturbation☆178Jun 20, 2024Updated last year
- A chinese singing voice dataset, professional male singer, 105 songs, 132 minutes☆11Oct 19, 2023Updated 2 years ago
- ☆14Jun 16, 2023Updated 2 years ago
- PorSimplesSent - A Portuguese corpus of aligned sentences pairs to investigate sentence readability assessment☆13Jan 15, 2020Updated 6 years ago
- Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…☆25Updated this week
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆152Sep 14, 2023Updated 2 years ago
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis☆51Sep 20, 2025Updated 5 months ago
- ☆12Jul 23, 2024Updated last year
- ☆25Feb 12, 2023Updated 3 years ago
- Simple tool for speech dataset augmentation for modeling various prosodies.☆14Jan 14, 2021Updated 5 years ago
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning☆16Jun 23, 2024Updated last year
- Tracking the progress in end-to-end speech translation☆261Oct 25, 2023Updated 2 years ago
- Word alignments generated by the Montreal Forced Aligner for the Librispeech dataset☆178Mar 25, 2019Updated 6 years ago
- [TASLP 2024] Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation☆31Sep 6, 2024Updated last year
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge☆12Jun 11, 2024Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆57Oct 31, 2023Updated 2 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆62Nov 1, 2024Updated last year
- End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions☆94Nov 6, 2023Updated 2 years ago
- Code for InterSpeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition☆18Jul 16, 2024Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- [AAAI 2024] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning☆15Apr 29, 2024Updated last year
- Code for "Distribution-based Emotion Recognition in Conversation"☆19Feb 6, 2023Updated 3 years ago
- Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…☆16Aug 9, 2021Updated 4 years ago
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆17May 20, 2025Updated 9 months ago