ictnlp / NAST-S2xView external linksLinks
A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆76Oct 22, 2024Updated last year
Alternatives and similar repositories for NAST-S2x
Users that are interested in NAST-S2x are comparing it to the libraries listed below
Sorting:
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆25Jul 2, 2024Updated last year
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆110May 20, 2025Updated 8 months ago
- ☆52Jul 16, 2025Updated 6 months ago
- [InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆59Oct 23, 2024Updated last year
- ☆14Jun 16, 2023Updated 2 years ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Jan 26, 2024Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆111Apr 1, 2024Updated last year
- ☆38Apr 15, 2024Updated last year
- ☆19Mar 22, 2024Updated last year
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions☆83Oct 11, 2024Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".☆63Jul 22, 2024Updated last year
- DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.☆54Sep 25, 2023Updated 2 years ago
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆78Nov 1, 2024Updated last year
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models☆25Aug 11, 2024Updated last year
- Implementation of TTS model based on NVIDIA P-Flow TTS Paper☆77May 12, 2024Updated last year
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…☆83Jan 7, 2023Updated 3 years ago
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems☆39Nov 1, 2023Updated 2 years ago
- All generative model in one for better TTS model☆74Sep 8, 2024Updated last year
- ☆25Mar 6, 2024Updated last year
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆57Oct 31, 2023Updated 2 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆142Apr 27, 2024Updated last year
- ☆15Nov 11, 2024Updated last year
- ☆22Jul 30, 2025Updated 6 months ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- Simple tool for speech dataset augmentation for modeling various prosodies.☆14Jan 14, 2021Updated 5 years ago
- DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently☆11Jun 6, 2024Updated last year
- GPT for FACodec☆13Mar 25, 2024Updated last year
- ☆36Mar 14, 2025Updated 11 months ago
- The open source code for SimpleSpeech series☆145Oct 8, 2024Updated last year
- ☆68Jul 29, 2023Updated 2 years ago
- [Findings of NAACL 2024] Source code of paper CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers a…☆69Mar 31, 2024Updated last year
- BigVGAN with Neural Source-Filter☆56Sep 21, 2023Updated 2 years ago
- [ACL 2024] This is the Pytorch code for our paper "StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing"☆97Nov 14, 2024Updated last year