A fast speech-to-speech and speech-to-text translation model that supports simultaneous decoding and offers a 28× speedup.
☆78 · Oct 22, 2024 · Updated last year
Alternatives and similar repositories for NAST-S2x
Users that are interested in NAST-S2x are comparing it to the libraries listed below.
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?". ☆26 · Jul 2, 2024 · Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization ☆110 · May 20, 2025 · Updated 11 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation ☆25 · Dec 12, 2024 · Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation” ☆15 · Jan 3, 2025 · Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation". ☆16 · Oct 25, 2023 · Updated 2 years ago
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation" ☆13 · Nov 3, 2022 · Updated 3 years ago
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar". ☆12 · Jan 4, 2024 · Updated 2 years ago
- DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently ☆11 · Jun 6, 2024 · Updated last year
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation". ☆63 · Jul 22, 2024 · Updated last year
- [InterSpeech'2024] FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency ☆62 · Oct 23, 2024 · Updated last year
- text to speech ☆10 · Mar 19, 2024 · Updated 2 years ago
- ☆14 · Jun 16, 2023 · Updated 2 years ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge. ☆32 · Jan 26, 2024 · Updated 2 years ago
- ☆54 · Jul 16, 2025 · Updated 9 months ago
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models ☆26 · Aug 11, 2024 · Updated last year
- GPT for FACodec ☆13 · Mar 25, 2024 · Updated 2 years ago
- ☆19 · Mar 22, 2024 · Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆112 · Apr 1, 2024 · Updated 2 years ago
- ☆40 · Apr 15, 2024 · Updated 2 years ago
- A Toolkit for a series of Young projects. ☆23 · Apr 30, 2021 · Updated 5 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆141 · Apr 27, 2024 · Updated 2 years ago
- Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation". ☆35 · Oct 25, 2023 · Updated 2 years ago
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation" ☆37 · Dec 6, 2023 · Updated 2 years ago
- DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis. ☆55 · Sep 25, 2023 · Updated 2 years ago
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions ☆86 · Oct 11, 2024 · Updated last year
- Implementation of TTS model based on NVIDIA P-Flow TTS Paper ☆77 · May 12, 2024 · Updated last year
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis. ☆1,264 · Jun 29, 2025 · Updated 10 months ago
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems ☆39 · Nov 1, 2023 · Updated 2 years ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆57 · Oct 31, 2023 · Updated 2 years ago
- The open source code for SimpleSpeech series ☆145 · Oct 8, 2024 · Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆18 · May 20, 2025 · Updated 11 months ago
- ☆13 · Aug 23, 2024 · Updated last year
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024) ☆79 · Nov 1, 2024 · Updated last year
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit… ☆83 · Jan 7, 2023 · Updated 3 years ago
- ☆25 · Mar 6, 2024 · Updated 2 years ago
- All generative model in one for better TTS model ☆74 · Sep 8, 2024 · Updated last year
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one ☆26 · Aug 5, 2024 · Updated last year
- ☆15 · Nov 11, 2024 · Updated last year
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (… ☆60 · Apr 4, 2024 · Updated 2 years ago