A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.
☆77Oct 22, 2024Updated last year
Alternatives and similar repositories for NAST-S2x
Users that are interested in NAST-S2x are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".☆26Jul 2, 2024Updated last year
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization☆110May 20, 2025Updated 10 months ago
- Pushing the Limits of Zero-shot End-to-End Speech Translation☆26Dec 12, 2024Updated last year
- The code for AAAI 2025 “Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation”☆15Jan 3, 2025Updated last year
- Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".☆17Oct 25, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"☆13Nov 3, 2022Updated 3 years ago
- Code for NeurIPS 2023 paper "Non-autoregressive Machine Translation with Probabilistic Context-free Grammar".☆12Jan 4, 2024Updated 2 years ago
- DST is a Decoder-only simultaneous machine translation model, which can conduct policy decision and translation concurrently☆11Jun 6, 2024Updated last year
- Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".☆63Jul 22, 2024Updated last year
- [InterSpeech'2024] FluentEditor:Text-based Speech Editing by Considering Acoustic and Prosody Consistency☆61Oct 23, 2024Updated last year
- text to speech☆10Mar 19, 2024Updated 2 years ago
- ☆14Jun 16, 2023Updated 2 years ago
- This is the official train-dev-test release of the Interspeech2024 Discrete Speech Representation Challenge.☆32Jan 26, 2024Updated 2 years ago
- ☆55Jul 16, 2025Updated 8 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models☆27Aug 11, 2024Updated last year
- GPT for FACodec☆13Mar 25, 2024Updated 2 years ago
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion☆111Apr 1, 2024Updated last year
- ☆19Mar 22, 2024Updated 2 years ago
- ☆39Apr 15, 2024Updated last year
- A Toolkit for a series of Young projects.☆23Apr 30, 2021Updated 4 years ago
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations☆142Apr 27, 2024Updated last year
- Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".☆36Oct 25, 2023Updated 2 years ago
- Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"☆37Dec 6, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.☆54Sep 25, 2023Updated 2 years ago
- PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions☆85Oct 11, 2024Updated last year
- Implementation of TTS model based on NVIDIA P-Flow TTS Paper☆77May 12, 2024Updated last year
- StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.☆1,254Jun 29, 2025Updated 9 months ago
- SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems☆39Nov 1, 2023Updated 2 years ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model☆57Oct 31, 2023Updated 2 years ago
- The open source code for SimpleSpeech series☆145Oct 8, 2024Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨☆17May 20, 2025Updated 10 months ago
- ☆13Aug 23, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Generative Expressive Conversational Speech Synthesis (Accepted by MM'2024)☆78Nov 1, 2024Updated last year
- ☆25Mar 6, 2024Updated 2 years ago
- Official implementation for the paper: A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Unit…☆83Jan 7, 2023Updated 3 years ago
- All generative model in one for better TTS model☆74Sep 8, 2024Updated last year
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- ☆15Nov 11, 2024Updated last year
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year