theodorblackbird / lina-speech
lina-speech: linear attention based text-to-speech
☆134 · Updated this week
Related projects
Alternatives and complementary repositories for lina-speech
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 · Updated last month
- VALL-E 2 reproduction ☆83 · Updated 3 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆136 · Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆44 · Updated 2 months ago
- Implementation of SoundStorm built upon SpeechTokenizer. ☆103 · Updated last year
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆179 · Updated 2 months ago
- Train the next generation of TTS systems. ☆160 · Updated last month
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ☆146 · Updated 2 months ago
- The official Implementation of PeriodWave and PeriodWave-Turbo ☆128 · Updated 2 months ago
- All generative model in one for better TTS model ☆66 · Updated 2 months ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆121 · Updated 8 months ago
- This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the … ☆31 · Updated last week
- Official implementation of Vec-Tok Speech ☆93 · Updated last year
- [ICASSP 2024] StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations ☆135 · Updated 6 months ago
- Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E ☆135 · Updated 2 weeks ago
- DEX-TTS: Diffusion-based EXpressive TTS with Style Modeling on Time Variability ☆89 · Updated last week
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice ☆128 · Updated 3 weeks ago
- Real-time Speech-Text Foundation Model Toolkit (wip) ☆119 · Updated 3 weeks ago
- Putting flows on top of neural transducers for better TTS ☆64 · Updated last week
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆89 · Updated last month
- VoiceBox neural network implementation ☆96 · Updated 3 months ago
- PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities. ☆189 · Updated last month
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆114 · Updated last week
- PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023. ☆155 · Updated 7 months ago
- SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer. ☆62 · Updated last week
- Audiogen Codec ☆126 · Updated 4 months ago