Survey on speech generation work.
☆21 · Nov 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for Awesome-Speech-Generation
Users who are interested in Awesome-Speech-Generation are comparing it to the libraries listed below.
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" ☆37 · Aug 29, 2023 · Updated 2 years ago
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022 ☆38 · Nov 30, 2023 · Updated 2 years ago
- Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM ☆17 · Nov 7, 2024 · Updated last year
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing. ☆111 · Aug 4, 2023 · Updated 2 years ago
- Temporary anonymous version ☆22 · Mar 20, 2024 · Updated 2 years ago
- The open source code for LLM-Codec ☆145 · Aug 18, 2024 · Updated last year
- ☆16 · Apr 4, 2022 · Updated 3 years ago
- ☆30 · Jul 21, 2022 · Updated 3 years ago
- Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge ☆21 · Jul 25, 2022 · Updated 3 years ago
- ☆39 · Apr 15, 2024 · Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee… ☆82 · Jun 7, 2024 · Updated last year
- Reference-aware automatic speech evaluation toolkit ☆180 · Dec 5, 2024 · Updated last year
- ☆13 · Sep 25, 2024 · Updated last year
- Audio Codec Speech processing Universal PERformance Benchmark ☆300 · Jan 8, 2026 · Updated 2 months ago
- A deepfake audio dataset for detecting fake speech from codec-based speech synthesis systems, Interspeech 2024 ☆20 · Jul 27, 2024 · Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆50 · Apr 7, 2025 · Updated 11 months ago
- ☆18 · Sep 22, 2025 · Updated 6 months ago
- ☆24 · Jun 13, 2022 · Updated 3 years ago
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement" ☆14 · Feb 13, 2022 · Updated 4 years ago
- Official repository for NAST: Noise Aware Speech Tokenization for Speech Language Models (Interspeech 2024) https://arxiv.org/abs/2406.11… ☆46 · Jul 2, 2024 · Updated last year
- S3PRL for Speech Emotion Recognition (see s3prl > downstream) ☆15 · Feb 28, 2026 · Updated last month
- Source code of APNet2, a vocoder ☆58 · Nov 23, 2023 · Updated 2 years ago
- BLSP-Emo: Towards Empathetic Large Speech-Language Models ☆60 · Jun 7, 2024 · Updated last year
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und… ☆42 · Mar 12, 2023 · Updated 3 years ago
- text to speech ☆10 · Mar 19, 2024 · Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*) ☆71 · Nov 10, 2023 · Updated 2 years ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Jun 2, 2023 · Updated 2 years ago
- Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with a better semantic in the latent space. ☆248 · Mar 7, 2025 · Updated last year
- ☆15 · Nov 10, 2025 · Updated 4 months ago
- Keep track of big models in audio domain, including speech, singing, music etc. ☆506 · Sep 26, 2024 · Updated last year
- Awesome speech/audio LLMs, representation learning, and codec models ☆1,215 · Aug 13, 2025 · Updated 7 months ago
- proof of concept conversation orchestrator with a speech-language model ☆20 · Oct 19, 2024 · Updated last year
- ☆13 · Sep 12, 2024 · Updated last year
- Official implementation of MelHuBERT ☆69 · Feb 21, 2026 · Updated last month
- [Official Implementation] Acoustic Autoregressive Modeling 🔥 ☆75 · Aug 24, 2024 · Updated last year
- Official Implementation of LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models. ☆33 · Nov 9, 2025 · Updated 4 months ago
- Project of Singing Voice Conversion. ☆16 · Oct 27, 2023 · Updated 2 years ago
- CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding ☆22 · Dec 17, 2025 · Updated 3 months ago
- This repository includes the code to reproduce our paper "End-to-end anti-spoofing with RawNet2" (https://arxiv.org/abs/2011.01108) publi… ☆67 · Aug 8, 2023 · Updated 2 years ago