This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.
☆111Aug 4, 2023Updated 2 years ago
Alternatives and similar repositories for Speech-Prompts-Adapters
Users that are interested in Speech-Prompts-Adapters are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022☆38Nov 30, 2023Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!☆42Sep 11, 2023Updated 2 years ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆81Oct 19, 2023Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆64May 30, 2023Updated 2 years ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆46Feb 16, 2023Updated 3 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆150Jan 16, 2024Updated 2 years ago
- Survey on speech generation work.☆21Nov 26, 2023Updated 2 years ago
- Parallel waveform generation with DiffusionGAN☆17Mar 26, 2022Updated 4 years ago
- Open Source Speech/Text Data on AI☆19Sep 13, 2022Updated 3 years ago
- Official implementation of MelHuBERT☆69Feb 21, 2026Updated last month
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction☆269May 19, 2024Updated last year
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- Keep track of big models in audio domain, including speech, singing, music etc.☆506Sep 26, 2024Updated last year
- Audio Codec Speech processing Universal PERformance Benchmark☆300Jan 8, 2026Updated 2 months ago
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆22May 26, 2025Updated 10 months ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Apr 10, 2025Updated 11 months ago
- Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)☆42Oct 13, 2023Updated 2 years ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆25Sep 26, 2023Updated 2 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- ☆64May 23, 2022Updated 3 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Awesome speech/audio LLMs, representation learning, and codec models☆1,212Aug 13, 2025Updated 7 months ago
- Onset-and-Offset-Aware Sound Event Detection☆21Feb 10, 2025Updated last year
- Speech Parameter Estimation Using Differentiable Speech Synthesizer☆44May 9, 2023Updated 2 years ago
- ☆37Jun 30, 2022Updated 3 years ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)☆119Feb 7, 2024Updated 2 years ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",…☆80May 29, 2023Updated 2 years ago
- ☆19Mar 22, 2024Updated 2 years ago
- ☆19Sep 10, 2024Updated last year
- Collect Voice Conversion researches☆96Updated this week
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"☆117Jan 26, 2024Updated 2 years ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec☆116Jun 23, 2025Updated 9 months ago
- Transcribing Speech with Multinomial Diffusion, training code and models.☆80Sep 27, 2023Updated 2 years ago
- An Open-source Streaming High-fidelity Neural Audio Codec☆502Mar 4, 2025Updated last year
- ☆30Jun 12, 2025Updated 9 months ago
- A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project g…☆147Jun 6, 2022Updated 3 years ago
- BERT and LSTM baseline models of the ZeroSpeech Challenge 2021☆60Oct 19, 2022Updated 3 years ago