ga642381 / Speech-Prompts-AdaptersView external linksLinks
This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.
☆112Aug 4, 2023Updated 2 years ago
Alternatives and similar repositories for Speech-Prompts-Adapters
Users that are interested in Speech-Prompts-Adapters are comparing it to the libraries listed below
Sorting:
- A curated list of awesome adversarial reprogramming and input prompting methods for neural networks since 2022☆38Nov 30, 2023Updated 2 years ago
- 《SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts》☆77Jun 9, 2023Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS☆64May 30, 2023Updated 2 years ago
- (R&D) Text to speech using phonemes as inputs and audio codec codes as outputs. Loosely based on MegaByte, VALL-E and Encodec.☆48Sep 4, 2023Updated 2 years ago
- 《SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks》Speech processing with prompting paradigm☆82Oct 19, 2023Updated 2 years ago
- E2E TTS using Conditional Flow Matching (Experimental*)☆71Nov 10, 2023Updated 2 years ago
- Parallel waveform generation with DiffusionGAN☆17Mar 26, 2022Updated 3 years ago
- Open Source Speech/Text Data on AI☆19Sep 13, 2022Updated 3 years ago
- Speech Parameter Estimation Using Differentiable Speech Synthesizer☆44May 9, 2023Updated 2 years ago
- Official implementation of MelHuBERT☆68Oct 26, 2024Updated last year
- ☆19Mar 22, 2024Updated last year
- Onset-and-Offset-Aware Sound Event Detection☆20Feb 10, 2025Updated last year
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!☆43Sep 11, 2023Updated 2 years ago
- Official implementation of "Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis",…☆80May 29, 2023Updated 2 years ago
- Collect Voice Conversion researches☆96Updated this week
- Speaker-aware CTC (SACTC) for multi-talker overlapped speech recognition.☆21May 26, 2025Updated 8 months ago
- **ICASSP 2022** 《Toward Degradation-Robust Voice Conversion》Using speech enhancement and end-to-end denoising training to improve degrada…☆24Sep 27, 2022Updated 3 years ago
- Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion (Interspeech 2022)☆119Feb 7, 2024Updated 2 years ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- Torch implementation of Whisper-guided DDPM based Voice Conversion☆49Mar 7, 2023Updated 2 years ago
- ☆46Apr 16, 2023Updated 2 years ago
- INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"☆117Jan 26, 2024Updated 2 years ago
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec…☆102Apr 10, 2025Updated 10 months ago
- Promting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆150Jan 16, 2024Updated 2 years ago
- ☆46Nov 2, 2023Updated 2 years ago
- AudioCodec-Hub is a Python library for encoding and decoding audio data, supporting various neural audio codec models☆25Sep 26, 2023Updated 2 years ago
- ☆25Mar 12, 2022Updated 3 years ago
- Keep track of big models in audio domain, including speech, singing, music etc.☆506Sep 26, 2024Updated last year
- ☆64May 23, 2022Updated 3 years ago
- Implementation for paper "Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE"☆44Apr 10, 2023Updated 2 years ago
- ☆37Jun 30, 2022Updated 3 years ago
- ☆31Jul 13, 2023Updated 2 years ago
- High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec☆115Jun 23, 2025Updated 7 months ago
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆90Jun 9, 2022Updated 3 years ago
- Implementation of DCComix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer☆75Aug 21, 2023Updated 2 years ago
- Unofficial pytorch reproduction for the paper "Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction" (…☆61Apr 4, 2024Updated last year
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction☆267May 19, 2024Updated last year
- Survey on speech generation work.☆21Nov 26, 2023Updated 2 years ago
- Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"☆68Dec 13, 2021Updated 4 years ago