☆37Jun 30, 2022Updated 3 years ago
Alternatives and similar repositories for WavPrompt
Users that are interested in WavPrompt are comparing it to the libraries listed below
Sorting:
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆91Jun 9, 2022Updated 3 years ago
- ☆17May 5, 2024Updated last year
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 2 years ago
- ☆16Jun 13, 2022Updated 3 years ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR.☆26Jun 1, 2023Updated 2 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- ☆10Sep 19, 2022Updated 3 years ago
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis"☆14Nov 5, 2024Updated last year
- ☆25Mar 12, 2022Updated 3 years ago
- ☆15Apr 2, 2025Updated 11 months ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- ☆12Aug 25, 2023Updated 2 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- ☆12Jun 10, 2021Updated 4 years ago
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.☆111Aug 4, 2023Updated 2 years ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline☆196Dec 13, 2024Updated last year
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".☆54Jul 16, 2025Updated 7 months ago
- The accompanying code for "Exploring the limits of decoder-only models trained on public speech recognition corpora" (Ankit Gupta, George…☆20Oct 11, 2024Updated last year
- ☆17Jul 22, 2024Updated last year
- ☆15Jul 4, 2024Updated last year
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆36Jan 17, 2024Updated 2 years ago
- ☆30Jun 12, 2025Updated 8 months ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Aug 10, 2023Updated 2 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022☆119Nov 25, 2022Updated 3 years ago
- ☆37Mar 30, 2021Updated 4 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"☆17Nov 28, 2021Updated 4 years ago
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.☆39Mar 4, 2024Updated last year
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆38Jan 6, 2024Updated 2 years ago
- Official Pytorch implementation of PULSE: Positive–Unlabelled Learning for audio Signal Enhancement (Best Paper Award at ICASSP 2023)☆43Jul 24, 2023Updated 2 years ago
- PodcastMix A dataset for separating music and speech in podcasts.☆44Aug 20, 2024Updated last year
- The VoxTube dataset official repository☆71Feb 14, 2024Updated 2 years ago
- ☆64May 23, 2022Updated 3 years ago
- An Audio Language model for Audio Tasks☆319Apr 19, 2024Updated last year
- Word Discovery in Visually Grounded, Self-Supervised Speech Models☆26Dec 4, 2023Updated 2 years ago
- ☆24Feb 28, 2023Updated 3 years ago
- experiments with RETURNN☆161Feb 7, 2026Updated 3 weeks ago