☆37Jun 30, 2022Updated 3 years ago
Alternatives and similar repositories for WavPrompt
Users that are interested in WavPrompt are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆17May 5, 2024Updated last year
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆93Jun 9, 2022Updated 3 years ago
- ☆16Jun 13, 2022Updated 3 years ago
- ☆12Aug 25, 2023Updated 2 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".☆54Jul 16, 2025Updated 9 months ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- ☆25Mar 12, 2022Updated 4 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- ☆15Jul 4, 2024Updated last year
- ☆18Jul 22, 2024Updated last year
- STRODE: Stochastic Boundary Ordinary Differential Equation☆13Jul 20, 2021Updated 4 years ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR.☆34Jun 1, 2023Updated 2 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022☆119Nov 25, 2022Updated 3 years ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis"☆14Nov 5, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆64May 23, 2022Updated 3 years ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline☆201Dec 13, 2024Updated last year
- ☆10Sep 19, 2022Updated 3 years ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.☆111Aug 4, 2023Updated 2 years ago
- An Audio Language model for Audio Tasks☆320Apr 19, 2024Updated 2 years ago
- ☆12Jun 10, 2021Updated 4 years ago
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆31Jun 12, 2025Updated 10 months ago
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆31Sep 20, 2025Updated 7 months ago
- Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"☆68Dec 13, 2021Updated 4 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Aug 10, 2023Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆40Jan 6, 2024Updated 2 years ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆38Jan 17, 2024Updated 2 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- ☆15Apr 2, 2025Updated last year
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"☆17Nov 28, 2021Updated 4 years ago
- ☆37Mar 30, 2021Updated 5 years ago
- asr2k☆52Jun 2, 2024Updated last year
- experiments with RETURNN☆162Feb 7, 2026Updated 2 months ago
- This is the code for controllable EVC framework for seen and unseen emotion generation.☆45Nov 3, 2021Updated 4 years ago
- Audio Codec Speech processing Universal PERformance Benchmark☆301Apr 1, 2026Updated last month
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆120May 19, 2025Updated 11 months ago