☆37Jun 30, 2022Updated 3 years ago
Alternatives and similar repositories for WavPrompt
Users that are interested in WavPrompt are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆17May 5, 2024Updated last year
- Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer☆93Jun 9, 2022Updated 3 years ago
- ☆16Jun 13, 2022Updated 3 years ago
- ☆12Aug 25, 2023Updated 2 years ago
- Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".☆54Jul 16, 2025Updated 8 months ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- ☆25Mar 12, 2022Updated 4 years ago
- Code repository for the paper "Improving End-to-End SLU performance with Prosodic Attention and Distillation" accepted at Interspeech 202…☆27May 17, 2023Updated 2 years ago
- ☆15Jul 4, 2024Updated last year
- ☆17Jul 22, 2024Updated last year
- STRODE: Stochastic Boundary Ordinary Differential Equation☆13Jul 20, 2021Updated 4 years ago
- TriNet: stabilizing self-supervised learning from complete or slow collapse on ASR.☆34Jun 1, 2023Updated 2 years ago
- SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model, Accepted to IEEE SLT 2022☆119Nov 25, 2022Updated 3 years ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- Code for the paper "JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis"☆14Nov 5, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- ☆64May 23, 2022Updated 3 years ago
- A 6-million Audio-Caption Paired Dataset Built with a LLMs and ALMs-based Automatic Pipeline☆198Dec 13, 2024Updated last year
- ☆10Sep 19, 2022Updated 3 years ago
- Repository for reproducing result in journal "Self-supervised learning for Speech Emotion Recognition"☆10Mar 15, 2023Updated 3 years ago
- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks☆17Aug 18, 2023Updated 2 years ago
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing.☆111Aug 4, 2023Updated 2 years ago
- An Audio Language model for Audio Tasks☆319Apr 19, 2024Updated last year
- ☆12Jun 10, 2021Updated 4 years ago
- A toolkit for researchers in the multimodal sound separation.☆16Oct 20, 2023Updated 2 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ…☆30Sep 20, 2025Updated 6 months ago
- ☆30Jun 12, 2025Updated 10 months ago
- Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"☆68Dec 13, 2021Updated 4 years ago
- Textless (ASR-transcript free) Spoken Question Answering. The official release of NMSQA dataset and the implementation of "DUAL: Textless…☆35Aug 10, 2023Updated 2 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆39Jan 6, 2024Updated 2 years ago
- A robust pitch tracker using synchro-squeezed fft and frequency domain autocorrelation☆36Jan 17, 2024Updated 2 years ago
- LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT☆74Sep 26, 2022Updated 3 years ago
- ☆15Apr 2, 2025Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Data and code related to the ICASSP submission "A comparison of methods for OOV-word recognition"☆17Nov 28, 2021Updated 4 years ago
- ☆37Mar 30, 2021Updated 5 years ago
- asr2k☆52Jun 2, 2024Updated last year
- experiments with RETURNN☆161Feb 7, 2026Updated 2 months ago
- This is the code for controllable EVC framework for seen and unseen emotion generation.☆45Nov 3, 2021Updated 4 years ago
- Audio Codec Speech processing Universal PERformance Benchmark☆301Apr 1, 2026Updated last week
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆119May 19, 2025Updated 10 months ago