Acoustic Neighbor Embeddings
☆28 · Jul 13, 2025 · Updated 7 months ago
Alternatives and similar repositories for ml-acn-embed
Users who are interested in ml-acn-embed are comparing it to the repositories listed below
- ☆13 · Mar 11, 2025 · Updated 11 months ago
- Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open … ☆22 · Feb 7, 2026 · Updated 3 weeks ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation ☆24 · Nov 12, 2025 · Updated 3 months ago
- A wrapper for Audeering's wav2vec-based dimensional speech emotion recognition ☆21 · Aug 9, 2023 · Updated 2 years ago
- Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming ☆15 · Aug 20, 2024 · Updated last year
- KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a… ☆12 · Mar 24, 2023 · Updated 2 years ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 8 months ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech … ☆28 · Nov 7, 2025 · Updated 3 months ago
- ☆11 · Nov 7, 2024 · Updated last year
- [ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder ☆12 · Mar 11, 2025 · Updated 11 months ago
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Mar 30, 2021 · Updated 4 years ago
- ☆11 · Nov 5, 2021 · Updated 4 years ago
- ☆10 · Sep 19, 2022 · Updated 3 years ago
- ☆15 · Nov 11, 2024 · Updated last year
- ☆14 · Nov 26, 2024 · Updated last year
- Transformer based ASR Engine. ☆13 · Aug 23, 2021 · Updated 4 years ago
- ☆14 · Jun 16, 2023 · Updated 2 years ago
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆18 · Oct 2, 2024 · Updated last year
- ☆15 · Feb 6, 2026 · Updated 3 weeks ago
- Train a fiwGAN or ciwGAN model using your own training data ☆14 · Oct 13, 2022 · Updated 3 years ago
- An open-source Kazakh Emotional Text-to-Speech Dataset ☆35 · Aug 1, 2025 · Updated 7 months ago
- This is the repository for the work "BridgeVoC: Revitalizing Neural Vocoder from a Restoration Perspective". ☆64 · Nov 5, 2025 · Updated 3 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Apr 13, 2022 · Updated 3 years ago
- ☆30 · Jan 22, 2026 · Updated last month
- Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices" ☆21 · Jun 7, 2025 · Updated 8 months ago
- Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa). ☆15 · Jun 30, 2023 · Updated 2 years ago
- Read articles, explore effectiveness metrics for speech enhancement methodologies. Seamlessly integrate code implementations for better u… ☆26 · Apr 19, 2024 · Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 · May 16, 2025 · Updated 9 months ago
- [ICASSP 2024] Official code for FreGrad ☆35 · May 13, 2024 · Updated last year
- A vector DB so easy, even your grandparents can build a RAG system 😁 ☆18 · Jul 18, 2025 · Updated 7 months ago
- ☆23 · Jan 29, 2026 · Updated last month
- ESLTTS dataset ☆16 · Feb 6, 2025 · Updated last year
- 🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨ ☆135 · Aug 10, 2025 · Updated 6 months ago
- [Interspeech 2025] Official implementation of "Training-Free Voice Conversion with Factorized Optimal Transport" ☆43 · Sep 24, 2025 · Updated 5 months ago
- Python package of MP-SENet from Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement. ☆21 · Nov 1, 2024 · Updated last year
- This repository contains the training code from paper "SpidR Learning Fast and Stable Linguistic Units for Spoken Language Models Without… ☆50 · Feb 4, 2026 · Updated last month
- [Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement ☆43 · Jul 25, 2025 · Updated 7 months ago
- This is the code of the ICASSP 2020 paper "Joint phoneme alignment and text-informed speech separation on highly corrupted speech" ☆15 · Apr 8, 2024 · Updated last year
- Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features. ☆39 · Mar 4, 2024 · Updated 2 years ago