apple/ml-acn-embed

Acoustic Neighbor Embeddings

☆33

Alternatives and similar repositories for ml-acn-embed

Users who are interested in ml-acn-embed are comparing it to the libraries listed below.

xjuspeech / YOLOPitch
☆10 · Jun 11, 2024 · Updated 2 years ago
JusperLee / Gull-Codec-Training
☆12 · Mar 11, 2025 · Updated last year
meaningTeam / tidy-tunes
Tidy Tunes is an easy-to-use pipeline for mining high-quality audio data for speech generation models. To do so, it chains multiple open …
☆23 · May 19, 2026 · Updated 2 months ago
mushanshanshan / ESLTTS
ESLTTS dataset
☆16 · Feb 6, 2025 · Updated last year
TehreemFarooqi / Preparing-a-speech-recognition-dataset-using-YouTube-videos
Using YouTube to prepare a speech recognition dataset for any language
☆10 · Mar 30, 2021 · Updated 5 years ago
fanlu / wenet
Transformer based ASR Engine.
☆13 · Aug 23, 2021 · Updated 4 years ago
ETH-DISCO / audio-atlas
☆15 · Feb 6, 2026 · Updated 5 months ago
llm-lab-org / CLASP
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
☆13 · Jun 27, 2025 · Updated last year
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 · Nov 5, 2021 · Updated 4 years ago
bagustris / w2v2-vad
A wrapper for Audeering's wav2vec-based dimensional speech emotion recognition
☆22 · Aug 9, 2023 · Updated 2 years ago
bshall / dusted
DUSTED: Spoken-Term Discovery using Discrete Speech Units
☆17 · Oct 2, 2024 · Updated last year
rishiraj / gam
This is the official PyTorch implementation for the paper "Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence …
☆16 · Sep 3, 2025 · Updated 10 months ago
xiaoxue1117 / speech-mamba-public
☆15 · Nov 26, 2024 · Updated last year
Chung-I / youtube-asr-crawler
☆10 · Sep 19, 2022 · Updated 3 years ago
pengzhendong / asr-decoder
CTC decoder with hotwords for ASR.
☆38 · Jun 15, 2026 · Updated last month
miccio-dk / NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
☆16 · Apr 13, 2022 · Updated 4 years ago
freds0 / kabooks
KABooks is a tool to automate the process of creating datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. Using a…
☆13 · Mar 24, 2023 · Updated 3 years ago
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
johnmartinsson / differentiable-mel-spectrogram
The official implementation of DMEL, the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer …
☆24 · Dec 21, 2024 · Updated last year
WingSingFung / TISDiSS
Official implementation of TISDiSS, a scalable framework for discriminative source separation.
☆16 · Oct 19, 2025 · Updated 9 months ago
samsad35 / code-ancogen
[ICASSP 2025] AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
☆14 · Mar 11, 2025 · Updated last year
JosefAlbers / e2tts-mlx
Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX
☆29 · Oct 15, 2024 · Updated last year
kaistmm / fregrad
[ICASSP 2024] Official code for FreGrad
☆35 · May 13, 2024 · Updated 2 years ago
altaidevorg / letsearch
A vector DB so easy, even your grandparents can build a RAG system 😁
☆25 · Apr 1, 2026 · Updated 3 months ago
NUS-HPC-AI-Lab / MoST
MoST: Mixing Speech and Text with Modality-Aware Mixture of Experts
☆33 · Jan 15, 2026 · Updated 6 months ago
bookbot-hive / k2-indonesian-asr
Indonesian speech/phoneme recognizer powered by Kaldi 2.0 (lhotse, icefall, sherpa).
☆16 · Jun 30, 2023 · Updated 3 years ago
lavendery / UUG
☆21 · Sep 14, 2025 · Updated 10 months ago
IS2AI / KazEmoTTS
An open-source Kazakh Emotional Text-to-Speech Dataset
☆36 · Aug 1, 2025 · Updated 11 months ago
kamperh / vqwordseg
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
☆39 · May 5, 2026 · Updated 2 months ago
SonyResearch / VRVQ
Variable Bitrate Residual Vector Quantization for Audio Coding
☆54 · May 1, 2025 · Updated last year
hhguo / SoCodec
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
☆92 · Dec 20, 2024 · Updated last year
google / airdialogue_model
☆17 · Jul 16, 2020 · Updated 6 years ago
gbegus / DeepPhonologyTool
Train a fiwGAN or ciwGAN model using your own training data
☆14 · Oct 13, 2022 · Updated 3 years ago
felixperfler / Stable-Hybrid-Auditory-Filterbanks
[Interspeech 2024] Hold Me Tight: Stable Encoder-Decoder Design for Speech Enhancement
☆43 · Jul 25, 2025 · Updated last year
YoshikiMas / madeon-asr
[SLT'24] Mamba-based Decoder-Only Approach for Speech Recognition
☆19 · Dec 1, 2024 · Updated last year
ZhaoF-i / ASTWS-AEC
Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation
☆31 · Nov 12, 2025 · Updated 8 months ago
Alittleegg / Eureka-Audio
Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasonin…
☆40 · Apr 11, 2026 · Updated 3 months ago
ZehuaKcrissLi / GTR-Voice
☆16 · Nov 11, 2024 · Updated last year