A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories.
☆29Mar 14, 2025Updated last year
Alternatives and similar repositories for allophant
Users that are interested in allophant are comparing it to the libraries listed below.
- ☆10Dec 22, 2023Updated 2 years ago
- Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three differen…☆261May 9, 2022Updated 3 years ago
- phone inventory library☆17May 15, 2023Updated 2 years ago
- Universal multilingual automatic speech transcription into IPA☆77Feb 28, 2025Updated last year
- REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR☆14Dec 11, 2024Updated last year
- Estimating the Age, Height, and Gender of a speaker with their speech signal.☆14Sep 19, 2022Updated 3 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION☆46May 12, 2023Updated 2 years ago
- A family of efficient speech models for multilingual phone recognition☆49Feb 12, 2026Updated last month
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 4 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories☆20Apr 10, 2025Updated 11 months ago
- DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast…☆53Mar 20, 2026Updated last week
- [ACL 2025] OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching☆45Feb 9, 2025Updated last year
- Implementation of "Look, Listen and Recognise:character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 4 months ago
- ☆18Mar 17, 2025Updated last year
- (WIP) long-form speech generation☆31Apr 2, 2025Updated 11 months ago
- Training code and dataset cleansing with Sidon☆87Jan 16, 2026Updated 2 months ago
- a simple system for 2-way interruptible voice interactions between human and LLM☆30Feb 18, 2024Updated 2 years ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT"☆45Mar 17, 2026Updated last week
- Official Code for SyllableLM: Learning Coarse Semantic Units for Speech Language Models☆61Jul 1, 2025Updated 8 months ago
- A phoneme-allophone database for many languages☆53May 19, 2020Updated 5 years ago
- A neural speech codec based on discrete WavLM representations☆25Aug 28, 2024Updated last year
- ☆52Jun 24, 2025Updated 9 months ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆175Jun 9, 2023Updated 2 years ago
- A collection of utilities for handling IPA phones.☆26Sep 24, 2023Updated 2 years ago
- ☆14Aug 19, 2024Updated last year
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Nov 14, 2023Updated 2 years ago
- Official release of pretrained models and codes for 'Golden Gemini Is All You Need: Finding the Sweet Spots for Speaker Verification'☆15Jan 20, 2025Updated last year
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.☆18Aug 1, 2025Updated 7 months ago
- ☆20Sep 20, 2024Updated last year
- Converts Mandarin Chinese pinyin notation to IPA (international phonetic alphabet) notation☆18Nov 28, 2023Updated 2 years ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer☆69Nov 1, 2024Updated last year
- Vocal Tract Area Estimation by Gradient Descent☆38Jul 16, 2023Updated 2 years ago
- ☆28Sep 5, 2024Updated last year
- Variable Bitrate Residual Vector Quantization for Audio Coding☆50May 1, 2025Updated 10 months ago
- LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …☆93Dec 28, 2024Updated last year
- Repository for multilingual speech data resources for native languages of Zambia.☆20Oct 9, 2024Updated last year
- ☆26Nov 2, 2022Updated 3 years ago
- Attention-Enhanced Short-Time Wiener Solution for Acoustic Echo Cancellation☆26Nov 12, 2025Updated 4 months ago