☆60Jan 8, 2025Updated last year
Alternatives and similar repositories for audio-foundation-model-dataset
Users that are interested in audio-foundation-model-dataset are comparing it to the libraries listed below.
- A repository of Japanese Phoneme-Level BERT☆24Dec 16, 2023Updated 2 years ago
- BibTeX style file for the Journal of the Acoustical Society of Japan☆11Jan 24, 2022Updated 4 years ago
- Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications☆90Dec 20, 2024Updated last year
- Unofficial implementation of wavenext vocoder☆60Aug 28, 2024Updated last year
- Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus☆21Jun 12, 2024Updated last year
- JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit☆44Mar 13, 2026Updated 2 weeks ago
- This repository contains prompts & best practices to annotate audio clips with a very high degree of details using Audio-Language-Models☆35Oct 13, 2024Updated last year
- PyTorch implementation of WaveFit [2022, Google] which is one of SOTA lightweight/fast speech vocoders.☆64Sep 8, 2025Updated 6 months ago
- Versatile Evaluation of Speech and Audio☆396Dec 9, 2025Updated 3 months ago
- Survey of audio language models☆63Mar 17, 2026Updated last week
- xvector model on jtubespeech☆47Nov 5, 2023Updated 2 years ago
- ☆10May 16, 2024Updated last year
- source code of EfficientTTS 2☆20Feb 18, 2024Updated 2 years ago
- Forced alignment decoder for Whisper.☆15Mar 13, 2024Updated 2 years ago
- Crowdsourced and Automatic Speech Prominence Estimation☆26Apr 12, 2024Updated last year
- My vocoder experiments☆31Jul 26, 2025Updated 8 months ago
- Reimplementation of Miipher☆29Aug 16, 2023Updated 2 years ago
- UT-Sarulab MOS prediction system using SSL models☆297Apr 11, 2024Updated last year
- pyopenjtalk-plus: A Python wrapper for OpenJTalk with additional improvements☆56Mar 22, 2026Updated last week
- LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning☆159Jun 13, 2024Updated last year
- DDPM-based Pitch Generation and Pitch Controllable Voice Synthesis.☆54Sep 25, 2023Updated 2 years ago
- A simple and user-friendly tool for computing STFT/DGT☆19Jun 22, 2021Updated 4 years ago
- SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"☆37Aug 29, 2023Updated 2 years ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF.☆24Aug 1, 2025Updated 7 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆152Sep 14, 2023Updated 2 years ago
- CTC decoder with hotwords for ASR.☆35Apr 13, 2025Updated 11 months ago
- ☆49Jul 22, 2024Updated last year
- ☆36Sep 20, 2022Updated 3 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- Acoustic signal analysis with as little effort as possible: program repository☆27Jan 25, 2024Updated 2 years ago
- Speech Human Evaluation Estimation Toolkit (SHEET)☆134Updated this week
- PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…☆39Jan 6, 2024Updated 2 years ago
- Multi-lingual AudioCaps☆12Nov 20, 2023Updated 2 years ago
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis☆45Mar 2, 2021Updated 5 years ago
- ☆13Oct 11, 2024Updated last year
- Speaker embedding for anime speech domain based on ECAPA_TDNN☆17Jun 22, 2025Updated 9 months ago
- tdmelodic for open-jtalk☆25Aug 30, 2021Updated 4 years ago
- ☆25Jan 24, 2023Updated 3 years ago
- Acoustic measurement using music pieces☆12Aug 5, 2022Updated 3 years ago