Speaker adaptive forced alignment (phonetic segmentation) using Wav2Vec2
☆23May 7, 2026Updated last month
Alternatives and similar repositories for Wav2TextGrid
Users that are interested in Wav2TextGrid are comparing it to the libraries listed below.
- Sound field reconstruction using neural processes with dynamic kernels☆16Mar 25, 2025Updated last year
- This was a fun project to explore what an AI would do with the ability to give itself prompts, ignore user requests, and wake itself up a…☆36Oct 28, 2025Updated 8 months ago
- Generator for anechoic, non-stationary noise signals☆11Aug 12, 2022Updated 3 years ago
- OpenFst mirror with some fixes☆16Aug 23, 2024Updated last year
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆41Oct 15, 2025Updated 8 months ago
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆37Aug 30, 2025Updated 10 months ago
- OpenFLAM: Framewise Language Audio Model☆108Jun 4, 2026Updated last month
- Make Praat Picture style plots of acoustic data☆37Apr 22, 2026Updated 2 months ago
- Praat script for automatic formant optimization☆15Jan 27, 2023Updated 3 years ago
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?☆47Nov 21, 2025Updated 7 months ago
- Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Pretrained models are also released.☆31Feb 7, 2026Updated 4 months ago
- The baselines of ARC-Challenge-Interspeech2026☆60Dec 1, 2025Updated 7 months ago
- Research on Automatic Speech Recognition for dysarthric speech☆20Oct 9, 2024Updated last year
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆21Nov 3, 2025Updated 8 months ago
- PyTorch implementation of Near-Lossless Post-Training Quantization of Deep Neural Networks via a Piecewise Linear Approximation☆23Feb 17, 2020Updated 6 years ago
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which has been submitted to TPAMI.☆52Oct 11, 2025Updated 8 months ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated last year
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 7 months ago
- GAN to create abstract art☆23Jan 19, 2021Updated 5 years ago
- Speech enhancement by time-varying pitch-dependent filtering of harmonics☆27Jul 3, 2014Updated 12 years ago
- ☆71Updated this week
- A codebase for data crawling and preprocessing for TTS and ASR systems training.☆23Jun 13, 2026Updated 3 weeks ago
- Towards a general language-audio model for computational paralinguistic tasks☆30Dec 14, 2024Updated last year
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- Code for the blog "Neural audio codecs: how to get audio into LLMs"☆174Oct 20, 2025Updated 8 months ago
- Text-to-text alignment algorithm for speech recognition error analysis.☆31Jun 23, 2026Updated last week
- Music2Emo: Towards Unified Music Emotion Recognition across Dimensional and Categorical Models☆54Aug 24, 2025Updated 10 months ago
- Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (IC…☆71Apr 27, 2026Updated 2 months ago
- Visualization and transformation of PyTorch models☆30Jan 8, 2020Updated 6 years ago
- Prosody and Pronunciation Modification Network☆64May 5, 2025Updated last year
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems".☆24Jun 10, 2024Updated 2 years ago
- We implemented the DEMUCS model for speech enhancement in the time-frequency domain, and additionally implemented HD-DEMUCS.☆34Nov 8, 2023Updated 2 years ago
- ☆14Nov 30, 2022Updated 3 years ago
- ☆41May 12, 2026Updated last month
- ☆40May 12, 2025Updated last year
- [WIP] Direction-based Multi-Channel Speech Separation☆14Jan 25, 2024Updated 2 years ago
- FBX Book☆11Oct 17, 2022Updated 3 years ago
- PyTorch implementation of MDensenet and sparse NMF. Made for my undergraduate thesis "Music Source Separation with Supervised Learning Me…☆11Jan 31, 2021Updated 5 years ago
- The source code for the paper CrossSinger (ASRU 2023)☆18Oct 12, 2023Updated 2 years ago