Speaker adaptive forced alignment (phonetic segmentation) using Wav2Vec2
☆23Mar 5, 2026Updated 2 weeks ago
Alternatives and similar repositories for Wav2TextGrid
Users that are interested in Wav2TextGrid are comparing it to the libraries listed below.
- Sound field reconstruction using neural processes with dynamic kernels☆16Mar 25, 2025Updated last year
- This was a fun project to explore what an AI would do with the ability to give itself prompts, ignore user requests, and wake itself up a…☆35Oct 28, 2025Updated 4 months ago
- Generator for anechoic, non-stationary noise signals☆11Aug 12, 2022Updated 3 years ago
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆34Oct 15, 2025Updated 5 months ago
- Openfst mirror with some fixes☆15Aug 23, 2024Updated last year
- Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder☆35Aug 30, 2025Updated 6 months ago
- OpenFLAM: Framewise Language Audio Model☆101Jan 14, 2026Updated 2 months ago
- [ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?☆44Nov 21, 2025Updated 4 months ago
- Make Praat Picture style plots of acoustic data☆37Feb 4, 2026Updated last month
- Praat script for automatic formant optimization☆15Jan 27, 2023Updated 3 years ago
- Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Pretrained models are also released.☆31Feb 7, 2026Updated last month
- This is the official repository of "Scalable Neural Vocoder from Range-Null Space Decomposition", which is submitted to TPAMI.☆47Oct 11, 2025Updated 5 months ago
- The baselines of ARC-Challenge-Interspeech2026☆57Dec 1, 2025Updated 3 months ago
- Research on Automatic Speech Recognition for dysarthric speech☆19Oct 9, 2024Updated last year
- Implementation of "Look, Listen and Recognise: character-aware audio-visual subtitling"☆20Nov 3, 2025Updated 4 months ago
- PyTorch implementation of Near-Lossless Post-Training Quantization of Deep Neural Networks via a Piecewise Linear Approximation☆23Feb 17, 2020Updated 6 years ago
- Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"☆11Apr 10, 2025Updated 11 months ago
- T5Voice is a lightweight PyTorch implementation of T5-based text-to-speech synthesis, supporting both streaming and non-streaming speech …☆28Nov 7, 2025Updated 4 months ago
- GAN to create abstract art☆23Jan 19, 2021Updated 5 years ago
- Speech enhancement by time-varying pitch-dependent filtering of harmonics☆27Jul 3, 2014Updated 11 years ago
- Towards a general language-audio model for computational paralinguistic tasks☆24Dec 14, 2024Updated last year
- ☆68Dec 30, 2025Updated 2 months ago
- A codebase for data crawling and preprocessing for TTS and ASR systems training.☆22Feb 26, 2026Updated 3 weeks ago
- Spherical residual vector quantization (SRVQ)☆31Aug 25, 2024Updated last year
- Demo for DART, Audio Imagination workshop submission in NeurIPS 2024☆13Apr 15, 2025Updated 11 months ago
- Code for the blog "Neural audio codecs: how to get audio into LLMs"☆159Oct 20, 2025Updated 5 months ago
- Official Repository of Paper: "SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding" (IC…☆67Jan 27, 2026Updated last month
- Music2Emo: Towards Unified Music Emotion Recognition across Dimensional and Categorical Models☆47Aug 24, 2025Updated 7 months ago
- A visualization and transformation of pytorch model☆30Jan 8, 2020Updated 6 years ago
- Prosody and Pronunciation Modification Network☆63May 5, 2025Updated 10 months ago
- Feed-forward compressor experiments source code for "Differentiable All-pole Filters for Time-varying Audio Systems".☆22Jun 10, 2024Updated last year
- We implemented the DEMUCS model for speech enhancement in the time-frequency domain, and additionally implemented HD-DEMUCS.☆33Nov 8, 2023Updated 2 years ago
- ☆14Nov 30, 2022Updated 3 years ago
- This is the official implementation for the paper "Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-…☆29Feb 8, 2026Updated last month
- ☆39May 12, 2025Updated 10 months ago
- [WIP] Direction-based Multi-Channel Speech Separation☆14Jan 25, 2024Updated 2 years ago
- FBX Book☆11Oct 17, 2022Updated 3 years ago
- PyTorch implementation of MDensenet and sparse NMF. Made for my undergraduate thesis "Music Source Separation with Supervised Learning Me…☆11Jan 31, 2021Updated 5 years ago
- Fast algorithm for determined blind source separation with update of demixing filters with joint adjustment of the remaining sources.☆35Mar 22, 2021Updated 5 years ago