JJWRoeloffs / transcribe_align_textgrid
A small wrapper package around whisper-timestamped. Create force-aligned transcription TextGrids from raw audio!
☆18 · Updated last week
Alternatives and similar repositories for transcribe_align_textgrid
Users who are interested in transcribe_align_textgrid are comparing it to the libraries listed below.
- SelfRemaster: SSL Speech Restoration ☆93 · Updated last year
- a lightweight voice conversion ☆85 · Updated last year
- X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion ☆109 · Updated last year
- Official Code for ParrotTTS ☆58 · Updated last year
- ☆80 · Updated 4 months ago
- AudioSR-Upsampling (any -> 48kHz) ☆42 · Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆56 · Updated 6 months ago
- High-Fidelity Neural Phonetic Posteriorgrams ☆120 · Updated 10 months ago
- Putting flows on top of neural transducers for better TTS ☆64 · Updated 2 weeks ago
- Streaming Audiotransformers for online Audio tagging ☆49 · Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 · Updated 2 years ago
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆145 · Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 · Updated 2 years ago
- ☆93 · Updated last year
- This is the official implementation of our multi-channel multi-speaker multi-spatial neural audio codec architecture. ☆50 · Updated 9 months ago
- iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform ☆268 · Updated 5 months ago
- Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al. ☆222 · Updated 2 years ago
- Contains the code associated with the ICLR submission for our text-to-speech diffusion model ☆54 · Updated 2 years ago
- ☆93 · Updated last month
- Implementation of Emo-StarGAN ☆45 · Updated 2 years ago
- Add n-gram and large language model (LLM) support to Whisper models. ☆39 · Updated 7 months ago
- ☆69 · Updated last year
- ☆24 · Updated 7 months ago
- An official implementation of the ICASSP 2024 paper: Dual-Path TFC-TDF UNet for Music Source Separation ☆98 · Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆95 · Updated last year
- Codebase for the paper 'EncodecMAE: Leveraging neural codecs for universal audio representation learning' ☆99 · Updated last year
- Companion repo for the paper "PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings… ☆100 · Updated 11 months ago
- An implementation for Frame-level Speech Signal-to-Noise Ratio Estimation using deep learning ☆42 · Updated 3 years ago
- Apply Score diffusion to improve speech signals recorded under various adverse conditions and distortions, including noise, reverberation… ☆73 · Updated last year
- ☆58 · Updated last year