r4victor / afaligner
📈 A forced aligner intended for synchronization of narrated text
☆81 · Updated last year
Related projects:
- 📖🎧 A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays) ☆259 · Updated 8 months ago
- Timething is a library for aligning text transcripts with their audio recordings. ☆92 · Updated 10 months ago
- Synchronize Whisper's timestamps over an existing accurate transcription ☆124 · Updated 3 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆95 · Updated last year
- 📦 A collection of files for LibriVox recordings to produce ebooks with synchronized text and audio ☆22 · Updated 4 years ago
- Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language det… ☆172 · Updated 3 months ago
- An even smaller speech recognizer / forced aligner ☆32 · Updated last month
- A PyTorch demo of the paper Voice Separation with an Unknown Number of Multiple Speakers, using Gradio and the NVIDIA NeMo ASR model. ☆31 · Updated 8 months ago
- A testing repo to share code and thoughts on diarisation ☆50 · Updated 5 months ago
- 🔮 Word-by-word audio subtitle synchronisation tool and API. Developed under GSoC 2017 with CCExtractor. ☆165 · Updated 4 years ago
- Package for aligning audio files through audio fingerprinting ☆91 · Updated 3 months ago
- Pronunciation dictionaries for multiple languages ☆83 · Updated 7 years ago
- Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA) ☆78 · Updated 9 months ago
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆46 · Updated 8 months ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ☆262 · Updated 7 months ago
- A Python package for deep multilingual punctuation prediction. ☆87 · Updated 3 weeks ago
- Audiobook alignment for Indigenous languages ☆34 · Updated this week
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆127 · Updated this week
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ☆35 · Updated 4 years ago
- Simplified diarization pipeline using some pretrained models: audio file to diarized segments in a few lines of code ☆132 · Updated 4 months ago
- ☆26 · Updated 3 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ☆132 · Updated last year
- Workflow for forced alignment between languages ☆17 · Updated 7 months ago
- Web-based editor for subtitles and transcripts ☆102 · Updated last month
- The CMU Pronouncing Dictionary converted to IPA ☆74 · Updated 5 years ago
- Phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆138 · Updated last year
- Simple diarization model ☆40 · Updated 9 months ago
- ez audio transcription tool with flexible processing and post-processing options ☆122 · Updated 7 months ago
- Postprocess SRT-derived speech alignments for creating clean datasets for machine learning ☆17 · Updated last year
- ☆70 · Updated last year