r4victor / afaligner
A forced aligner intended for synchronization of narrated text
★85 · Updated last year
Related projects
Alternatives and complementary repositories for afaligner
- Timething is a library for aligning text transcripts with their audio recordings. ★103 · Updated last year
- A tool for creating ebooks with synchronized text and audio (EPUB3 with Media Overlays) ★274 · Updated 10 months ago
- A collection of files for LibriVox recordings to produce ebooks with synchronized text and audio ★24 · Updated 4 years ago
- Synchronize Whisper's timestamps over an existing accurate transcription ★132 · Updated 5 months ago
- Python forced alignment ★73 · Updated 7 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ★99 · Updated last year
- An even smaller speech recognizer / force aligner ★32 · Updated last week
- Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language det… ★196 · Updated last week
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ★135 · Updated this week
- Audiobook alignment for Indigenous languages ★38 · Updated this week
- A python package for deep multilingual punctuation prediction. ★98 · Updated 3 months ago
- Go from raw audio files to a text-audio dataset automatically with OpenAI's Whisper. ★133 · Updated last year
- ez audio transcription tool with flexible processing and post-processing options ★130 · Updated 9 months ago
- web based editor for subtitles and transcripts ★112 · Updated 3 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ★141 · Updated 6 months ago
- A testing repo to share code and thoughts on diarisation ★53 · Updated 7 months ago
- Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA) ★80 · Updated last year
- A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text ★36 · Updated 4 years ago
- ★72 · Updated last year
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ★144 · Updated last year
- Community framework for training tortoise ★38 · Updated 2 years ago
- A PyTorch demo of the paper Voice Separation with an Unknown Number of Multiple Speakers using gradio and Nvidia NEMO ASR model. ★33 · Updated 10 months ago
- Script to split video files into chunks based on .srt timecodes ★31 · Updated 6 years ago
- Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts ★275 · Updated last week
- Performant and accurate speech recognition built on Pytorch ★248 · Updated 2 years ago
- Accelerating faster-whisper single file processing by multiprocessing through parallelization ★51 · Updated last year
- Hyperaudio Lite - a Super-lightweight Interactive Transcript Player ★127 · Updated this week
- Model for recasing and repunctuating ASR transcripts ★129 · Updated 7 months ago
- The EveryVoice TTS Toolkit - Text To Speech for your language ★21 · Updated this week
- Text to speech alignment using CTC forced alignment ★137 · Updated 3 weeks ago