feldberlin / timething
Timething is a library for aligning text transcripts with their audio recordings.
☆117Updated 4 months ago
Alternatives and similar repositories for timething:
Users who are interested in timething are comparing it to the libraries listed below.
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆157Updated this week
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆156Updated last year
- Python forced alignment☆87Updated 11 months ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆111Updated 2 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database☆98Updated last month
- Segment an audio file and obtain utterance alignments. (Python package)☆333Updated 10 months ago
- A Python package for deep multilingual punctuation prediction.☆120Updated 7 months ago
- ☆80Updated 10 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context☆192Updated 6 months ago
- ☆75Updated last year
- Charsiu: A neural phonetic aligner.☆295Updated 2 years ago
- Universal multilingual automatic speech transcription into IPA☆62Updated last month
- Data and code for grapheme-to-phoneme transducers in lots of languages☆132Updated 11 months ago
- Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions☆243Updated 2 months ago
- Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA)☆88Updated last year
- Collection of pretrained models for the Montreal Forced Aligner☆139Updated 8 months ago
- A sequence-to-sequence voice conversion toolkit.☆96Updated 8 months ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆143Updated last year
- Multilingual G2P in 100 languages☆315Updated last year
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code☆147Updated 11 months ago
- Simple Diarization model☆47Updated last year
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP…☆95Updated 5 months ago
- Uses ctypes and libespeak-ng to transform text into IPA phonemes☆20Updated last year
- Easy-to-Use Speech MOS predictors☆272Updated last year
- A speaker embedding network in PyTorch that is very quick to set up and use for whatever purposes.☆87Updated this week
- Your one-stop solution for voice dataset creation☆118Updated last year
- A phoneme-allophone database for many languages☆52Updated 4 years ago
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations☆150Updated last year
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection☆60Updated this week
- It's a repository for implementations of neural speech editing algorithms.☆195Updated last year