oliverguhr / deepmultilingualpunctuation
A Python package for deep multilingual punctuation prediction.
☆131 · Updated last year
Alternatives and similar repositories for deepmultilingualpunctuation
Users who are interested in deepmultilingualpunctuation are comparing it to the libraries listed below.
- A model that predicts the punctuation of English, Italian, French and German texts. ☆80 · Updated 2 years ago
- An easy-to-use package to restore punctuation of the text. ☆119 · Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆118 · Updated 2 years ago
- Various speech datasets made available to the public ☆130 · Updated 9 months ago
- Timething is a library for aligning text transcripts with their audio recordings. ☆122 · Updated 9 months ago
- Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code ☆150 · Updated last year
- Punctuation Restoration using Transformer Models for High- and Low-Resource Languages ☆221 · Updated last year
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆174 · Updated this week
- Segment an audio file and obtain utterance alignments. (Python package) ☆341 · Updated last year
- Universal Romanizer that can convert any unicode script to roman (latin) script ☆221 · Updated last year
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆325 · Updated 10 months ago
- ☆38 · Updated 3 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆76 · Updated 3 years ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆169 · Updated 2 years ago
- ☆43 · Updated 2 years ago
- Finetune VITS and MMS using HuggingFace's tools ☆163 · Updated last year
- ☆87 · Updated last month
- Model for recasing and repunctuating ASR transcripts ☆138 · Updated last year
- ☆37 · Updated 4 months ago
- Python module to clean and transliterate (i.e. normalize) German text including abbreviations, numbers, timestamps etc. It can be used to… ☆33 · Updated 4 years ago
- Multilingual G2P in 100 languages ☆355 · Updated 2 years ago
- ☆359 · Updated last year
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆82 · Updated 2 years ago
- ☆200 · Updated 3 years ago
- Reproducible experimental protocols for multimedia (audio, video, text) database ☆107 · Updated this week
- A merged version of multiple open-source German speech datasets. ☆33 · Updated last year
- ☆47 · Updated 2 years ago
- Support tools for punctuation and boundary detection for ASR output. ☆56 · Updated 2 years ago
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆148 · Updated last year
- A non-native English corpus for pronunciation scoring task ☆150 · Updated last year