repodiac / german_transliterate
Python module to clean and transliterate (i.e. normalize) German text, including abbreviations, numbers, timestamps, etc. It can be used to clean messy text (e.g. map peculiar Unicode encodings to ASCII) or to replace common abbreviations, in combination with various text mining tasks.
☆34Updated 4 years ago
Alternatives and similar repositories for german_transliterate
Users who are interested in german_transliterate compare it to the libraries listed below.
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo…☆32Updated 2 years ago
- ☆80Updated 2 months ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆76Updated 4 years ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code.☆68Updated 5 months ago
- Python library for manipulating pronunciations using the International Phonetic Alphabet (IPA)☆95Updated last year
- ☆37Updated 6 months ago
- multilingual speech aligner☆77Updated last year
- A toolkit to calculate speech audio quality. Not affiliated with the original authors☆62Updated last year
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages☆170Updated 2 years ago
- Segment a given audio into utterances using a trained end-to-end ASR model.☆74Updated 5 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION☆43Updated 2 years ago
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.☆87Updated 3 years ago
- Various speech datasets made available to the public☆131Updated 10 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆177Updated last week
- Linguistic processing for Common Voice☆57Updated last year
- ☆39Updated 3 years ago
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup☆76Updated 3 months ago
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL☆101Updated last year
- ☆65Updated last year
- SelfRemaster: SSL Speech Restoration☆90Updated last year
- Data and code for grapheme-to-phoneme transducers in lots of languages☆140Updated last year
- The VoxTube dataset official repository☆70Updated last year
- Pronunciation-assisted Subword Modeling☆31Updated 6 years ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆163Updated last year
- Unofficial implementation of miipher☆133Updated last year
- Online streaming speaker change detection model in Pytorch☆42Updated 2 years ago
- Python wrapper for kaldi's arpa2fst☆38Updated 2 months ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.☆107Updated 2 years ago
- Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations.☆36Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆119Updated 2 years ago