repodiac / german_transliterate
Python module to clean and transliterate (i.e. normalize) German text, including abbreviations, numbers, timestamps, etc. It can be used to clean messy text (e.g. map peculiar Unicode characters to ASCII) or to expand common abbreviations as a preprocessing step for various text mining tasks.
☆32 Updated 4 years ago
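To make the description above concrete, here is a minimal sketch of this kind of normalization using only the Python standard library. It is not the module's own API: the function name `normalize_german`, the abbreviation table, and the umlaut mapping are all assumptions made for illustration, and the real module covers far more cases (numbers, timestamps, and so on).

```python
import re
import unicodedata

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {
    "z.B.": "zum Beispiel",
    "usw.": "und so weiter",
    "ca.": "circa",
}

# German-specific replacements, applied before generic ASCII folding
# so that "ü" becomes "ue" rather than a bare "u".
UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
})


def normalize_german(text: str) -> str:
    """Expand a few abbreviations and fold German text to plain ASCII."""
    # Compose characters first so "u" + combining diaeresis is treated as "ü".
    text = unicodedata.normalize("NFC", text)
    # Expand abbreviations that do not start in the middle of a word.
    for abbr, full in ABBREVIATIONS.items():
        text = re.sub(r"(?<!\w)" + re.escape(abbr), full, text)
    # Transliterate umlauts and eszett the German way.
    text = text.translate(UMLAUTS)
    # Decompose remaining accented characters and drop anything non-ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize_german("Größe  z.B. café"))
```

Characters that have no ASCII decomposition (typographic quotes, for example) are simply dropped in this sketch; a real normalizer would map them explicitly.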
Alternatives and similar repositories for german_transliterate:
Users who are interested in german_transliterate are comparing it to the libraries listed below:
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo… ☆30 Updated 2 years ago
- ☆80 Updated 10 months ago
- ☆62 Updated 10 months ago
- Convert English text from written expressions into spoken forms ☆24 Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- ☆75 Updated 3 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 Updated 3 years ago
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation. ☆88 Updated 2 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION ☆40 Updated last year
- multilingual speech aligner ☆72 Updated last year
- ☆36 Updated 6 months ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆50 Updated 7 months ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆156 Updated last year
- ☆71 Updated last year
- Monotonic Alignment Search ☆90 Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆101 Updated 2 years ago
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization. ☆49 Updated 10 months ago
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- Linguistic processing for Common Voice ☆55 Updated last year
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆70 Updated 2 years ago
- Qualtric or Qualtreat? Generate Qualtrics listening tests for Text-To-Speech evaluations. ☆35 Updated 9 months ago
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p! ☆157 Updated this week
- Python wrappers for Kaldi Levenshtein's distance and alignment code. ☆63 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆48 Updated 9 months ago
- ☆38 Updated 3 years ago
- Labeled data for homograph disambiguation ☆57 Updated last year
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆44 Updated 3 years ago
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year
- PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS ☆22 Updated 3 years ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆95 Updated 5 months ago