repodiac / german_transliterate
A Python module that cleans and transliterates (i.e., normalizes) German text, including abbreviations, numbers, and timestamps. It can be used to clean messy text (e.g., by mapping unusual Unicode characters to ASCII) or to replace common abbreviations as a preprocessing step for various text mining tasks.
☆32 · Updated 4 years ago
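To make the description concrete, here is a minimal sketch of the kind of cleanup it mentions: replacing abbreviations and folding Unicode characters to ASCII. It uses only the Python standard library and is not the module's actual API. The function names, the tiny abbreviation table, and the choice to write umlauts as `ae`/`oe`/`ue` are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical abbreviation table for illustration only.
ABBREVIATIONS = {
    "z.B.": "zum Beispiel",
    "usw.": "und so weiter",
    "ca.": "circa",
}

# Assumed convention: write umlauts and eszett as ASCII digraphs.
GERMAN_ASCII = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def expand_abbreviations(text):
    """Replace each known abbreviation with its long form."""
    for abbr, full in ABBREVIATIONS.items():
        # Only match when the abbreviation does not continue a preceding word.
        text = re.sub(r"(?<!\w)" + re.escape(abbr), full, text)
    return text


def to_ascii(text):
    """Fold text to ASCII, handling German letters before generic accents."""
    # Compose first so that 'u' + combining diaeresis becomes a single 'ü'.
    text = unicodedata.normalize("NFC", text)
    for char, replacement in GERMAN_ASCII.items():
        text = text.replace(char, replacement)
    # NFKD splits remaining accented letters into base letter + combining mark;
    # encoding with errors="ignore" then drops whatever is not ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def normalize(text):
    """Expand abbreviations, fold to ASCII, and collapse whitespace."""
    text = expand_abbreviations(text)
    text = to_ascii(text)
    return re.sub(r"\s+", " ", text).strip()


print(normalize("Größe  ca. 5 m, z.B. für Café"))
```

Abbreviations are expanded before ASCII folding so that the lookup table can be written in ordinary German spelling. Number and timestamp verbalization, which the description also lists, is out of scope for this sketch.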
Alternatives and similar repositories for german_transliterate:
Users who are interested in german_transliterate are comparing it to the libraries listed below.
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo… ☆30 · Updated last year
- Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation. ☆88 · Updated 2 years ago
- ☆80 · Updated 10 months ago
- Linguistic processing for Common Voice ☆55 · Updated last year
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆122 · Updated 2 years ago
- ☆35 · Updated 2 weeks ago
- ☆62 · Updated 10 months ago
- ☆38 · Updated 3 years ago
- Convert English text from written expressions into spoken forms ☆24 · Updated 2 years ago
- A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing. ☆101 · Updated 2 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆50 · Updated 7 months ago
- ☆38 · Updated 3 years ago
- Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection ☆59 · Updated last month
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 · Updated 3 years ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆70 · Updated 2 years ago
- Unofficial implementation of miipher ☆120 · Updated 11 months ago
- A collection of utilities for handling IPA phones. ☆25 · Updated last year
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆110 · Updated 2 years ago
- This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to… ☆44 · Updated 3 years ago
- multilingual speech aligner ☆72 · Updated last year
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆73 · Updated 4 years ago
- This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN… ☆74 · Updated 2 years ago
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 · Updated last year
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder ☆118 · Updated 2 years ago
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL ☆99 · Updated 9 months ago
- Neural HMMs are all you need (for high-quality attention-free TTS) ☆158 · Updated 3 weeks ago
- phoneme tokenizer and grapheme-to-phoneme model for 8k languages ☆156 · Updated last year
- MOS score prediction by fine-tuned wav2vec2.0 model ☆156 · Updated 2 years ago
- An implementation of the GlowTTS model. Several modes are added: speaker embedding, prosody encoder (GST), and gradient reversal. ☆53 · Updated 2 years ago
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION ☆40 · Updated last year