repodiac / german_transliterate
A Python module that cleans and transliterates (i.e. normalizes) German text, including abbreviations, numbers, and timestamps. It can be used to clean messy text (e.g. map peculiar Unicode encodings to ASCII) or to replace common abbreviations as a preprocessing step for various text mining tasks.
☆32 Updated 4 years ago
Alternatives and similar repositories for german_transliterate:
Users who are interested in german_transliterate are comparing it to the libraries listed below:
- This is the official repository for the HUI-Audio-Corpus-German. The corresponding paper is in the process of publication. With the repo… ☆30 Updated 2 years ago
- ☆79 Updated 11 months ago
- ☆62 Updated 11 months ago
- Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing ☆70 Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆112 Updated 2 years ago
- ☆28 Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 Updated 3 years ago
- multilingual speech aligner ☆74 Updated last year
- CML-TTS: A Multilingual Dataset for Speech Synthesis ☆31 Updated 8 months ago
- Python wrappers for Kaldi Levenshtein's distance and alignment code. ☆64 Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆21 Updated 7 months ago
- Finetuning VITS Efficiently ☆32 Updated last year
- [IJCAI'23] Learning to Speak from Text for Low-Resource TTS ☆63 Updated last year
- Deep Neural Pitch Extractor for Voice Conversion and TTS Training ☆123 Updated 2 years ago
- Convert English text from written expressions into spoken forms ☆25 Updated 2 years ago
- A toolkit to calculate speech audio quality. Not affiliated with the original authors ☆50 Updated 8 months ago
- PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis ☆56 Updated 3 years ago
- Official implementation of the paper "Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus" acc… ☆75 Updated last year
- SelfRemaster: SSL Speech Restoration ☆88 Updated last year
- VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization. ☆51 Updated 11 months ago
- Segment a given audio into utterances using a trained end-to-end ASR model. ☆73 Updated 4 years ago
- ☆71 Updated last year
- PitchVC: Pitch Conditioned Any-to-Many Voice Conversion ☆34 Updated 10 months ago
- Linguistic processing for Common Voice ☆55 Updated last year
- python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones. ☆33 Updated last year
- Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup ☆71 Updated 7 months ago
- A collection of utilities for handling IPA phones. ☆25 Updated last year
- Dataset of ICASSP 2021 MULTILINGUAL PHONETIC DATASET FOR LOW RESOURCE SPEECH RECOGNITION ☆42 Updated last year
- S3PRL-VC: A Voice Conversion Toolkit based on S3PRL ☆99 Updated 10 months ago
- ☆42 Updated 3 years ago