olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆11 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for german-word-frequencies
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆47 · Updated 10 months ago
- Audiobook alignment for Indigenous languages ☆37 · Updated this week
- Script for workflow to add morphological analysis into ELAN files ☆13 · Updated 4 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆27 · Updated 3 years ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆43 · Updated last year
- A comprehensive list of Arabic NLP resources. ☆13 · Updated 2 weeks ago
- SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/ ☆56 · Updated last year
- ☆19 · Updated 3 years ago
- 📈 A forced aligner intended for synchronization of narrated text ☆85 · Updated last year
- 24-hour Automatic Speech Recognition ☆27 · Updated 3 years ago
- Massively multilingual pronunciation mining ☆320 · Updated last month
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones. ☆29 · Updated 9 months ago
- Python module for syllabifying English ARPABET transcriptions ☆64 · Updated 5 years ago
- A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages. ☆34 · Updated this week
- An NLP pipeline for Hebrew ☆34 · Updated 7 months ago
- Small-vocabulary sequence-to-sequence generation with optional feature conditioning ☆31 · Updated last week
- The Unicode Cookbook for Linguists ☆53 · Updated 3 years ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆33 · Updated 2 years ago
- [LREC 2020] EtymDB, an Etymological DataBase (v2.1) ☆21 · Updated 2 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆24 · Updated last year
- A character-wise tokenizer for morphologically rich languages ☆27 · Updated 4 months ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆60 · Updated this week
- An even smaller speech recognizer / force aligner ☆32 · Updated 2 months ago
- An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Span… ☆70 · Updated this week
- ⚙️ Powerful JS library to manage audio recording: intelligent cutting, saturation control, various export options... ☆34 · Updated 11 months ago
- ☆30 · Updated 4 months ago
- German Morphological Analyzer ☆47 · Updated 2 years ago
- ipapy is a Python module to work with International Phonetic Alphabet (IPA) strings ☆81 · Updated 6 months ago
- Morphological Dictionaries for German Language ☆28 · Updated 6 years ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆15 · Updated last year