olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆11 · Updated 3 years ago
Related projects
Alternatives and complementary repositories for german-word-frequencies
- Morphological Dictionaries for German Language ☆28 · Updated 6 years ago
- Python interface to ISLEX, an English IPA pronunciation dictionary with syllable and stress marking. ☆47 · Updated 10 months ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆43 · Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆27 · Updated 3 years ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆74 · Updated last year
- A character-wise tokenizer for morphologically rich languages ☆27 · Updated 5 months ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆33 · Updated 2 years ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 · Updated this week
- A list of vocabulary lists ☆21 · Updated 4 years ago
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆22 · Updated 7 years ago
- An NLP pipeline for Hebrew ☆34 · Updated 7 months ago
- An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Span… ☆70 · Updated this week
- A comprehensive list of Arabic NLP resources. ☆13 · Updated 3 weeks ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆28 · Updated last year
- A library for fetching and reading Tatoeba's weekly exports ☆20 · Updated 11 months ago
- ☆67 · Updated 3 months ago
- The Data Format for Digital Linguistics (DaFoDiL) ☆22 · Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆36 · Updated last year
- Creates interlinearized versions of books (EPUB, MOBI, etc), adding "subtitles" with translations under each word in the text. ☆22 · Updated 4 years ago
- ☆30 · Updated 5 months ago
- Massively multilingual pronunciation mining ☆321 · Updated this week
- ☆15 · Updated last year
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones. ☆30 · Updated 9 months ago
- Script for workflow to add morphological analysis into ELAN files ☆13 · Updated 4 years ago
- 📈 A forced aligner intended for synchronization of narrated text ☆85 · Updated last year
- Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data. ☆17 · Updated last year
- Audiobook alignment for Indigenous languages ☆38 · Updated this week
- These are lists for a variety of languages containing words that are distinctive to each language. ☆34 · Updated 2 years ago
- Offline bilingual dictionaries made using data from Wiktionary ☆52 · Updated 9 years ago
- Gentle and praatio scripts for easy forced alignment ☆18 · Updated 2 years ago