olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆12 Updated 4 years ago
Alternatives and similar repositories for german-word-frequencies:
Users who are interested in german-word-frequencies are comparing it to the libraries listed below:
- An NLP pipeline for Hebrew ☆37 Updated last month
- Wiktra - Python tool of Wiktionary transliteration modules for 514 languages and their 102 different scripts (orthographies) ☆30 Updated 3 years ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services ☆55 Updated last year
- Morphological Dictionaries for German Language ☆29 Updated 7 years ago
- A library for fetching and reading Tatoeba's weekly exports ☆22 Updated last year
- 🏆 • 5050 most frequent words in 109 languages ☆42 Updated 2 years ago
- Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code ☆33 Updated 2 months ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆48 Updated last year
- Audiobook alignment for Indigenous languages ☆40 Updated last week
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆13 Updated last year
- Massively multilingual pronunciation mining ☆339 Updated this week
- 📈 A forced aligner intended for synchronization of narrated text ☆91 Updated 2 years ago
- Python code for converting among IPA, ARPABET, XSAMPA, Callhome, DISC, TIMIT, plus some lexical tones. ☆33 Updated last year
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆51 Updated 3 months ago
- A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages. ☆35 Updated this week
- The source of the phonetic transcriptions is Oxford Advanced Learner's Dictionary (3rd ed.), available from the Oxford Text Archive (http… ☆24 Updated 8 years ago
- Python module for syllabifying English ARPABET transcriptions ☆66 Updated 6 years ago
- Code for the paper: Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary (Metheniti and … ☆9 Updated 4 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆106 Updated 2 months ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆25 Updated 2 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆44 Updated 2 years ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆17 Updated 2 years ago
- A model that predicts the punctuation of English, Italian, French and German texts. ☆80 Updated 2 years ago
- Automatic Speech Recognition (ASR) - German ☆21 Updated 5 years ago
- An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also supporting some non-Uralic languages such as Span… ☆78 Updated 4 months ago
- Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments. ☆28 Updated 8 years ago