olastor / german-word-frequencies
Simple word-to-frequency mappings for the German language, based on text corpora and using the CISTEM stemmer.
☆13 · Updated 4 years ago
Alternatives and similar repositories for german-word-frequencies
Users who are interested in german-word-frequencies are comparing it to the libraries listed below.
- JavaScript port of SymSpell for Node.js ☆13 · Updated 2 years ago
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ☆168 · Updated 2 months ago
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages. ☆31 · Updated 5 months ago
- Aksharamukha Python Library ☆51 · Updated 6 months ago
- 🎀 JavaScript API for spaCy with Python REST API ☆196 · Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and their 102 different scripts (orthographies) ☆30 · Updated last month
- NLP Functions for amplifying negations, managing elisions, creating ngrams, stems, phonetic codes to tokens and more. ☆131 · Updated last year
- Distance/Similarity functions for Bag of Words, Strings, Vectors and more. ☆24 · Updated 2 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆14 · Updated 2 years ago
- 🖋 Resource and Tool for Writing System Identification -- LREC 2024 ☆19 · Updated last year
- All languages stopwords collection ☆451 · Updated last year
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆107 · Updated 2 months ago
- German part-of-speech dictionary ☆45 · Updated last year
- A Directory of Online Newspaper Sources for 70+ Languages ☆32 · Updated 4 years ago
- German Morphological Analyzer ☆46 · Updated 3 years ago
- Morphological Dictionaries for German Language ☆29 · Updated 7 years ago
- Plan and train German transformer models. ☆23 · Updated 4 years ago
- Tools for scraping, annotating, and parsing morphological information from Wiktionary ☆15 · Updated 5 years ago
- The Data Format for Digital Linguistics (DaFoDiL) ☆22 · Updated 2 years ago
- Wiktionary parser tool for many language editions. ☆54 · Updated 2 years ago
- Norwegian Speech Transformer Models ☆19 · Updated 9 months ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆147 · Updated 8 months ago
- Code for the paper: Wikinflection: Massive semi-supervised generation of multilingual inflectional corpus from Wiktionary (Metheniti and … ☆9 · Updated 5 years ago
- 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec ☆60 · Updated 3 years ago
- A list of vocabulary lists ☆22 · Updated 5 years ago
- Inline annotation for the web in pure JavaScript. Select text, images, or (nearly) anything else, and add your notes. ☆10 · Updated 9 years ago
- Hyperaudio Lite - a Super-lightweight Interactive Transcript Player ☆150 · Updated 8 months ago
- German stopwords collection ☆86 · Updated 2 years ago
- 🕸 GlotWeb: Web Indexing for Low-Resource Languages -- under construction. ☆14 · Updated 4 months ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services ☆56 · Updated last year