Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
☆193Jun 6, 2025Updated 10 months ago
Alternatives and similar repositories for simplemma
Users that are interested in simplemma are comparing it to the libraries listed below.
- A Python module and REST API for automatic extraction of metadata from PDF files☆18Nov 11, 2024Updated last year
- Yet Another Z39.50-powered Chatbot☆13Oct 9, 2023Updated 2 years ago
- Scripts and Instructions for training and synthesising artificial voices☆12Mar 27, 2024Updated 2 years ago
- Morphological analyzer / inflection engine for Russian and Ukrainian languages. Fork of https://github.com/pymorphy2/pymorphy2☆11Dec 1, 2025Updated 5 months ago
- ✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models☆38Oct 1, 2025Updated 7 months ago
- The most accurate natural language detection library for Python, suitable for short text and mixed-language text☆1,706Apr 23, 2026Updated last week
- Machine-readable lists of lemma-token pairs in 23 languages.☆362Jan 29, 2022Updated 4 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line☆148Nov 4, 2025Updated 5 months ago
- ANYKS Spell-Checker☆19Jan 3, 2023Updated 3 years ago
- A list of awesome open source projects in the machine learning field, whose developers are mainly based in Germany☆55Mar 16, 2026Updated last month
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation☆14Feb 11, 2018Updated 8 years ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni…☆12Dec 15, 2023Updated 2 years ago
- ☆37Mar 16, 2026Updated last month
- A Corpus Data Retrieval Index using Lucene for Look-Ups☆20Apr 24, 2026Updated last week
- 📂 Additional lookup tables and data resources for spaCy☆115Jun 4, 2025Updated 10 months ago
- Python port for IWNLP.Lemmatizer☆19Apr 13, 2026Updated 2 weeks ago
- 🐍💯pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection that works out-of-the-box.☆914Aug 20, 2024Updated last year
- Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.☆258Apr 15, 2026Updated 2 weeks ago
- RUSSE: Russian Semantic Evaluation.☆16Mar 1, 2022Updated 4 years ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters☆169Dec 19, 2025Updated 4 months ago
- Datasets for the task of tracing diachronic semantic shifts in Russian for two large-scale time period pairs (from pre-Soviet to Soviet t…☆14Feb 21, 2025Updated last year
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM…☆5,807Sep 12, 2025Updated 7 months ago
- 📜 Dehyphenation of broken text (mainly German), i.e., extracted from a PDF☆39Mar 8, 2022Updated 4 years ago
- JavaScript port of SymSpell for Node.js☆13Sep 30, 2022Updated 3 years ago
- ANNotation Infrastructure using Finna: an automatic subject indexing tool using Finna as corpus☆15Oct 22, 2018Updated 7 years ago
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing☆76Apr 1, 2026Updated last month
- Basis of FragDenStaat.de's „Koalitionstracker“☆15Jul 14, 2025Updated 9 months ago
- German lemmatization with IWNLP as extension for spaCy☆27Apr 13, 2026Updated 2 weeks ago
- Fuzzy matching and more functionality for spaCy.☆258Jul 6, 2024Updated last year
- A Python module for English lemmatization and inflection.☆278Sep 14, 2023Updated 2 years ago
- Next-generation Punkt sentence boundary detection with zero dependencies☆30Nov 18, 2025Updated 5 months ago
- Ad-hoc lightweight SPARQL endpoint from a file, using Python Flask and RDFLib☆15Oct 24, 2016Updated 9 years ago
- RaKUn 2.0 - A fast keyword detection algorithm☆72Aug 5, 2025Updated 8 months ago
- SpaCy official Russian model proposal☆32Jan 24, 2021Updated 5 years ago
- Simple summarize ML model☆16Dec 21, 2018Updated 7 years ago
- A Reddit bot that finds original publish dates on linked articles.☆10Nov 30, 2024Updated last year
- ☯️ AllenNLP training configurations for promising models on Named Entity Recognition. (BiLSTM-CRF, BiLSTM-CNN-CRF, BERT, BERT-CRF)☆15Nov 26, 2020Updated 5 years ago
- Preliminary spaCy models for Latin☆14Oct 20, 2022Updated 3 years ago
- 🇮🇹 Italian BERT and ELECTRA models (incl. evaluation)☆18Oct 20, 2022Updated 3 years ago