adbar / simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
☆144 · Updated this week
Related projects
Alternatives and complementary repositories for simplemma
- Text tokenization and sentence segmentation (segtok v2) ☆203 · Updated 2 years ago
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆230 · Updated 2 years ago
- Fuzzy matching and more functionality for spaCy. ☆252 · Updated 4 months ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 · Updated 6 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆409 · Updated last month
- OpusFilter - Parallel corpus processing toolkit ☆102 · Updated 3 months ago
- 📂 Additional lookup tables and data resources for spaCy ☆98 · Updated last year
- Cython wrapper on Hunspell Dictionary ☆65 · Updated 4 months ago
- German Morphological Analyzer ☆47 · Updated 3 years ago
- coFR: COreference resolution tool for FRench (and singletons). ☆24 · Updated 4 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆76 · Updated 4 months ago
- 🧪 Cutting-edge experimental spaCy components and features ☆95 · Updated 6 months ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation. ☆102 · Updated 2 weeks ago
- UIMA CAS processing library written in Python ☆85 · Updated 6 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆191 · Updated last year
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆61 · Updated this week
- Implementation of the ClausIE information extraction system for python+spacy ☆220 · Updated 2 years ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 · Updated 5 months ago
- Python package for WikiMedia dump processing (Wiktionary, Wikipedia etc). Wikitext parsing, template expansion, Lua module execution. Fo… ☆94 · Updated this week
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated 8 months ago
- spaCy + UDPipe ☆161 · Updated 2 years ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆149 · Updated last year
- A spaCy wrapper for DBpedia Spotlight ☆105 · Updated last year
- A tokenizer and sentence splitter for German and English web and social media texts. ☆135 · Updated 3 months ago
- An NLP pipeline for Hebrew ☆34 · Updated 7 months ago
- A python module for English lemmatization and inflection. ☆261 · Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆153 · Updated 2 years ago
- LASER multilingual sentence embeddings as a pip package ☆225 · Updated last year
- Python framework for processing Universal Dependencies data ☆57 · Updated this week
- Polish morphological tagger. ☆43 · Updated last year