adbar / simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
☆151 Updated last month
Alternatives and similar repositories for simplemma:
Users who are interested in simplemma are comparing it to the libraries listed below.
- Text tokenization and sentence segmentation (segtok v2) ☆203 Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆137 Updated last month
- Text to sentence splitter using heuristic algorithm by Philipp Koehn and Josh Schroeder. ☆234 Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 Updated 8 months ago
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆66 Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆156 Updated 2 years ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆112 Updated 8 months ago
- Python3 bindings for the Compact Language Detector v3 (CLD3) ☆149 Updated last year
- A modern, interlingual wordnet interface for Python ☆229 Updated last month
- coFR: COreference resolution tool for FRench (and singletons). ☆24 Updated 4 years ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… ☆211 Updated last month
- Multilingual sentence alignment using sentence embeddings ☆106 Updated 2 months ago
- Fuzzy matching and more functionality for spaCy. ☆255 Updated 6 months ago
- A Python module for English lemmatization and inflection. ☆265 Updated last year
- German Morphological Analyzer ☆47 Updated 3 years ago
- 📂 Additional lookup tables and data resources for spaCy ☆99 Updated last year
- ☆167 Updated 7 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆119 Updated 8 months ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… ☆193 Updated 2 years ago
- A Python module for word inflections designed for use with spaCy. ☆92 Updated 4 years ago
- Extracts parallel corpora from two raw texts in different languages. ☆35 Updated 2 years ago
- Python Finite-State Toolkit ☆47 Updated last week
- Sentence aligner ☆109 Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ☆104 Updated this week
- spaCy + UDPipe ☆161 Updated 2 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆78 Updated 6 months ago
- Information extraction from English and German texts based on predicate logic ☆135 Updated last year
- Augmenty is an augmentation library based on spaCy for augmenting texts. ☆151 Updated 7 months ago
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. ☆117 Updated 9 months ago
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ☆253 Updated 4 months ago