nkthiebaut / zeugma
Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible with scikit-learn Pipelines.
60 stars, updated last year
Related projects
Alternatives and complementary repositories for zeugma
- Calculate average word embeddings (word2vec) from documents for transfer learning (54 stars, updated 5 months ago)
- Creating class-based TF-IDF matrices (82 stars, updated 2 years ago)
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … (106 stars, updated 8 months ago)
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. (75 stars, updated 2 years ago)
- N-gram Extraction Approaches (bigrams, trigrams) (42 stars, updated 6 years ago)
- Python library for Natural Language Preprocessing (NLPre) (189 stars, updated last year)
- Anonymization of legal cases (Fr) based on Flair embeddings (87 stars, updated 3 years ago)
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. (69 stars, updated last year)
- Template for AC297r projects (33 stars, updated 4 years ago)
- Sentence transformers models for SpaCy (105 stars, updated last year)
- shabeelkandi / Handling-Out-of-Vocabulary-Words-in-Natural-Language-Processing-using-Language-Modelling (68 stars, updated 5 years ago)
- Repo for my talk at the PyData Berlin 2017 conference (66 stars, updated 7 years ago)
- Inter-annotator agreement for Doccano (27 stars, updated 4 years ago)
- Notebooks configured to be run with Binder, usually found on my blog. (41 stars, updated last year)
- Package that returns a company embedding given a company name (42 stars, updated 4 years ago)
- (54 stars, updated 2 years ago)
- spaCy pipeline object for negating concepts in text (274 stars, updated 4 months ago)
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface (249 stars, updated 2 months ago)
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… (76 stars, updated 4 months ago)
- Regular spotlights of underrated NLP and Data Science GitHub repositories (35 stars, updated 4 years ago)
- Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks (155 stars, updated last year)
- Running Prodigy for a team of annotators (53 stars, updated 3 years ago)
- Additional lookup tables and data resources for spaCy (98 stars, updated last year)
- A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python. (82 stars, updated 3 months ago)
- Python3 implementation of the Schwartz-Hearst algorithm for extracting abbreviation-definition pairs (87 stars, updated last year)
- A visualisation tool for Spacy using Hierplane. (65 stars, updated last year)
- Dataframe Integration with spaCy. (101 stars, updated 3 years ago)
- Named Entity Recognition based on dictionaries (242 stars, updated 5 years ago)
- Exploring the simple sentence similarity measurements using word embeddings (100 stars, updated 2 months ago)
- An embeddable annotation tool for end-to-end cross-document coreference (41 stars, updated last year)