nkthiebaut / zeugma
Natural language processing (NLP) utils: word embeddings (Word2Vec, GloVe, FastText, ...) and preprocessing transformers, compatible with scikit-learn Pipelines.
☆62 · Updated last year
Alternatives and similar repositories for zeugma:
Users who are interested in zeugma are comparing it to the libraries listed below.
- Intelligently expand and create contractions in text leveraging grammar checking and Word Mover's Distance. ☆75 · Updated 3 years ago
- Sentence transformers models for SpaCy ☆107 · Updated last year
- Package that returns a company embedding given a company name ☆45 · Updated 4 years ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ☆80 · Updated 8 months ago
- Python library for Natural Language Preprocessing (NLPre) ☆190 · Updated last year
- Exploring the simple sentence similarity measurements using word embeddings ☆101 · Updated 6 months ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆88 · Updated 4 years ago
- Named entity recognizer based on ELMo or BERT as feature extractor and CRF as final classifier ☆80 · Updated last year
- A monolingual and cross-lingual meta-embedding generation and evaluation framework ☆80 · Updated 2 years ago
- spaCy + UDPipe ☆161 · Updated 2 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … ☆106 · Updated last year
- Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks ☆157 · Updated 2 years ago
- Language detection extension for spaCy 2.0+ ☆112 · Updated 6 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆201 · Updated 2 years ago
- Tutorial for Topic Modelling using PySpark and Spark NLP ☆17 · Updated 4 years ago
- Word Embeddings for Information Retrieval ☆225 · Updated last year
- PYthon Automated Term Extraction ☆310 · Updated 2 years ago
- spaCy pipeline object for negating concepts in text ☆279 · Updated 8 months ago
- SImple SenTence EmbeddeR ☆74 · Updated 2 years ago
- A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python. ☆84 · Updated 7 months ago
- Dataframe Integration with spaCy. ☆103 · Updated 3 years ago
- An implementation of a full named-entity evaluation metrics based on SemEval'13 Task 9 - not at tag/token level but considering all the t… ☆219 · Updated 8 months ago
- Creating class-based TF-IDF matrices ☆82 · Updated 2 years ago
- Language Models for Zalando's flair library ☆61 · Updated 5 years ago
- Applying BERT to named entity recognition in English and Russian. ☆162 · Updated 2 years ago
- Regular spotlights of underrated NLP and Data Science GitHub repositories ☆35 · Updated 4 years ago
- Template for AC297r projects ☆33 · Updated 5 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 · Updated 2 years ago
- Google USE (Universal Sentence Encoder) for spaCy ☆182 · Updated last year
- spaCy match and replace, maintaining conjugation ☆35 · Updated 2 years ago