fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
★335 · Apr 25, 2025 · Updated 10 months ago
Alternatives and similar repositories for floret
Users who are interested in floret are comparing it to the libraries listed below.
- skweak: A software toolkit for weak supervision applied to NLP tasks · ★926 · Sep 2, 2024 · Updated last year
- Active Learning for Text Classification in Python · ★639 · Feb 1, 2026 · Updated 3 weeks ago
- Cutting-edge experimental spaCy components and features · ★105 · Apr 23, 2024 · Updated last year
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti… · ★244 · Jun 19, 2023 · Updated 2 years ago
- just a bunch of useful embeddings for scikit-learn pipelines · ★522 · Feb 12, 2026 · Updated 2 weeks ago
- SpikeX - SpaCy Pipes for Knowledge Extraction · ★403 · Jul 30, 2021 · Updated 4 years ago
- This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s… · ★220 · Jan 20, 2025 · Updated last year
- Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy · ★1,402 · Nov 7, 2025 · Updated 3 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. · ★157 · May 24, 2024 · Updated last year
- ★68 · Mar 17, 2022 · Updated 3 years ago
- Doubt your data, find bad labels. · ★517 · Jul 15, 2024 · Updated last year
- Toolkit to help understand "what lies" in word embeddings. Also benchmarking! · ★476 · Feb 6, 2023 · Updated 3 years ago
- A library to synthesize text datasets using Large Language Models (LLM) · ★152 · Jan 17, 2023 · Updated 3 years ago
- Coreference resolution for English, French, German and Polish, optimised for limited training data and easily extensible for further lang… · ★199 · Dec 18, 2022 · Updated 3 years ago
- Self-Supervision for Named Entity Disambiguation at the Tail · ★218 · Jun 14, 2022 · Updated 3 years ago
- Tools for Transformers compression using PyTorch Lightning · ★85 · Feb 1, 2026 · Updated 3 weeks ago
- Integrating LLMs into structured NLP pipelines · ★1,364 · Jan 8, 2025 · Updated last year
- Fuzzy matching and more functionality for spaCy. · ★259 · Jul 6, 2024 · Updated last year
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata · ★169 · Nov 7, 2022 · Updated 3 years ago
- Contextual word checker for better suggestions (not actively maintained) · ★418 · Jan 31, 2025 · Updated last year
- Asent is a python library for performing efficient and transparent sentiment analysis using spaCy. · ★120 · Oct 20, 2025 · Updated 4 months ago
- Efficient few-shot learning with Sentence Transformers · ★2,683 · Dec 11, 2025 · Updated 2 months ago
- spaCy building blocks and visualizers for Streamlit apps · ★853 · Jul 29, 2024 · Updated last year
- Contextually-keyed word vectors · ★1,673 · Apr 23, 2025 · Updated 10 months ago
- Pipeline components that support partial_fit. · ★46 · Jul 15, 2024 · Updated last year
- Information extraction from English and German texts based on predicate logic · ★141 · Jun 6, 2023 · Updated 2 years ago
- Dataframe Integration with spaCy. · ★103 · Mar 12, 2021 · Updated 4 years ago
- Leveraging BERT and c-TF-IDF to create easily interpretable topics. · ★7,412 · Feb 20, 2026 · Updated last week
- Tools for shrinking fastText models (in gensim format) · ★183 · May 3, 2024 · Updated last year
- Quote extraction for modular journalism (JournalismAI collab 2021) · ★229 · Feb 2, 2022 · Updated 4 years ago
- spaCy pipeline object for negating concepts in text · ★282 · Jun 16, 2025 · Updated 8 months ago
- Make Thinc faster on macOS by calling into Apple's native Accelerate library · ★102 · Jun 30, 2025 · Updated 7 months ago
- ★30 · Jun 23, 2022 · Updated 3 years ago
- Recon NER, Debug and correct annotated Named Entity Recognition (NER) data for inconsistencies and get insights on improving the quality … · ★106 · Feb 26, 2024 · Updated 2 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. · ★21 · Feb 7, 2023 · Updated 3 years ago
- Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing · ★789 · Jul 22, 2025 · Updated 7 months ago
- Top2Vec learns jointly embedded topic, document and word vectors. · ★3,106 · Nov 14, 2024 · Updated last year
- A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coher… · ★1,265 · Jul 24, 2025 · Updated 7 months ago
- A Python library for calculating a large variety of metrics from text · ★360 · Jan 30, 2026 · Updated 3 weeks ago