koaning/bulk

A Simple Bulk Labelling Tool

☆598

Alternatives and similar repositories for bulk

Users who are interested in bulk are comparing it to the libraries listed below.

koaning / embetter
just a bunch of useful embeddings for scikit-learn pipelines
☆527 · Feb 12, 2026 · Updated 5 months ago
TutteInstitute / thisnotthat
A visual labeling system implemented in Jupyter widgets.
☆155 · Nov 13, 2024 · Updated last year
koaning / human-learn
Natural Intelligence is still a pretty good idea.
☆832 · Mar 9, 2026 · Updated 4 months ago
koaning / doubtlab
Doubt your data, find bad labels.
☆515 · Jul 15, 2024 · Updated 2 years ago
koaning / cluestar
Gain clues from clustering!
☆324 · Jul 16, 2024 · Updated 2 years ago
webis-de / small-text
Active Learning for Text Classification in Python
☆646 · May 24, 2026 · Updated last month
davidberenstein1957 / concise-concepts
This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…
☆244 · Jun 19, 2023 · Updated 3 years ago
koaning / spacy-report
Generate reports for spaCy models.
☆29 · May 27, 2022 · Updated 4 years ago
koaning / uvnb
Have UV deal with all your Jupyter deps.
☆29 · Sep 7, 2024 · Updated last year
argilla-io / argilla
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
☆5,039 · Jul 13, 2026 · Updated last week
koaning / simsity
Super Simple Similarities Service
☆154 · Apr 11, 2025 · Updated last year
NorskRegnesentral / skweak
skweak: A software toolkit for weak supervision applied to NLP tasks
☆925 · Sep 2, 2024 · Updated last year
koaning / whatlies
Toolkit to help understand "what lies" in word embeddings. Also benchmarking!
☆481 · Feb 6, 2023 · Updated 3 years ago
MaartenGr / BERTopic
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
☆7,748 · May 13, 2026 · Updated 2 months ago
MaartenGr / PolyFuzz
Fuzzy string matching, grouping, and evaluation.
☆800 · Jul 10, 2025 · Updated last year
IBM / zshot
Zero and Few shot named entity & relationships recognition
☆400 · Sep 17, 2025 · Updated 10 months ago
tomaarsen / SpanMarkerNER
SpanMarker for Named Entity Recognition
☆477 · Apr 10, 2026 · Updated 3 months ago
explosion / curated-transformers
🤖 A PyTorch library of curated Transformer models and their composable components
☆892 · Apr 17, 2024 · Updated 2 years ago
wjbmattingly / LeetTopic
☆55 · Jan 9, 2024 · Updated 2 years ago
explosion / floret
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy
☆343 · Apr 25, 2025 · Updated last year
jboynyc / textnets
Text analysis with networks.
☆294 · May 14, 2026 · Updated 2 months ago
koaning / scikit-playtime
Rethinking machine learning pipelines
☆37 · Sep 29, 2025 · Updated 9 months ago
HLasse / TextDescriptives
A Python library for calculating a large variety of metrics from text
☆366 · May 5, 2026 · Updated 2 months ago
AnswerDotAI / playwrightnb
Use sync mode Playwright interactively, inside a Jupyter notebook
☆20 · Updated this week
Lucaterre / spacyfishing
A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
☆173 · Nov 7, 2022 · Updated 3 years ago
davidberenstein1957 / classy-classification
This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-s…
☆221 · Jan 20, 2025 · Updated last year
koaning / clumper
A small python library that can clump lists of data together.
☆149 · Nov 30, 2021 · Updated 4 years ago
davidberenstein1957 / spacy-setfit
This repository contains an easy and intuitive approach to use SetFit in combination with spaCy.
☆84 · Aug 31, 2023 · Updated 2 years ago
TutteInstitute / datamapplot
Creating beautiful plots of data maps
☆1,020 · Updated this week
carbonfact / munpack
📊 Explain why metrics change by unpacking them
☆42 · Jan 16, 2026 · Updated 6 months ago
richardpaulhudson / holmes-extractor
Information extraction from English and German texts based on predicate logic
☆144 · Jun 6, 2023 · Updated 3 years ago
agermanidis / pigeon
🐦 Quickly annotate data from the comfort of your Jupyter notebook
☆788 · Apr 4, 2024 · Updated 2 years ago
ddangelov / Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
☆3,104 · Nov 14, 2024 · Updated last year
erre-quadro / spikex
SpikeX - SpaCy Pipes for Knowledge Extraction
☆403 · Jul 30, 2021 · Updated 4 years ago
koaning / arxiv-frontpage
My personal frontpage app
☆112 · Updated this week
cleanlab / cleanlab
Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data …
☆11,582 · Jan 13, 2026 · Updated 6 months ago
TutteInstitute / evoc
Embedding Vector Oriented Clustering
☆341 · Jun 2, 2026 · Updated last month
wilsonjr / humap
Hierarchical Uniform Manifold Approximation and Projection
☆242 · Feb 18, 2025 · Updated last year
booknlp / booknlp
BookNLP, a natural language processing pipeline for books
☆928 · Jul 31, 2024 · Updated last year