code-kern-ai / refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
☆1,406 Updated last month
Alternatives and similar repositories for refinery:
Users who are interested in refinery are comparing it to the libraries listed below.
- Open-source natural language enrichments at your fingertips. ☆455 Updated 2 weeks ago
- An easy way to extract information from documents ☆1,732 Updated last year
- Blazing fast framework for fine-tuning similarity learning models ☆648 Updated 3 weeks ago
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆923 Updated 4 months ago
- The simplest way to serve AI/ML models in production ☆944 Updated this week
- What's in your data? Extract schema, statistics and entities from datasets ☆1,454 Updated 2 weeks ago
- 🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞 ☆718 Updated last year
- Represent, send, store and search multimodal data ☆3,007 Updated 2 months ago
- Doubt your data, find bad labels. ☆508 Updated 6 months ago
- 🦘 Explore multimedia datasets at scale ☆1,053 Updated last month
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,828 Updated this week
- Build data pipelines, the easy way 🛠️ ☆4,104 Updated last year
- With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier. ☆22 Updated 2 years ago
- The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️ ☆3,538 Updated 4 months ago
- A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagem… ☆2,104 Updated last week
- An open-source ML pipeline development platform ☆978 Updated 2 weeks ago
- Build and share data reports in 100% Python ☆1,388 Updated last year
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,225 Updated this week
- dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem… ☆1,655 Updated this week
- Fast model deployment on any cloud 🚀 ☆175 Updated 11 months ago
- Gain clues from clustering! ☆311 Updated 6 months ago
- 1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems. ☆888 Updated this week
- Curated list of open source tooling for data-centric AI on unstructured data. ☆705 Updated last year
- Zero and Few shot named entity & relationships recognition ☆357 Updated 2 months ago
- Open Source Data Annotation & Labeling Tools ☆539 Updated 3 weeks ago
- ZenML 🙏: The bridge between ML and Ops. https://zenml.io. ☆4,361 Updated this week
- Scalable identity resolution, entity resolution, data mastering and deduplication using ML ☆975 Updated this week
- ☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes ☆292 Updated last month
- SpikeX - SpaCy Pipes for Knowledge Extraction ☆397 Updated 3 years ago
- Natural Intelligence is still a pretty good idea. ☆801 Updated 6 months ago