code-kern-ai / refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
☆1,433 Updated 4 months ago
Alternatives and similar repositories for refinery:
Users who are interested in refinery compare it to the libraries listed below.
- Open-source natural language enrichments at your fingertips. ☆456 Updated 3 months ago
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,878 Updated this week
- Blazing fast framework for fine-tuning similarity learning models ☆657 Updated last week
- 🦘 Explore multimedia datasets at scale ☆1,055 Updated 4 months ago
- The simplest way to serve AI/ML models in production ☆974 Updated this week
- Backend that powers the dataset viewer on Hugging Face dataset pages through a public API. ☆742 Updated this week
- An easy way to extract information from documents ☆1,748 Updated last year
- An open-source ML pipeline development platform ☆985 Updated 3 months ago
- 🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞 ☆719 Updated last year
- Represent, send, store and search multimodal data ☆3,040 Updated 3 weeks ago
- OCR, Archive, Index and Search: Implementation agnostic OCR framework. ☆223 Updated last year
- Fast model deployment on any cloud 🚀 ☆176 Updated last year
- Labelling platform for text using weak supervision. ☆262 Updated 2 years ago
- Natural language Pandas queries and data generation powered by GPT-3 ☆198 Updated last year
- A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagem… ☆2,119 Updated 2 months ago
- Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc ☆387 Updated 9 months ago
- Creative interactive views of any dataset. ☆838 Updated 3 months ago
- UnionML: the easiest way to build and deploy machine learning microservices ☆335 Updated last year
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆861 Updated last year
- dstack is an open-source alternative to Kubernetes and Slurm, designed to simplify GPU allocation and AI workload orchestration for ML te… ☆1,748 Updated this week
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆922 Updated 7 months ago
- Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand … ☆1,261 Updated 6 months ago
- Transforms PDF, Documents and Images into Enriched Structured Data ☆5,948 Updated last year
- A web-based document annotation tool, powered by GPT-4 ☆258 Updated last year
- ML pipeline orchestration and model deployments on Kubernetes. ☆435 Updated last year
- Build and share data reports in 100% Python ☆1,393 Updated last year
- Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications,… ☆285 Updated 2 months ago
- Build data pipelines, the easy way 🛠️ ☆4,116 Updated last year
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,439 Updated last week
- Neural Search ☆328 Updated 10 months ago