code-kern-ai / refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
☆1,431 Updated 3 months ago
Alternatives and similar repositories for refinery:
Users who are interested in refinery are comparing it to the libraries listed below.
- Blazing fast framework for fine-tuning similarity learning models ☆656 Updated 2 months ago
- The simplest way to serve AI/ML models in production ☆966 Updated this week
- Open-source natural language enrichments at your fingertips. ☆456 Updated 2 months ago
- An open-source ML pipeline development platform ☆985 Updated 2 months ago
- 🦘 Explore multimedia datasets at scale ☆1,052 Updated 3 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,055 Updated 6 months ago
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,869 Updated this week
- An easy way to extract information from documents ☆1,743 Updated last year
- 🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞 ☆719 Updated last year
- Backend that powers the dataset viewer on Hugging Face dataset pages through a public API. ☆733 Updated this week
- Curated list of open source tooling for data-centric AI on unstructured data. ☆713 Updated last year
- A Simple Bulk Labelling Tool ☆569 Updated 2 months ago
- AI code-writing assistant that understands data content ☆2,252 Updated last year
- The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️ ☆3,557 Updated 6 months ago
- Visualise your Kedro data and machine-learning pipelines and track your experiments. ☆706 Updated this week
- A reactive Python kernel for Jupyter notebooks. ☆1,202 Updated last week
- A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagem… ☆2,112 Updated 2 months ago
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆863 Updated last year
- Labelling platform for text using weak supervision. ☆260 Updated 2 years ago
- Build data pipelines, the easy way 🛠️ ☆4,113 Updated last year
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆884 Updated 11 months ago
- Build and share data reports in 100% Python ☆1,393 Updated last year
- Natural Intelligence is still a pretty good idea. ☆806 Updated 8 months ago
- ☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes ☆293 Updated 3 months ago
- Fast model deployment on any cloud 🚀 ☆176 Updated last year
- An end-to-end implementation of intent prediction with Metaflow and other cool tools ☆858 Updated last year
- dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem… ☆1,744 Updated this week
- Interactively explore unstructured datasets from your dataframe. ☆1,157 Updated last month
- The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to s… ☆696 Updated last week
- ☆704 Updated 2 years ago