code-kern-ai / refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
☆1,469 Updated last year
Alternatives and similar repositories for refinery
Users who are interested in refinery are comparing it to the libraries listed below:
- The simplest way to serve AI/ML models in production ☆1,100 Updated this week
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,960 Updated 5 months ago
- 🦘 Explore multimedia datasets at scale ☆1,060 Updated last year
- Open-source natural language enrichments at your fingertips. ☆461 Updated 11 months ago
- Blazing fast framework for fine-tuning similarity learning models ☆661 Updated 2 months ago
- An open-source ML pipeline development platform ☆1,001 Updated 11 months ago
- An easy way to extract information from documents ☆1,784 Updated 2 years ago
- Backend that powers the dataset viewer on Hugging Face dataset pages through a public API. ☆835 Updated last week
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆926 Updated last year
- What's in your data? Extract schema, statistics and entities from datasets ☆1,535 Updated 3 months ago
- Obsei is a low code AI powered automation tool. It can be used in various business flows like social listening, AI based alerting, brand … ☆1,367 Updated last month
- 🐶 A tool to package, serve, and deploy any ML model on any platform. Archived to be resurrected one day🤞 ☆719 Updated 2 years ago
- Represent, send, store and search multimodal data ☆3,113 Updated 6 months ago
- 1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems. ☆952 Updated 11 months ago
- A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagem… ☆2,389 Updated 3 months ago
- Build and share data reports in 100% Python ☆1,399 Updated 2 years ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,132 Updated this week
- A web-based document annotation tool, powered by GPT-4 ☆265 Updated last year
- Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax. ☆968 Updated 10 months ago
- A Python vector database you just need - no more, no less. ☆632 Updated last year
- 🦙 Integrating LLMs into structured NLP pipelines ☆1,358 Updated 11 months ago
- Labelling platform for text using weak supervision. ☆261 Updated 3 years ago
- Fuzzy string matching, grouping, and evaluation. ☆786 Updated 5 months ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,798 Updated last week
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆861 Updated 2 years ago
- Build data pipelines, the easy way 🛠️ ☆4,147 Updated 2 years ago
- Distribute and run AI workloads on Kubernetes magically in Python, like PyTorch for ML infra. ☆1,140 Updated this week
- Neural Search ☆334 Updated last year
- dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or o… ☆1,993 Updated this week
- Doubt your data, find bad labels. ☆517 Updated last year