code-kern-ai / refinery
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
☆1,438 Updated 5 months ago
Alternatives and similar repositories for refinery
Users who are interested in refinery are comparing it to the libraries listed below.
- Open-source natural language enrichments at your fingertips. ☆459 Updated 4 months ago
- Blazing fast framework for fine-tuning similarity learning models ☆656 Updated last month
- The Virtual Feature Store. Turn your existing data infrastructure into a feature store. ☆1,900 Updated 2 weeks ago
- The simplest way to serve AI/ML models in production ☆992 Updated this week
- The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️ ☆3,583 Updated this week
- What's in your data? Extract schema, statistics and entities from datasets ☆1,492 Updated 2 months ago
- Build and share data reports in 100% Python ☆1,390 Updated last year
- An easy way to extract information from documents ☆1,754 Updated 2 years ago
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆861 Updated last year
- A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagem… ☆2,135 Updated 4 months ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,509 Updated this week
- Backend that powers the dataset viewer on Hugging Face dataset pages through a public API. ☆754 Updated this week
- Doubt your data, find bad labels. ☆513 Updated 10 months ago
- Build data pipelines, the easy way 🛠️ ☆4,124 Updated last year
- Gain clues from clustering! ☆313 Updated 10 months ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,082 Updated 2 months ago
- 🦘 Explore multimedia datasets at scale ☆1,057 Updated 5 months ago
- Natural Intelligence is still a pretty good idea. ☆813 Updated 10 months ago
- AI code-writing assistant that understands data content ☆2,264 Updated last year
- Malloy is an experimental language for describing data relationships and transformations. ☆2,158 Updated this week
- Visualise your Kedro data and machine-learning pipelines and track your experiments. ☆716 Updated this week
- Efficient few-shot learning with Sentence Transformers ☆2,486 Updated last month
- skweak: A software toolkit for weak supervision applied to NLP tasks ☆925 Updated 9 months ago
- Fast model deployment on any cloud 🚀 ☆176 Updated last year
- Weave is a toolkit for developing AI-powered applications, built by Weights & Biases. ☆892 Updated this week
- Interactively explore unstructured datasets from your dataframe. ☆1,175 Updated 2 weeks ago
- With sequence-learn, you can build models for named entity recognition as quickly as if you were building a sklearn classifier. ☆22 Updated 2 years ago
- 🧹 Python package for text cleaning ☆979 Updated 2 years ago
- Scalable identity resolution, entity resolution, data mastering and deduplication using ML ☆1,032 Updated this week
- An open-source ML pipeline development platform ☆988 Updated 4 months ago