kvh / dcp
Universal data copy
☆9 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for dcp
- Data pipelines from re-usable components ☆106 · Updated last year
- Highly concurrent and fast content processing for Mighty Inference Server ☆10 · Updated last year
- Build your feature store with macros right within your dbt repository ☆37 · Updated last year
- Graph Engine for Exploration and Search ☆40 · Updated 9 months ago
- Contextual Multi-Armed Bandit Platform for Scoring, Ranking & Decisions ☆21 · Updated last year
- Python binding for DataFusion ☆59 · Updated 2 years ago
- BoilingData JS client (NodeJS and Browsers) ☆19 · Updated last month
- the open-source product analytics tool for the modern data stack ☆28 · Updated 2 years ago
- 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ☆65 · Updated this week
- 🛡️ Managed isolated environments for Python ☆78 · Updated last week
- spaCy entry points for Curated Transformers ☆24 · Updated last month
- Arrow, pydantic style ☆82 · Updated last year
- Magniv Core - A Python-decorator based job orchestration platform. Avoid responsibility handoffs by abstracting infra and DevOps. ☆77 · Updated 3 months ago
- A python library bakeoff for medium sized datasets ☆24 · Updated last year
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆25 · Updated this week
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker… ☆18 · Updated 11 months ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypothesis and visualizing results in a web browser ☆33 · Updated last year
- A utility for labeling clusters of text data. ☆28 · Updated 3 years ago
- Examples showing real-life use cases for fal + dbt ☆22 · Updated 2 years ago
- Examples for using Amazon SageMaker components in Kubeflow Pipelines ☆22 · Updated 4 years ago
- Search for similar short strings ☆53 · Updated 4 years ago
- EmbeDB is a small Python wrapper around LMDB built as key-value storage for embeddings. ☆13 · Updated 2 years ago
- Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen… ☆17 · Updated last year
- Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with function… ☆91 · Updated 2 years ago
- Convert JSON files to Apache Parquet. ☆46 · Updated last year
- Efficient BM25 with DuckDB 🦆 ☆29 · Updated 3 weeks ago
- A small Python module containing quick utility functions for standard ETL processes. ☆33 · Updated last week