Data pipelines from re-usable components
☆107Nov 12, 2025Updated 4 months ago
Alternatives and similar repositories for patterns-devkit
Users who are interested in patterns-devkit are comparing it to the libraries listed below.
- Multithreaded 64-bit C++ files for processing numba arrays☆18Apr 23, 2024Updated last year
- A Pub/Sub for Tables based data integration platform, to discover, publish, modify and consume data effortlessly.☆39Feb 25, 2026Updated 3 weeks ago
- Plugin for Intake to read from SQL servers☆15May 29, 2023Updated 2 years ago
- 🌳 A compressed rank/select dictionary exploiting approximate linearity and repetitiveness.☆15Jun 28, 2022Updated 3 years ago
- ☆15Apr 4, 2021Updated 4 years ago
- Curiosity based exploration and playing in RL with Gym Robotics envs.☆12Sep 25, 2018Updated 7 years ago
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl…☆815Aug 10, 2025Updated 7 months ago
- Data Catalog for Databases and Data Warehouses☆36Jan 15, 2024Updated 2 years ago
- Build your feature store with macros right within your dbt repository☆39Dec 16, 2022Updated 3 years ago
- Performance Tests for string_view, C++17☆14Jun 18, 2019Updated 6 years ago
- An interpreted relational query language that compiles to SQL.☆627Aug 17, 2022Updated 3 years ago
- Numeric and scientific computing on GPUs for Python with a NumPy-like API☆93Sep 1, 2021Updated 4 years ago
- Singer tap for getting CSV and XLS(X) data out of Amazon S3☆12Feb 12, 2025Updated last year
- General-purpose, fast, stateless, and deterministic feature extractor written in golang for use in machine learning☆12Mar 17, 2018Updated 8 years ago
- A dbt adapter for TiDB☆15Dec 14, 2023Updated 2 years ago
- Python library for building and sharing dataframe-agnostic, sklearn-style transformers and ml models for data science competitions.☆28Mar 10, 2026Updated last week
- Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable.☆171Feb 10, 2024Updated 2 years ago
- Streaming reactive and dataflow graphs in Python☆460Mar 3, 2026Updated 2 weeks ago
- The open-source Useful SDK. One python decorator in the Useful library allows for full observability of Python functions within an ETL.☆19Jan 11, 2024Updated 2 years ago
- ☆17Feb 7, 2025Updated last year
- A scikit-learn-compatible module for Isolation-based anomaly detection using nearest-neighbor ensembles☆12Aug 30, 2023Updated 2 years ago
- A framework for rapid development of robust data pipelines following a simple design pattern☆27Feb 26, 2024Updated 2 years ago
- Beneath is a serverless real-time data platform ⚡️☆84Feb 18, 2022Updated 4 years ago
- Run greatexpectations.io on ANY SQL Engine using REST API. Supported by FastAPI, Pydantic and SQLAlchemy as best data quality tool☆14Dec 12, 2025Updated 3 months ago
- Data Mesh Architecture☆85Oct 15, 2025Updated 5 months ago
- A JWT-based authentication & authorization service designed for microservices☆14Feb 8, 2024Updated 2 years ago
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines☆67Jan 20, 2023Updated 3 years ago
- Read fixed width data files with Python 3☆14Mar 16, 2026Updated last week
- Statistical Automated Bot Protection☆35Mar 16, 2026Updated last week
- Centralized whale instance using github actions, sourcing metadata from bigquery-public-data.☆18Jun 15, 2024Updated last year
- Official dbt adapter for Vertica☆28Jun 13, 2025Updated 9 months ago
- Apache Arrow PostgreSQL connector☆62Feb 12, 2024Updated 2 years ago
- C++ asynchronous interface for gRPC based on https://github.com/3rdparty/eventuals.☆20Mar 5, 2022Updated 4 years ago
- A tool for converting FERC filings published in XBRL into SQLite databases☆16Mar 16, 2026Updated last week
- A Table format agnostic data sharing framework☆42Feb 4, 2024Updated 2 years ago
- The Data Integration Library project provides a library of generic components based on a multi-stage architecture for data ingress and eg…☆32Jun 6, 2025Updated 9 months ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines☆122Updated this week
- Quilt is a data mesh for connecting people with actionable data☆1,358Mar 16, 2026Updated last week
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.☆302Mar 12, 2026Updated last week