realpython / data-version-control
☆29 Updated last year
Alternatives and similar repositories for data-version-control
Users who are interested in data-version-control compare it to the repositories listed below.
- Rock Solid Python with Type Hints Course Student Materials ☆24 Updated 10 months ago
- bamboolib - template for creating your own binder notebook ☆21 Updated 3 years ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- This project is created to promote and advocate the use of FOSS machine learning. ☆45 Updated 2 weeks ago
- Cookiecutter template for a Python microservice. ☆55 Updated last year
- ☀️🦶 A lightweight framework for collaborative, open-source feature engineering ☆33 Updated 3 years ago
- Full stack data engineering tools and infrastructure set-up ☆53 Updated 4 years ago
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11 Updated 3 months ago
- Feature flags for python. ☆18 Updated 7 years ago
- Using the Parquet file format with Python ☆15 Updated last year
- Material for Talk Python Training course on Getting Started with Dask. ☆28 Updated 2 years ago
- A simple and streamlined Python script to extract and filter links from a remote HTML resource. ☆24 Updated 4 months ago
- Example project for building scalable data pipelines with Kedro and Ibis. ☆13 Updated last year
- Supporting content (slides and exercises) for the Pearson video series covering best practices for developing scalable applications with … ☆49 Updated 4 months ago
- ☆26 Updated 3 years ago
- Pandas Training © MetaSnake 2022, CC BY-NC ☆18 Updated 3 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆35 Updated 3 weeks ago
- Simple samples for writing ETL transform scripts in Python ☆22 Updated 3 years ago
- Low-code Python library enabling access to APIs, tools, data sources in seconds. ☆59 Updated 9 months ago
- This repository contains code to build an MVP search engine with google like interface. ☆15 Updated 4 years ago
- Official Python client SDK for Iggy.rs message streaming. ☆25 Updated 2 months ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypothesis and visualizing results in a web browser ☆33 Updated 2 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆32 Updated 3 years ago
- I saw this [Blog Post](https://www.morling.dev/blog/one-billion-row-challenge/) on a Billion Row challenge for Java so naturally I tried … ☆14 Updated last year
- Versatile Metrics Collection for Python ☆19 Updated last year
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 5 months ago
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- Feature selection for tabular datasets using advanced filter and wrapper methods ☆17 Updated 2 months ago
- 🛠 Self-hosted, fast, and consistent remote configuration for apps. ☆15 Updated 2 years ago
- BoilingData JS client (NodeJS and Browsers) ☆19 Updated 7 months ago