A Python package to create a database on the platform using our MoJ data warehousing framework
☆21 · Feb 11, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for etl_manager
Users who are interested in etl_manager are comparing it to the libraries listed below
- Interactive notebooks containing demonstration code of the splink library ☆40 · Updated this week
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 · Feb 8, 2023 · Updated 3 years ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆18 · Feb 11, 2026 · Updated 3 weeks ago
- A PySpark lib to validate data quality ☆18 · Nov 11, 2022 · Updated 3 years ago
- A Scalable Data Cleaning Library for PySpark. ☆29 · Apr 4, 2019 · Updated 6 years ago
- HDF masterclass materials ☆29 · Mar 28, 2016 · Updated 9 years ago
- Anonymizing Library for Apache Spark ☆31 · Nov 9, 2023 · Updated 2 years ago
- Python Package to Share/Edit Pandas/Polars DF with web interface! ☆11 · Jun 10, 2025 · Updated 8 months ago
- An NLP framework to find hate speech comments in a comments corpus. ☆11 · Dec 8, 2022 · Updated 3 years ago
- ☆11 · Nov 26, 2024 · Updated last year
- This project aims to load the UK rail timetable and station data provided by the Association of Train Operating Companies (at data.atoc.o… ☆10 · May 28, 2022 · Updated 3 years ago
- Collect and aggregate on spark events for profitz ☆10 · Apr 22, 2022 · Updated 3 years ago
- A course about Terraform ☆11 · Apr 13, 2021 · Updated 4 years ago
- Apply GOV.UK styled components and formats in R Shiny ☆52 · Feb 24, 2026 · Updated last week
- Formatting RMarkdown into govspeak for publishing on gov.uk ☆11 · Aug 28, 2025 · Updated 6 months ago
- Asynchronous actions for PySpark ☆48 · Dec 2, 2021 · Updated 4 years ago
- A Python wrapper for Affinity (CRM platform). ☆14 · Jul 12, 2018 · Updated 7 years ago
- [Deprecated] This solution helps customers reduce operational complexity and enables administrators to quickly create manual, event-based… ☆14 · Mar 8, 2023 · Updated 3 years ago
- A Configuration System for Airflow ☆16 · Updated this week
- Convenient pyarrow operations following the Pandas API ☆45 · Jan 30, 2022 · Updated 4 years ago
- An ever growing collection of patterns for re-use ☆11 · Feb 8, 2015 · Updated 11 years ago
- Associated blog post - https://tristanrhodes.com/blog/Adventures-in-Algorithmic-Trading-on-the-Runescape-Grand-Exchange ☆10 · Oct 14, 2024 · Updated last year
- An R package that implements fast searching for multiple keywords in multiple texts. ☆11 · Feb 5, 2025 · Updated last year
- Framework for simpler Spark Pipelines ☆11 · Feb 27, 2026 · Updated last week
- ☆13 · Mar 4, 2022 · Updated 4 years ago
- Futugram - A small web application that allows a user to upload photos and share them with the wider world. ☆12 · Dec 16, 2018 · Updated 7 years ago
- A collection of Python utility functions ☆11 · Feb 11, 2026 · Updated 3 weeks ago
- Similarity between graph nodes based on local information with PySpark ☆10 · Sep 30, 2022 · Updated 3 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆10 · Feb 2, 2024 · Updated 2 years ago
- ☆11 · Apr 2, 2021 · Updated 4 years ago
- Code snippets used for http://thisdataguy.com ☆14 · Oct 13, 2020 · Updated 5 years ago
- Learning GitLab, published by Packt ☆13 · Jan 18, 2021 · Updated 5 years ago
- A local docker image with Zeppelin 0.10, AWS Glue v3 ☆17 · Jan 18, 2022 · Updated 4 years ago
- A nicer deparse ☆12 · Jun 22, 2017 · Updated 8 years ago
- Higher education affordability data ☆10 · Jul 18, 2019 · Updated 6 years ago
- Terraform ECS module ☆15 · Aug 22, 2022 · Updated 3 years ago
- An intelligent predictive text entry platform. Mirror of git://git.code.sf.net/p/presage/presage Please send reports to the SourceForge b… ☆11 · Aug 17, 2015 · Updated 10 years ago
- ☆15 · Feb 12, 2026 · Updated 3 weeks ago
- This project is an example of using AWS Step functions to manage and track a series of AWS Batch jobs in N_TO_N mode. ☆15 · Jan 20, 2026 · Updated last month