irbigdata / data-dockerfiles
A curated list of docker-compose files prepared for testing data engineering tools, databases, and open source libraries.
☆580 · Updated last year
Alternatives and similar repositories for data-dockerfiles
Users who are interested in data-dockerfiles are comparing it to the libraries listed below
- A comprehensive Spark guide collated from multiple sources that can be referred to learn more about Spark or as an interview refresher. ☆671 · Updated 3 years ago
- The tools and sample needed to learn the Docker ☆497 · Updated last year
- A curated list of awesome DataOps tools ☆188 · Updated 7 months ago
- Run popular commandline tools within docker ☆1,264 · Updated last year
- Accumulated knowledge and experience in the field of Data Engineering ☆868 · Updated 2 years ago
- Compare tables within or across databases ☆2,968 · Updated last year
- Selfhosted tech starter pack for development of new project or startup ☆1,239 · Updated last year
- The Data Explorer gives you fast, safe access to data stored in Cassandra, Dynomite, and Redis. ☆433 · Updated 2 years ago
- Write python locally, execute SQL in your data warehouse ☆269 · Updated 2 years ago
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ☆488 · Updated 2 years ago
- 📙 Awesome Data Catalogs and Observability Platforms. ☆848 · Updated last month
- AWS solution architect with terraform modules ☆20 · Updated 2 years ago
- The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-host… ☆2,067 · Updated this week
- Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. ☆369 · Updated last week
- The Open-Source Enterprise Data Platform in a single Portal ☆238 · Updated this week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆2,089 · Updated last week
- A curated collection of helpful SQL queries and functions, maintained by Count. ☆203 · Updated 3 years ago
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆342 · Updated last week
- Sync DAG changes from Git to Airflow ☆57 · Updated last week
- A curated list of open source tools used in analytics platforms and data engineering ecosystem ☆329 · Updated 2 months ago
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines ☆130 · Updated 2 years ago
- a collection of resources and blogs about Apache Superset ☆82 · Updated 3 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,563 · Updated last year
- First open-source data discovery and observability platform. We make a life for data practitioners easy so you can focus on your business… ☆1,317 · Updated 3 months ago
- Template for a data contract used in a data mesh. ☆472 · Updated last year
- A curated list of resources about Snowflake ☆239 · Updated last year
- New Generation Opensource Data Stack Demo ☆431 · Updated 2 years ago
- Collection covers kubernetes exercises categorized topics-wise and referred back to the individual Kubernetes certification exams. ☆263 · Updated 7 months ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆246 · Updated 3 months ago
- Awesome Data Engineering ☆17 · Updated 3 months ago