Python data repo: Jupyter notebooks, Python scripts, and data.
☆552 · Dec 10, 2024 · Updated last year
Alternatives and similar repositories for pythondataanalysis
Users that are interested in pythondataanalysis are comparing it to the libraries listed below.
- ☆16 · May 29, 2023 · Updated 2 years ago
- ☆16 · Jan 16, 2025 · Updated last year
- An end-to-end data pipeline for building Data Lake and supporting report using Apache Spark. ☆16 · Jan 31, 2023 · Updated 3 years ago
- ☆11 · Oct 8, 2021 · Updated 4 years ago
- This project demonstrates how to build and automate an ETL pipeline written in Python and schedule it using open source Apache Airflow or… ☆20 · Aug 21, 2025 · Updated 7 months ago
- Distributed Data Systems with Azure Databricks, published by Packt ☆12 · Jan 18, 2023 · Updated 3 years ago
- trino + hive + minio with postgres in docker compose ☆27 · Aug 18, 2023 · Updated 2 years ago
- ☆16 · Mar 9, 2026 · Updated 3 weeks ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆19 · Apr 25, 2024 · Updated last year
- ☆26 · Sep 28, 2023 · Updated 2 years ago
- code snippet for analytics sessions ☆34 · May 17, 2022 · Updated 3 years ago
- ☆16 · Mar 12, 2025 · Updated last year
- ☆22 · Feb 5, 2024 · Updated 2 years ago
- Get introduced to Directed Acyclic Graphs (DAGs) through Dagster with a simple ML program ☆13 · Apr 19, 2023 · Updated 2 years ago
- A variation on a standard Decision Tree such as that in sklearn, where nodes may be based on an aggregation of multiple splits. ☆10 · May 24, 2024 · Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆145 · Jul 27, 2023 · Updated 2 years ago
- Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Jo… ☆39,324 · Mar 19, 2026 · Updated last week
- ☆25 · Nov 6, 2024 · Updated last year
- Acquiring and processing information on world's largest banks ☆18 · Jun 17, 2025 · Updated 9 months ago
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆24 · Apr 2, 2022 · Updated 3 years ago
- ☆16 · Jan 8, 2023 · Updated 3 years ago
- ☆14 · Sep 22, 2022 · Updated 3 years ago
- ETL pipeline using pyspark (Spark - Python) ☆117 · Apr 4, 2020 · Updated 5 years ago
- ☆10 · Aug 6, 2024 · Updated last year
- ☆14 · May 14, 2024 · Updated last year
- An example of a Dagster project with a possible folder structure to organize the assets, jobs, repositories, schedules, and ops. Also has… ☆102 · Nov 3, 2024 · Updated last year
- ☆47 · Feb 23, 2021 · Updated 5 years ago
- ☆146 · Jan 31, 2023 · Updated 3 years ago
- Lightweight Python wrapper around the DuckDB extension, httpserver (extension developed by @quackscience) ☆17 · Sep 24, 2025 · Updated 6 months ago
- Using Plotly to create a heatmap visualization of monthly and hourly data ☆13 · Aug 9, 2021 · Updated 4 years ago
- ☆67 · Sep 24, 2025 · Updated 6 months ago
- Building a Data Pipeline with an Open Source Stack ☆58 · Jun 27, 2025 · Updated 9 months ago
- ☆132 · Mar 16, 2026 · Updated last week
- Repository with the demos and data shared during the Databricks Journey Brasil webinars ☆19 · Jul 13, 2022 · Updated 3 years ago
- PyRapidML is an open source Python library which not only helps in automating Machine Learning Workflows but also helps in building end t… ☆14 · Aug 7, 2021 · Updated 4 years ago
- Sample project to demonstrate data engineering best practices ☆212 · Feb 24, 2024 · Updated 2 years ago
- New Generation Opensource Data Stack Demo ☆456 · Feb 6, 2023 · Updated 3 years ago
- ☆335 · Aug 13, 2024 · Updated last year
- ☆46 · Jul 6, 2024 · Updated last year