Delta-Lake, ETL, Spark, Airflow
☆48 Oct 9, 2022 Updated 3 years ago
Alternatives and similar repositories for AcidOnSpark-ETL
Users that are interested in AcidOnSpark-ETL are comparing it to the libraries listed below.
- Project with Airflow + Spark + MinIO + Postgres + Python3.8 ☆28 Sep 9, 2022 Updated 3 years ago
- ☆14 Oct 10, 2025 Updated 5 months ago
- ☆13 May 11, 2025 Updated 10 months ago
- ☆41 Jan 24, 2023 Updated 3 years ago
- ☆22 Feb 5, 2024 Updated 2 years ago
- Agent models implemented with Pyro ☆11 Jul 11, 2023 Updated 2 years ago
- ☆270 Oct 23, 2024 Updated last year
- Spark and Hive docker containers sharing a common MySQL metastore ☆26 Apr 17, 2020 Updated 5 years ago
- This repo was created for Machine Learning with Python training sessions that run for 3-5 days. ☆20 Oct 9, 2020 Updated 5 years ago
- a simple lakeFS webhook for pre-commit and pre-merge validation of data objects ☆12 Nov 9, 2023 Updated 2 years ago
- A Python client for the Enigma API. ☆14 Dec 7, 2022 Updated 3 years ago
- A demo instance of mage for pulling sample data from a public Google pub/sub topic and transforming with dbt. ☆12 Jan 5, 2024 Updated 2 years ago
- ☆41 Jul 4, 2022 Updated 3 years ago
- Build a data pipeline with Apache Airflow ☆11 May 7, 2021 Updated 4 years ago
- The elegance of Airflow + the power of AWS ☆51 Feb 5, 2024 Updated 2 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆29 Aug 8, 2020 Updated 5 years ago
- Google Cloud Platform solution that provides an event driven process that flattens (unnests) Google Analytics 360 data that has been expo… ☆16 Sep 9, 2021 Updated 4 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆102 Jan 31, 2023 Updated 3 years ago
- Amazon EMR Serverless and Amazon MSK Serverless Demo ☆13 Jul 31, 2022 Updated 3 years ago
- ☆10 Jul 21, 2022 Updated 3 years ago
- Collection of dockerized ETL jobs managed by data engineering. ☆21 Updated this week
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow. ☆27 Mar 25, 2024 Updated 2 years ago
- ☆56 Aug 14, 2024 Updated last year
- RedditR for Content Engagement and Recommendation ☆18 Dec 21, 2017 Updated 8 years ago
- A CALDERA plugin ☆18 Jul 28, 2020 Updated 5 years ago
- code for writing twitter bots in several languages ☆13 Dec 31, 2015 Updated 10 years ago
- ☆24 Jul 24, 2024 Updated last year
- Nyc_Taxi_Data_Pipeline - DE Project ☆139 Oct 21, 2024 Updated last year
- Documentation of Hologres ☆13 Aug 18, 2020 Updated 5 years ago
- Wrapper for SurveyGizmo's restful API service ☆16 Sep 24, 2020 Updated 5 years ago
- Ansible playbooks for Apache Spark on kube ☆27 Jul 20, 2017 Updated 8 years ago
- A minimal docker compose setup for experimenting with cloud agnostic Lakehouse Architectures Apache Spark with Hive Metastore + Delta Lak… ☆34 Apr 17, 2024 Updated last year
- Docker with Airflow and Spark standalone cluster ☆263 Aug 5, 2023 Updated 2 years ago
- A two part tutorial for Ray Core APIs and Ray Serve for Model Deployment ☆21 Jun 9, 2022 Updated 3 years ago
- Improving predictions of Bayesian neural nets via local linearization, AISTATS 2021 ☆15 Dec 30, 2022 Updated 3 years ago
- High Performance Go Driver for Bytehouse ☆14 Jun 11, 2025 Updated 9 months ago
- This repository is no longer maintained. ☆15 Mar 10, 2022 Updated 4 years ago
- ☆17 Jun 8, 2025 Updated 9 months ago
- universal-datalakehouse-postgres-ingestion-deltastreamer ☆11 Apr 7, 2024 Updated last year