Big Data Workshop in Spanish
☆23Nov 9, 2023Updated 2 years ago
Alternatives and similar repositories for bigdata-workshop-es
Users that are interested in bigdata-workshop-es are comparing it to the libraries listed below
- This is a GitHub for all of my NiFi Templates☆48Aug 27, 2020Updated 5 years ago
- Data encoding library for Haskell.☆12Aug 4, 2023Updated 2 years ago
- code examples of my talk☆11Jun 25, 2019Updated 6 years ago
- ☆11Sep 23, 2019Updated 6 years ago
- 3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow☆12Aug 17, 2019Updated 6 years ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources☆18Jun 28, 2021Updated 4 years ago
- A project to develop a fully distributed MapReduce library for Haskell which makes using the MapReduce framework totally transparent for …☆20Nov 12, 2011Updated 14 years ago
- A 4 week program to get started with Data Science. Useful for beginners who want to get started by themselves.☆14Jun 15, 2016Updated 9 years ago
- limit spend on AWS based on a tag, stop compute instances at threshold☆13Apr 13, 2020Updated 5 years ago
- Reference Architecture to automate the use of S3 Express One Zone as a caching layer for S3 Regional Buckets.☆14Apr 14, 2025Updated 10 months ago
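- Easy Scheduler is a distributed workflow task scheduling system that mainly addresses the tangled dependencies of ETL jobs in data development and the inability to monitor task health at a glance. Easy Scheduler assembles Tasks as a DAG stream, supports real-time monitoring of task run status, and also supports retry, recovery from failure at a specified node, pause, and Kil…☆10Apr 9, 2019Updated 6 years ago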
- Connects Campaign Manager to the RTB4FREE bidders☆13Nov 16, 2022Updated 3 years ago
- Slowly Changing Dimension type 2 using Hive query language using exclusive join technique with ORC Hive tables, partitioned and clustered…☆16May 11, 2019Updated 6 years ago
- Project defining the docker image that will support examples of algorithms created in this organization☆13Oct 22, 2017Updated 8 years ago
- ☆15Nov 26, 2020Updated 5 years ago
- Statistical and exploratory Analysis of Cricket Data☆12Oct 19, 2015Updated 10 years ago
- dbt-github-workflow is a boilerplate that contains all the necessary configurations to set up a simple CI/CD pipeline for your data model…☆17Mar 27, 2022Updated 3 years ago
- ☆12Aug 11, 2021Updated 4 years ago
- An ETL tool for converting untyped CSV to parquet. Also triggers data lake updates.☆15Oct 29, 2021Updated 4 years ago
- HTML5 Canvas Integration for Reflex Dom☆13Dec 19, 2019Updated 6 years ago
- OpenRTB v2.5 and OpenRTB Dynamic Native Ads v1.2 types for rust.☆21Feb 2, 2023Updated 3 years ago
- SQS-based Python SDK for streaming data in realtime to the Panoply platform☆17Jun 22, 2025Updated 8 months ago
- Solutions to TopCoder problems, written in Python.☆16Mar 4, 2012Updated 14 years ago
- Data Brewery is an ETL (Extract-Transform-Load) program that connect to many data sources (cloud services, databases, ...) and manage dat…☆16Jan 21, 2021Updated 5 years ago
- real time log event processing using spark, kafka & cassandra☆13Dec 4, 2014Updated 11 years ago
- Demonstrations of DBT☆16Aug 5, 2019Updated 6 years ago
- Run Airflow on Kubernetes. This repository contains scripts to 1) run a multinode Kubernetes cluster on local machine using KinD, 2) prepa…☆17Apr 12, 2023Updated 2 years ago
- Minikube for big data with Scala and Spark☆15Oct 28, 2019Updated 6 years ago
- Source code for the "Scala For Beginners" book. https://leanpub.com/scalaforbeginners/☆14Oct 14, 2019Updated 6 years ago
- Docker for airflow with mysql as backend☆12Nov 15, 2018Updated 7 years ago
- Generative Art Experiments using Haskell, GHCJS, and Reflex (FRP)☆18Mar 16, 2019Updated 6 years ago
- A table-type dbt materialization for Snowflake to enable Time Travel☆22Jan 12, 2026Updated last month
- Helper package for working with SVG in Reflex☆21Mar 24, 2023Updated 2 years ago
- ☆18Nov 9, 2025Updated 4 months ago
- Infrastructure for Big Data: Hadoop + NiFi + Spark + Hive using Docker☆20Jan 5, 2026Updated 2 months ago
- Any Airflow project day 1, you can spin up a local desktop Kubernetes Airflow environment AND one in Google Cloud Composer with tested da…☆114Sep 21, 2023Updated 2 years ago
- DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, aud…☆29Feb 14, 2026Updated 3 weeks ago
- Finance 🏦 Data Builder 🛠️ @ postgres 🐘☆22Feb 11, 2021Updated 5 years ago
- Spark Streaming HBase Example☆22Mar 16, 2016Updated 9 years ago