☆24 · Dec 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for data-engineering-test
Users that are interested in data-engineering-test are comparing it to the libraries listed below
- This is an ETL application on AWS with general open sales and customer data that you can find here: https://github.com/camposvinicius/dat… ☆18 · Feb 7, 2022 · Updated 4 years ago
- This repo provides the Kubernetes Helm chart for deploying PySpark Notebook. ☆17 · Nov 16, 2022 · Updated 3 years ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres + Debezium CDC, MySQL,… ☆28 · May 19, 2025 · Updated 9 months ago
- Learning project ☆13 · Feb 4, 2023 · Updated 3 years ago
- ☆32 · Aug 18, 2021 · Updated 4 years ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 · Oct 20, 2022 · Updated 3 years ago
- REDCap Electronic Data - I (Ingester/Integrator/Importer) ☆10 · Oct 15, 2018 · Updated 7 years ago
- A Scala library for locality sensitive hashing ☆14 · Aug 1, 2018 · Updated 7 years ago
- ☆11 · Mar 27, 2024 · Updated last year
- Did public figures get richer between two elections? ☆10 · Dec 8, 2022 · Updated 3 years ago
- ☆41 · Jul 23, 2024 · Updated last year
- MemCachier Django usage example ☆13 · Nov 29, 2018 · Updated 7 years ago
- Movie Recommendation System Using Spark ML, Akka and Cassandra ☆12 · Oct 4, 2019 · Updated 6 years ago
- Class materials from cohort 6 of the How data engineering bootcamp ☆12 · Sep 16, 2021 · Updated 4 years ago
- Ansible playbook for managing Galaxy infrastructure. For the playbook managing Galaxy itself, see https://github.com/galaxyproject/usegal… ☆12 · Updated this week
- ☆10 · May 24, 2024 · Updated last year
- Proof of concept of a big data cluster using open source tools ☆11 · Apr 10, 2024 · Updated last year
- A Python + NLTK Text Mining Open Course ☆11 · Feb 3, 2018 · Updated 8 years ago
- Sanic application fully integrated with Motor + UMongo ☆10 · Aug 6, 2022 · Updated 3 years ago
- We aim to make it easier for biomedical researchers to access and customize synthetic sequence data for the purpose of sharing and testin… ☆11 · Jul 22, 2019 · Updated 6 years ago
- Spending One Hundred days on blogging about cloud computing ☆14 · Jul 12, 2022 · Updated 3 years ago
- A minimal buildpack for Pipenv. ☆11 · Feb 13, 2019 · Updated 7 years ago
- My notes while learning data science ☆10 · May 26, 2018 · Updated 7 years ago
- Helium hotspot stats & leaderboards for your Discord server ☆13 · Jan 4, 2022 · Updated 4 years ago
- OSS data stack project ☆12 · Apr 8, 2025 · Updated 10 months ago
- Capstone Project: Predicting default in P2P lending ☆12 · Feb 27, 2017 · Updated 9 years ago
- ☆10 · Dec 8, 2022 · Updated 3 years ago
- A simple Python command line utility using the Vagalume API to search and show song lyrics. ☆10 · Dec 26, 2022 · Updated 3 years ago
- Ion Torrent SDK Docs ☆10 · Dec 26, 2022 · Updated 3 years ago
- Example repository demonstrating Python 3.7 on Travis CI ☆12 · Jan 30, 2019 · Updated 7 years ago
- Apify public actor for scraping Airbnb homes. ☆11 · Dec 10, 2022 · Updated 3 years ago
- Kinesis Connector for Spark Structured Streaming ☆11 · Dec 26, 2023 · Updated 2 years ago
- A bunch of low-level basic methods for data processing and monitoring with Scala Spark ☆10 · Jun 29, 2018 · Updated 7 years ago
- Uniquely identifying Brazilian public persons ☆11 · Oct 20, 2017 · Updated 8 years ago
- Using PostgreSQL to store and fetch data, tune queries, and design efficient database structures! ☆12 · Mar 31, 2021 · Updated 4 years ago
- ☆12 · Jun 18, 2022 · Updated 3 years ago
- Scripts for downloading the open data of the state of Paraíba, available at https://dados.pb.gov.br/ ☆10 · Dec 8, 2022 · Updated 3 years ago
- Apache Hadoop - Docker distribution based on CentOS 7 and Oracle Java 8 ☆12 · Feb 20, 2018 · Updated 8 years ago
- Convert English to morse code and vice versa. ☆10 · Feb 3, 2024 · Updated 2 years ago