fscm / terraform-module-aws-spark
Terraform Module to create an Apache Spark cluster on AWS
☆16Updated 4 years ago
Alternatives and similar repositories for terraform-module-aws-spark
Users that are interested in terraform-module-aws-spark are comparing it to the libraries listed below
- Terraform module for a PostgreSQL-backed Apache Airflow instance☆24Updated 7 years ago
- T4 is now in production as Quilt 3☆64Updated 6 years ago
- CLI tool to launch Spark jobs on AWS EMR☆67Updated 2 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to…☆29Updated last year
- ☕⛵WIP PySpark dependency management☆22Updated 7 years ago
- Deployment tools/scripts for Metaflow!☆56Updated 2 years ago
- Ansible role to deploy and configure Airflow☆41Updated last week
- Repository for making a GitHub Action for deploying to Kubeflow.☆35Updated 3 years ago
- Puppet module to provision Airbnb's Airflow☆20Updated 3 years ago
- Airflow workflow management platform chef cookbook.☆70Updated 6 years ago
- An emerging widget for exploring RESTful APIs in Jupyter notebooks.☆29Updated 7 months ago
- An ML project template with sensible defaults☆39Updated 3 years ago
- Public repository for the Search Fundamentals course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/cour…☆45Updated 2 years ago
- The open source version of the Amazon Redshift Getting Started Guide.☆15Updated 2 years ago
- The open source version of the Amazon Athena documentation. To submit feedback & requests for changes, submit issues in this repository, …☆84Updated 2 years ago
- Small Docker image with Python Machine Learning tools (~180MB) https://hub.docker.com/r/frolvlad/alpine-python-machinelearning/☆82Updated 9 months ago
- Example templates for the delivery of custom ML solutions to production so you can get started quickly without having to make too many de…☆74Updated last year
- Build and deploy a serverless data pipeline on AWS with no effort.☆111Updated 2 years ago
- An example PySpark project with pytest☆17Updated 8 years ago
- The sane way of building a data layer in Airflow☆24Updated 6 years ago
- A GitHub API client to extract events and actions, and load them into a database☆28Updated 4 years ago
- Dremio Flight connector. Access Dremio using Arrow flight☆39Updated 5 years ago
- Unit and integration testing with PySpark can be tough to figure out, let's make that easier.☆23Updated 10 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform☆47Updated last year
- Data Catalog for Databases and Data Warehouses☆36Updated 2 years ago
- Getting Great Expectations set up to run on Databricks with Spark DataFrames.☆13Updated 3 years ago
- CLI tool for syncing a Databricks folder structure with a local git repo.☆17Updated last year
- Composable filesystem hooks and operators for Apache Airflow.☆17Updated 4 years ago
- 🎯 kettle is a CLI tool for creating and deploying cloud functions & docker containers for machine learning☆32Updated 3 years ago
- A solution enabling customers to quickly deploy an architecture to identify and mask sensitive health data☆26Updated 2 years ago