harrystech / notebook-server
Infrastructure code to run notebooks on some EC2 nodes
☆10 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for notebook-server
- CloudFormation templates for deploying Airflow in ECS ☆40 · Updated 5 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/ ) which creates a concu… ☆76 · Updated 6 years ago
- A collection of Airflow sample workflows for data processing on AWS ☆12 · Updated 6 years ago
- Unit and integration testing with PySpark can be tough to figure out; let's make that easier. ☆22 · Updated 9 years ago
- Create Parquet files from CSV ☆67 · Updated 7 years ago
- Ansible role for the installation of Apache Airflow ☆18 · Updated 4 months ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 · Updated last year
- Deploy dask-distributed on Google Container Engine using Kubernetes ☆40 · Updated 5 years ago
- Run SQL queries on AWS Athena from Jupyter notebooks ☆19 · Updated 5 years ago
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 · Updated 4 years ago
- Shoprunner Terraform provider - Open Source initiative ☆37 · Updated 5 years ago
- ☆53 · Updated last year
- Simple multi-threaded Kinesis Poster and Worker Python examples ☆69 · Updated 9 years ago
- Scripts and instructions to facilitate running Deep Learning Tasks on Amazon EMR ☆63 · Updated last year
- An extension for Jupyter notebooks that allows running notebooks inside a Docker container and converting them to runnable Docker images. ☆28 · Updated last year
- A Terraform template for provisioning Apache Airflow workflows on AWS ECS Fargate ☆14 · Updated 4 years ago
- PyAthenaJDBC is an Amazon Athena JDBC driver wrapper for the Python DB API 2.0 (PEP 249). ☆95 · Updated last year
- Deploy Dask on YARN clusters ☆69 · Updated 3 months ago
- Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabyt… ☆135 · Updated 2 years ago
- Design pattern for orchestrating an incremental data ingestion pipeline using AWS Step Functions from an on premise location into an Amaz… ☆28 · Updated 5 years ago
- transformpy is a Python 2/3 module for doing transforms on "streams" of data ☆29 · Updated 7 years ago
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 · Updated last year
- Example for an Airflow plugin ☆49 · Updated 8 years ago
- AWS Lambda functions - utilities ☆12 · Updated 7 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 · Updated last year