CLI tool to launch Spark jobs on AWS EMR
☆67 · Oct 18, 2023 · Updated 2 years ago
Alternatives and similar repositories for sparksteps
Users that are interested in sparksteps are comparing it to the libraries listed below
- Docker Compose files for various Kafka stacks ☆32 · Feb 24, 2018 · Updated 8 years ago
- Common post-estimation tasks for scikit-learn ☆17 · Nov 30, 2016 · Updated 9 years ago
- Unit and integration testing with PySpark can be tough to figure out; let's make that easier. ☆23 · Nov 3, 2015 · Updated 10 years ago
- Build the numpy/scipy/scikit-learn packages and strip them down to run in Lambda ☆207 · Jul 12, 2018 · Updated 7 years ago
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 · Feb 13, 2020 · Updated 6 years ago
- A Terraform module to create an Amazon Web Services (AWS) Elastic MapReduce (EMR) cluster. ☆39 · Oct 21, 2019 · Updated 6 years ago
- ☆25 · Jun 25, 2018 · Updated 7 years ago
- Chef cookbook for Druid (http://druid.io/) ☆10 · Apr 25, 2016 · Updated 9 years ago
- [UNMAINTAINED] A starter pack for creating a lightweight responsive web app for Fast.AI PyTorch models. ☆16 · Dec 5, 2018 · Updated 7 years ago
- A pandas.DataFrame-based ORM. ☆85 · Mar 15, 2022 · Updated 4 years ago
- Helm plugin to destroy all releases ☆19 · Feb 27, 2018 · Updated 8 years ago
- (Weighted) Finite State Transducers for Scala NLP ☆21 · Nov 15, 2014 · Updated 11 years ago
- A sentiment classifier tool and library trained on Twitter data ☆22 · Nov 9, 2023 · Updated 2 years ago
- Interactive computing for complex data processing, modeling and analysis in Python 3 ☆79 · May 3, 2024 · Updated last year
- Building blocks of TensorFlow architectures ☆11 · Oct 14, 2019 · Updated 6 years ago
- S3-backed notebook manager for IPython ☆29 · May 1, 2017 · Updated 8 years ago
- A collection of Airflow sample workflows for data processing on AWS ☆12 · Dec 1, 2017 · Updated 8 years ago
- Puppet module to provision Airbnb's Airflow ☆20 · Jun 8, 2022 · Updated 3 years ago
- Apache (Py)Spark type annotations (stub files). ☆118 · Aug 17, 2022 · Updated 3 years ago
- This repository holds the Amazon Elastic MapReduce sample bootstrap actions ☆613 · Jun 5, 2023 · Updated 2 years ago
- A software engineering framework to jump start your machine learning projects ☆37 · Jan 24, 2026 · Updated last month
- A simple Elasticsearch frontend for serving astrophysical simulation catalog data ☆10 · Mar 14, 2026 · Updated last week
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆25 · Aug 11, 2023 · Updated 2 years ago
- ☆16 · Apr 3, 2019 · Updated 6 years ago
- CLI AWS CloudWatch Logs downloader ☆26 · Jun 6, 2018 · Updated 7 years ago
- Serverless costs calculator for AWS Lambda ☆12 · Oct 21, 2020 · Updated 5 years ago
- ☆16 · May 31, 2017 · Updated 8 years ago
- ELK tutorial ☆11 · Mar 15, 2023 · Updated 3 years ago
- Code for PyData Talk on "Classifying Products Based on Images and Text using Keras" ☆30 · Apr 3, 2017 · Updated 8 years ago
- Apache Spark-based ETL engine ☆71 · Oct 18, 2016 · Updated 9 years ago
- Python code to seasonally adjust data using the Census X12-ARIMA program: http://www.census.gov/srd/www/x12a/ ☆11 · Mar 22, 2012 · Updated 14 years ago
- Legoo: A collection of automation modules to build analytics infrastructure ☆20 · Jul 24, 2020 · Updated 5 years ago
- Tool to visualize data quickly with no brain usage for plot creation ☆48 · Oct 29, 2025 · Updated 4 months ago
- Language support for Scala in Atom. ☆51 · Jul 21, 2021 · Updated 4 years ago
- How to deploy a Machine Learning model for sentiment analysis in the Cloud with AWS Lambda. ☆105 · Oct 22, 2020 · Updated 5 years ago
- Ansible role to deploy and configure Airflow ☆41 · Updated this week
- Machine Learning Versioning made Simple ☆38 · Jun 21, 2022 · Updated 3 years ago
- We are a group of volunteers who are trying to organize Python meetups and workshops in Amsterdam. ☆11 · Apr 4, 2020 · Updated 5 years ago
- A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support ☆261 · Nov 3, 2017 · Updated 8 years ago