A toolset to streamline running Spark Python on EMR
☆20 · Nov 16, 2016 · Updated 9 years ago
Alternatives and similar repositories for pyspark-emr
Users that are interested in pyspark-emr are comparing it to the libraries listed below.
- Quickstart PySpark with Anaconda on AWS/EMR ☆52 · Jan 9, 2017 · Updated 9 years ago
- Introductory interactive Jupyter tutorial providing details about ORMs in order to assist in the teaching of their use to computing scien… ☆14 · Oct 21, 2025 · Updated 5 months ago
- ☆14 · Aug 10, 2021 · Updated 4 years ago
- A Python API client for Looker ☆14 · Aug 2, 2018 · Updated 7 years ago
- ☆12 · Jun 3, 2016 · Updated 9 years ago
- ☆11 · Oct 11, 2022 · Updated 3 years ago
- Data validation library for PySpark 3.0.0 ☆33 · Nov 11, 2022 · Updated 3 years ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 · Oct 18, 2023 · Updated 2 years ago
- A Gentle introduction to Machine Learning with Apache Spark ☆11 · Mar 2, 2026 · Updated last month
- A Hubot script for creating quick reminders through natural language. ☆11 · Jun 29, 2017 · Updated 8 years ago
- An opinionated Kafka producer/consumer built on top of confluent-kafka-python/librdkafka ☆28 · Apr 30, 2025 · Updated 11 months ago
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 · Dec 24, 2016 · Updated 9 years ago
- Gobblin is a distributed big data integration framework (ingestion, replication, compliance, retention) for batch and streaming systems.… ☆11 · Jul 29, 2017 · Updated 8 years ago
- ☆12 · Oct 16, 2023 · Updated 2 years ago
- Terraform Module to create an Apache Zookeeper cluster on AWS ☆13 · Jan 3, 2022 · Updated 4 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark ☆14 · Apr 14, 2023 · Updated 3 years ago
- Ambari service for RedHat FreeIPA ☆11 · Sep 30, 2016 · Updated 9 years ago
- ☆10 · Jul 5, 2016 · Updated 9 years ago
- List of playbooks to manage Ambari ☆13 · Oct 3, 2018 · Updated 7 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- A PySpark library to validate data quality ☆19 · Nov 11, 2022 · Updated 3 years ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w… ☆16 · Oct 3, 2025 · Updated 6 months ago
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 · Jan 11, 2017 · Updated 9 years ago
- Packer Template to build an AWS Apache Cassandra AMI ☆10 · Jan 3, 2022 · Updated 4 years ago
- An example application to integrate Amazon API Gateway and Amazon Lambda. ☆12 · Aug 5, 2015 · Updated 10 years ago
- Microservices with spring-boot and Machine Learning with Apache Spark ML ☆13 · Sep 15, 2018 · Updated 7 years ago
- ☆12 · Apr 27, 2018 · Updated 7 years ago
- Subset Met Office MOGREPS-UK and UKV on AWS EC2 ☆12 · Oct 22, 2021 · Updated 4 years ago
- Jiraya - Simple Jira CLI ☆17 · Dec 13, 2019 · Updated 6 years ago
- Custom Alerts for Ambari server ☆12 · Jul 27, 2015 · Updated 10 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆48 · Jan 7, 2025 · Updated last year
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Oct 21, 2020 · Updated 5 years ago
- Python implementation of the Parquet columnar file format. ☆21 · Mar 10, 2026 · Updated last month
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi ☆15 · Jan 2, 2023 · Updated 3 years ago
- ☆10 · Feb 5, 2017 · Updated 9 years ago
- This is a fork of the Apache Flink Kinesis connector adding Enhanced Fanout support for Flink 1.8/1.11 on KDA. ☆24 · Mar 1, 2026 · Updated last month
- Tool to migrate Prometheus 1.x data directories to the 2.0 format. ☆14 · Jan 18, 2018 · Updated 8 years ago
- A set of modules aimed to manipulate policies on Apache Ranger. ☆13 · Jan 21, 2019 · Updated 7 years ago
- ☆20 · Feb 14, 2018 · Updated 8 years ago