A toolset to streamline running Spark Python on EMR
☆20 · Nov 16, 2016 · Updated 9 years ago
Alternatives and similar repositories for pyspark-emr
Users that are interested in pyspark-emr are comparing it to the libraries listed below.
- ☆14 · Aug 10, 2021 · Updated 4 years ago
- ☆12 · Jun 3, 2016 · Updated 9 years ago
- ☆16 · Jun 27, 2020 · Updated 5 years ago
- Test suite to document the behavior of Spark ☆21 · Apr 15, 2021 · Updated 5 years ago
- Material for the Jupytext+Papermill blog post ☆31 · Jun 30, 2020 · Updated 5 years ago
- CLI tool to launch Spark jobs on AWS EMR ☆67 · Oct 18, 2023 · Updated 2 years ago
- A Hubot script for creating quick reminders through natural language. ☆11 · Jun 29, 2017 · Updated 8 years ago
- An opinionated Kafka producer/consumer built on top of confluent-kafka-python/librdkafka ☆28 · Apr 23, 2026 · Updated last month
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 · Dec 24, 2016 · Updated 9 years ago
- A Spark datasource for the HadoopOffice library ☆36 · Sep 29, 2025 · Updated 8 months ago
- 📝 A blog post about report generation and automation in Python ☆40 · Jul 26, 2019 · Updated 6 years ago
- Terraform Module to create an Apache Zookeeper cluster on AWS ☆13 · Jan 3, 2022 · Updated 4 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark ☆14 · Apr 14, 2023 · Updated 3 years ago
- Ambari service for RedHat FreeIPA ☆11 · Sep 30, 2016 · Updated 9 years ago
- Python script to repair the primary range of a node in N discrete steps ☆12 · Aug 3, 2018 · Updated 7 years ago
- List of playbooks to manage Ambari ☆13 · Oct 3, 2018 · Updated 7 years ago
- Basic Spark utilities ☆13 · Feb 20, 2025 · Updated last year
- Grafana Prometheus exporter ☆10 · Oct 17, 2017 · Updated 8 years ago
- A PySpark lib to validate data quality ☆19 · Nov 11, 2022 · Updated 3 years ago
- Docker Compose files for various Kafka stacks ☆32 · Feb 24, 2018 · Updated 8 years ago
- Avro Schema Shredder is a REST API that enables storage of Avro Schemas in Apache Atlas. This API enables an organization to use Apache A… ☆13 · Jan 11, 2017 · Updated 9 years ago
- REST API boilerplate using Spring Boot and Redis as database ☆13 · Dec 26, 2018 · Updated 7 years ago
- Sample Docker Compose files for running Apache Ambari ☆11 · Oct 29, 2018 · Updated 7 years ago
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an… ☆12 · Dec 29, 2021 · Updated 4 years ago
- Microservices with spring-boot and Machine Learning with Apache Spark ML ☆13 · Sep 15, 2018 · Updated 7 years ago
- ☆12 · Apr 27, 2018 · Updated 8 years ago
- Cloudera Manager parcel and CSD to manage Cassandra NoSQL database ☆14 · Nov 16, 2016 · Updated 9 years ago
- Subset Met Office MOGREPS-UK and UKV on AWS EC2 ☆12 · Oct 22, 2021 · Updated 4 years ago
- Example to create lineage in Atlas with sqoop and spark ☆14 · Apr 5, 2017 · Updated 9 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Jul 11, 2018 · Updated 7 years ago
- Jiraya - Simple Jira CLI ☆17 · Dec 13, 2019 · Updated 6 years ago
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆48 · Jan 7, 2025 · Updated last year
- Thoughts on things I find interesting. ☆17 · Dec 19, 2024 · Updated last year
- Python implementation of the parquet columnar file format. ☆21 · May 20, 2026 · Updated last week
- A bridge to Apache Atlas for provenance metadata created in the course of using Apache NiFi ☆15 · Jan 2, 2023 · Updated 3 years ago
- ☆10 · Feb 5, 2017 · Updated 9 years ago
- This is a fork of the Apache Flink Kinesis connector adding Enhanced Fanout support for Flink 1.8/1.11 on KDA. ☆24 · Mar 1, 2026 · Updated 2 months ago
- Tool to migrate Prometheus 1.x data directories to the 2.0 format. ☆14 · Jan 18, 2018 · Updated 8 years ago
- ☆20 · Feb 14, 2018 · Updated 8 years ago