gregbaker / spark-celery
Helper to allow Python Celery tasks to do work in a Spark job.
☆27 Updated 2 years ago
Alternatives and similar repositories for spark-celery:
Users who are interested in spark-celery are comparing it to the libraries listed below.
- A plugin for Apache Airflow that allows you to run Spark Submit commands as an Operator ☆73 Updated 5 years ago
- Code reference from my Qbox blog posts. ☆87 Updated 9 years ago
- PySpark for Elasticsearch ☆55 Updated 8 years ago
- A wrapper for libhdfs3 to interact with HDFS from Python ☆136 Updated 4 years ago
- Pure Python wrapper for the Hadoop WebHDFS REST API ☆52 Updated 4 years ago
- PredictionIO Python SDK ☆196 Updated 6 years ago
- REST-like API exposing Airflow data and operations ☆61 Updated 6 years ago
- Flask app to push/pull on Kafka over HTTP ☆41 Updated 10 years ago
- Python library for interacting with SolrCloud ☆36 Updated 4 years ago
- Phoenix database adapter for Python (migrated to the Apache Phoenix repo) ☆26 Updated 3 years ago
- Example of an Airflow plugin ☆49 Updated 8 years ago
- Cubes OLAP Examples ☆74 Updated 6 years ago
- Python language plugin for Elasticsearch ☆103 Updated 6 years ago
- Apache (Py)Spark type annotations (stub files). ☆117 Updated 2 years ago
- Docker Compose files for various Kafka stacks ☆32 Updated 7 years ago
- A plugin for Apache Airflow that allows you to manage the users that can log in ☆14 Updated 5 years ago
- AMQP data source for dstream (Spark Streaming) ☆26 Updated 3 years ago
- A tool and library for easily deploying applications on Apache YARN ☆143 Updated last year
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 Updated 4 years ago
- Google Spreadsheets datasource for SparkSQL and DataFrames ☆57 Updated last year
- Docker image for Apache Spark ☆76 Updated 5 years ago
- A Python MapReduce and HDFS API for Hadoop ☆238 Updated 2 months ago
- Python client for Spark Jobserver REST API ☆39 Updated 5 years ago
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆83 Updated 5 years ago
- Utilities to work with Scala/Java code with py4j ☆40 Updated last year
- Multiple node cluster on Docker for self development. ☆93 Updated 6 years ago
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆110 Updated 2 years ago
- Python DB-API client for Presto ☆238 Updated last year
- API and command line interface for HDFS ☆272 Updated 6 months ago
- ☆54 Updated 6 years ago