pysysops / docker-luigid
Luigi Central Scheduler Server on Docker
☆11Updated 9 years ago
Alternatives and similar repositories for docker-luigid
Users who are interested in docker-luigid are comparing it to the libraries listed below.
- An example to illustrate using Luigi to manage a data science workflow in Greenplum Database☆12Updated 7 years ago
- Gallery of Apache Zeppelin notebooks☆216Updated 6 years ago
- A project to help develop Luigi pipelines using Docker ✳️☆80Updated 4 years ago
- Code reference from my Qbox blog posts.☆87Updated 10 years ago
- [NOT MAINTAINED] Bubbles – Python ETL framework☆460Updated 8 years ago
- Python SDK for accessing Qubole Data Service☆51Updated 11 months ago
- Data Science box: Spark, Jupyter, R+RStudio, Zeppelin, Python 2 & 3, Java, Scala.☆39Updated 7 years ago
- Apache Toree quickstart tutorial☆29Updated 9 years ago
- Converting a zeppelin notebook in single programming language to respective script☆18Updated 5 years ago
- PyAthenaJDBC is an Amazon Athena JDBC driver wrapper for the Python DB API 2.0 (PEP 249).☆94Updated 2 years ago
- Apache Zeppelin on Kubernetes.☆28Updated 6 years ago
- A Python MapReduce and HDFS API for Hadoop☆241Updated 3 weeks ago
- Docker build for Zeppelin, a web-based Spark notebook☆221Updated 6 years ago
- ☆146Updated 9 years ago
- A simple example for Python Kafka Avro☆86Updated 7 years ago
- Example unit tests for Apache Spark Python scripts using the py.test framework☆84Updated 9 years ago
- Quickstart PySpark with Anaconda on AWS/EMR☆52Updated 9 years ago
- A short guide for transitioning from Python to Scala☆65Updated 10 years ago
- Vagrant project to spin up a cluster of 4 32-bit CentOS6.5 Linux virtual machines with Hadoop v2.6.0 and Spark v1.1.1☆124Updated 10 years ago
- Python wrapper for the hadoop WebHDFS Rest API☆32Updated 10 years ago
- ☆70Updated 3 years ago
- Flask app to push/pull on Kafka over HTTP☆41Updated 10 years ago
- Dockerized setup for testing code on realistic hadoop clusters☆26Updated 5 years ago
- pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible w/ Spark 2.0, 2.1, 2.2, 2.3 and 2.4☆69Updated last year
- Vagrant project to spin up a single virtual machine running current versions of Hadoop, Hive and Spark☆71Updated 7 years ago
- ☆525Updated last month
- Scripts used to setup a Spark cluster on EC2☆388Updated 8 years ago
- Vagrant projects for various use-cases with Spark, Zeppelin, IPython / Jupyter, SparkR☆34Updated 9 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse…☆90Updated 10 years ago
- Docker image for python datascience container with NumPy, SciPy, Scikit-learn, Matplotlib, nltk, pandas packages installed.☆48Updated 6 years ago