Tagar / stuff
various scripts
☆20Updated 3 years ago
Alternatives and similar repositories for stuff
Users that are interested in stuff are comparing it to the libraries listed below
- HADOOP-CLI is an interactive command line shell that makes interacting with the Hadoop Distributed Filesystem (HDFS) simpler and more intu…☆36Updated this week
- Conversion utility from Zeppelin notes to Jupyter notebooks.☆43Updated 6 years ago
- Pure Python wrapper for the Hadoop WebHDFS Rest API☆52Updated 5 years ago
- ☆16Updated last year
- Jupyter Notebook extension for Apache Spark integration☆191Updated 5 years ago
- Dockerized setup for testing code on realistic hadoop clusters☆26Updated 5 years ago
- Monitor Apache Spark from Jupyter Notebook☆172Updated 3 years ago
- Example for experimenting with how JupyterHub can be configured to work with Kerberos☆33Updated 8 years ago
- hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.☆28Updated 8 years ago
- Docker images used internally by various Teradata projects for automation, testing, etc☆39Updated 8 years ago
- Utilities to work with Scala/Java code with py4j☆40Updated 2 years ago
- Data validation library for PySpark 3.0.0☆33Updated 3 years ago
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall☆98Updated 5 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos☆19Updated 9 years ago
- User Defined Extensions (UDX) to the Vertica Analytic Database☆118Updated 3 years ago
- Convert a CSV file to ORCFile☆26Updated 6 years ago
- Databricks Migration Tools☆43Updated 4 years ago
- Tools to deploy Hadoop on EMC Isilon☆17Updated 9 years ago
- 🥪💾 A sample of data from the `jaffle-shop-generator` that powers the Jaffle Shop spanning one year.☆14Updated last year
- Collection of tools for bootstrapping Apache Ambari & deploying clusters☆83Updated 6 years ago
- Paper: A Zero-rename committer for object stores☆20Updated 2 months ago
- Gallery of Apache Zeppelin notebooks☆216Updated 6 years ago
- A simplified, autogenerated API client interface using the databricks-cli package☆59Updated 2 years ago
- JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook☆92Updated 3 years ago
- Create HTML profiling reports from Apache Spark DataFrames☆197Updated 6 years ago
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters.☆70Updated 4 years ago
- ☆54Updated 8 years ago
- An example PySpark project with pytest☆18Updated 8 years ago
- A Spark cluster setup running on Docker containers☆61Updated 6 years ago
- Cloudera Director sample code☆61Updated 6 years ago