An example PySpark project with pytest
☆18 · Oct 13, 2017 · Updated 8 years ago
Alternatives and similar repositories for gill
Users that are interested in gill are comparing it to the libraries listed below.
- Slides and notebook used for my talk on vaex at the Pandas Summit 2019 in London ☆11 · Jun 13, 2019 · Updated 6 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 · Jul 31, 2023 · Updated 2 years ago
- Scrape and parse data from detail page of OLX ☆37 · Dec 11, 2016 · Updated 9 years ago
- Set of tools to help with delta lake house architecture patterns ☆13 · Feb 9, 2021 · Updated 5 years ago
- A Storm based web crawler with Cassandra backend ☆28 · Nov 7, 2013 · Updated 12 years ago
- Examples for Apache Oozie book ☆18 · May 30, 2016 · Updated 9 years ago
- Python implementation of Association Rule Mining ☆11 · Apr 26, 2024 · Updated last year
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · May 12, 2023 · Updated 2 years ago
- A boilerplate for writing PySpark Jobs ☆395 · Jan 21, 2024 · Updated 2 years ago
- Play with the Spark, Spark streaming and DataFrame API. ☆12 · Jun 26, 2015 · Updated 10 years ago
- Ontology dataset for open_numbers namespace ☆10 · Feb 27, 2026 · Updated last month
- ☆23 · Sep 13, 2016 · Updated 9 years ago
- ☆10 · Nov 12, 2022 · Updated 3 years ago
- Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ... ☆19 · Dec 7, 2017 · Updated 8 years ago
- Ambari Service definition for deploying R & RHadoop libraries ☆18 · Aug 3, 2015 · Updated 10 years ago
- ☆21 · Oct 1, 2015 · Updated 10 years ago
- An example of SparkConnect extension. ☆15 · Mar 5, 2024 · Updated 2 years ago
- ☆48 · Feb 4, 2018 · Updated 8 years ago
- Java program for producing and consuming messages from Kafka ☆14 · Aug 5, 2014 · Updated 11 years ago
- Speak Slack notifications and process Slack slash commands ☆15 · Dec 20, 2018 · Updated 7 years ago
- ☆26 · Feb 22, 2026 · Updated last month
- ☆23 · Jun 18, 2017 · Updated 8 years ago
- List of Issuer Identification Numbers ☆15 · Aug 7, 2013 · Updated 12 years ago
- ☆16 · Apr 9, 2019 · Updated 6 years ago
- Write property based tests easily on spark dataframes ☆20 · Jan 19, 2024 · Updated 2 years ago
- Docker Images with Databricks Connect Ready to go ☆24 · Dec 26, 2023 · Updated 2 years ago
- This project is for examples of how to use Zeppelin. https://github.com/apache/incubator-zeppelin ☆25 · Jan 27, 2016 · Updated 10 years ago
- Spark style guide ☆272 · Sep 30, 2024 · Updated last year
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆29 · Jul 7, 2022 · Updated 3 years ago
- Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server. ☆27 · Jun 19, 2016 · Updated 9 years ago
- This collection of general purpose python magic was too good to keep for ourselves! ☆20 · Jan 9, 2026 · Updated 2 months ago
- Anomaly Detection Pipeline on Azure Databricks ☆28 · Jul 29, 2019 · Updated 6 years ago
- Parallel Iterative Algorithm (SGD) on Hadoop's YARN framework ☆42 · Jan 30, 2013 · Updated 13 years ago
- Python notebooks analyzing campaign finance and lobbying activity data from California Secretary of State’s CAL-ACCESS database ☆22 · Mar 3, 2018 · Updated 8 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 · Sep 4, 2023 · Updated 2 years ago
- Embed any webapp/website as Ambari view! ☆25 · Feb 26, 2016 · Updated 10 years ago
- Example playbooks for Ansible ☆56 · Nov 6, 2015 · Updated 10 years ago
- Repository with my talks (from Nov 2019 onwards). ☆15 · Jun 10, 2021 · Updated 4 years ago
- Delta lake and filesystem helper methods ☆50 · Feb 29, 2024 · Updated 2 years ago