☆18 · Aug 20, 2017 · Updated 8 years ago
Alternatives and similar repositories for learning-apache-spark
Users that are interested in learning-apache-spark are comparing it to the libraries listed below.
- Developing a Lambda Architecture pipeline using Apache Kafka, Spark Structured Streaming, Redshift, S3, Python ☆22 · Mar 8, 2020 · Updated 6 years ago
- Entity resolution, also known as data matching or record linkage, is the task of finding records in a data set that refer to the same or similar real… ☆33 · Apr 8, 2025 · Updated last year
- PawarBI ☆31 · May 11, 2023 · Updated 2 years ago
- Source Code for 'Advanced Data Analytics Using Python' by Sayan Mukhopadhyay ☆67 · May 23, 2018 · Updated 7 years ago
- Atomic Scala Book Solutions - for Beginners and first time Functional Programmers ☆12 · Mar 10, 2020 · Updated 6 years ago
- On-demand port forwarding to k8s. ☆24 · Apr 10, 2026 · Updated last week
- This is a simple Python daemon to monitor your Impala nodes. ☆10 · Apr 13, 2021 · Updated 5 years ago
- Repository that showcases problems with Kafka rebalancing and explains how to fix them. Please visit our blog article to learn what Kafka… ☆12 · Aug 21, 2020 · Updated 5 years ago
- My Raspberry Pi installation at home. ☆11 · Mar 16, 2024 · Updated 2 years ago
- ☆10 · Dec 5, 2022 · Updated 3 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 · Jan 17, 2016 · Updated 10 years ago
- Spark implementation of Slowly Changing Dimension type 2 ☆11 · Jan 8, 2019 · Updated 7 years ago
- POC for the full big data stack (Kafka, Spark, Cassandra, HDFS, Docker, Spring Boot) ☆12 · Dec 16, 2022 · Updated 3 years ago
- Distributed Data Systems with Azure Databricks, published by Packt ☆12 · Jan 18, 2023 · Updated 3 years ago
- HackerRank Programming Challenges ☆10 · May 8, 2021 · Updated 4 years ago
- Implementation of java.time for Scala.js and Scala Native ☆16 · Mar 27, 2026 · Updated 3 weeks ago
- Interactive notebooks containing demonstration code of the splink library ☆41 · Mar 4, 2026 · Updated last month
- Azure Synapse Analytics Samples ☆14 · Feb 15, 2023 · Updated 3 years ago
- A clean, modern, and fully responsive HTML résumé (CV) template ☆13 · Mar 13, 2026 · Updated last month
- A guide to Apache Maven, HttpClient, Tomcat, Ant, and Tiles. ☆13 · Jul 23, 2018 · Updated 7 years ago
- ☆13 · Dec 5, 2022 · Updated 3 years ago
- Docs, code, and resources to prepare for the CRT020: Databricks Certified Associate Developer for Apache Spark 2.4 with Python 3 certific… ☆10 · Sep 25, 2019 · Updated 6 years ago
- Java OutOfMemory Example ☆11 · Jun 19, 2021 · Updated 4 years ago
- ☆13 · Jul 15, 2023 · Updated 2 years ago
- Two-day level 300 Azure Synapse Analytics workshop ☆11 · Mar 16, 2021 · Updated 5 years ago
- Auto-fixing errors due to version upgrades, good practices, etc. ☆11 · Sep 5, 2020 · Updated 5 years ago
- This is a list of YAML file examples for Docker, Kubernetes, Ansible. Also includes a Python script. ☆10 · Jan 12, 2021 · Updated 5 years ago
- powershell_profile.ps1 ☆14 · Feb 11, 2026 · Updated 2 months ago
- Apache Kafka Overview ☆12 · Jun 9, 2023 · Updated 2 years ago
- All my LeetCode solutions in Java ☆11 · Aug 9, 2021 · Updated 4 years ago
- Reusable Python classes that extend open source PySpark capabilities. Examples of implementation are available under notebooks of repo htt… ☆13 · Nov 1, 2024 · Updated last year
- Distributed Bayesian Entity Resolution in Apache Spark ☆59 · Jun 10, 2021 · Updated 4 years ago
- Data pipeline project using Data Factory, Databricks and Cosmosdb Graph, deployed using Azure DevOps, secured using firewalls and Azure A… ☆11 · Dec 14, 2022 · Updated 3 years ago
- Delta Lake Examples ☆11 · Apr 24, 2020 · Updated 5 years ago
- Example project on how to do state recovery in Apache Flink using Apache Avro ☆12 · May 7, 2018 · Updated 7 years ago
- Sample demo to deploy an Apache Kafka cluster and monitor it using Strimzi, Grafana and Prometheus operators. ☆10 · May 18, 2021 · Updated 4 years ago
- Some Avro operations in Scala ☆10 · Mar 19, 2026 · Updated last month
- Debug GitHub Actions workflows locally, with breakpoints. No more blind YAML commits. ☆46 · Apr 4, 2026 · Updated 2 weeks ago
- A Pure Black IntelliJ Theme ☆17 · Mar 27, 2026 · Updated 3 weeks ago