Repository used for Spark Trainings
☆54Apr 21, 2023Updated 2 years ago
Alternatives and similar repositories for spark-training
Users that are interested in spark-training are comparing it to the libraries listed below.
- ☆14Jan 30, 2023Updated 3 years ago
- SBT plugins for publishing to Maven Central, shading and managing dependencies, reporting to Coveralls from TravisCI, and more☆14Nov 13, 2020Updated 5 years ago
- Docker image for Jupyter notebooks with PySpark☆27Aug 3, 2018Updated 7 years ago
- Import Salesforce data into Hadoop HDFS in Avro format☆23Jan 8, 2020Updated 6 years ago
- Spark with Scala example projects☆34Apr 17, 2019Updated 6 years ago
- ☆14Feb 2, 2019Updated 7 years ago
- Apache Spark (PySpark) Practice on Real Data☆271Jan 31, 2020Updated 6 years ago
- Machine Learning and Data Analysis Case Studies using Spark.☆72Mar 22, 2021Updated 5 years ago
- Solved data engineering exercises using PySpark☆15Aug 2, 2021Updated 4 years ago
- Scala Books☆12Mar 31, 2014Updated 11 years ago
- Code snippets and tutorials for working with social science data in PySpark☆418Aug 11, 2017Updated 8 years ago
- Databricks - Apache Spark™ - 2X Certified Developer☆265Jul 24, 2020Updated 5 years ago
- Spark course taught at Telecom☆13Oct 24, 2019Updated 6 years ago
- APIs written in Flask using a Heroku Postgres database to register a user and log in to an account. Deployed on Heroku☆10Dec 8, 2022Updated 3 years ago
- Project for James' Apache Spark with Scala course☆125Jul 6, 2020Updated 5 years ago
- Extract LinkedIn comments from any post and export them to an Excel file☆23Oct 17, 2018Updated 7 years ago
- A demonstration of Jupyter Book functionality using QuantEcon Python programming source material.☆14Oct 30, 2020Updated 5 years ago
- ☆17Oct 20, 2020Updated 5 years ago
- Every Day Calendar inspired by Simone Giertz's project: https://www.kickstarter.com/projects/simonegiertz/the-every-day-calendar☆16Mar 18, 2026Updated last week
- Minimal example to set up a Jenkins-CI pipeline for data science projects on OpenShift in a couple of minutes.☆27Jan 7, 2025Updated last year
- Code, Examples, Templates and Scripts for DataWorksSummit 2017 Sydney Talk☆17Sep 19, 2017Updated 8 years ago
- Examples of diagrams using Mermaid: https://mermaid.js.org/intro/☆12Mar 25, 2023Updated 3 years ago
- envflag - flags for the environment☆15Jan 14, 2017Updated 9 years ago
- HMM Tutorial☆12Apr 15, 2018Updated 7 years ago
- Vagrant project to spin up a single node VM running current versions of Hadoop, Hive and Spark☆66Feb 15, 2022Updated 4 years ago
- Updated repository☆157Nov 25, 2021Updated 4 years ago
- Miscellaneous Jupyter notebooks and slides for public talks☆11Jan 7, 2019Updated 7 years ago
- Create a responsive monthly calendar with events using vanilla Javascript.☆17May 20, 2021Updated 4 years ago
- Spark app to merge different schemas☆23Dec 21, 2020Updated 5 years ago
- My dotfiles.☆12Oct 10, 2025Updated 5 months ago
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- Football scouts from Cartola FC at a data lake with data warehouse and dashboard☆18Mar 17, 2022Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆56May 6, 2023Updated 2 years ago
- A Scalable Data Cleaning Library for PySpark.☆29Apr 4, 2019Updated 6 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- Hands-On-Predictive-Analytics-with-Python☆15Jan 15, 2021Updated 5 years ago
- Huemul BigDataGovernance is a framework that works on top of Spark, Hive, and HDFS. It enables the implementation of a corporate strategy …☆11Apr 21, 2023Updated 2 years ago
- DeckParser provides programs to open and read the data of the NEWAVE, DECOMP, and DESSEM programs☆16Jul 22, 2024Updated last year
- Scalable CDC Pattern Implemented using PySpark☆18Oct 8, 2025Updated 5 months ago