Repository used for Spark Trainings
☆54Apr 21, 2023Updated 3 years ago
Alternatives and similar repositories for spark-training
Users who are interested in spark-training are comparing it to the libraries listed below.
- SBT plugins for publishing to Maven Central, shading and managing dependencies, reporting to Coveralls from TravisCI, and more☆14Nov 13, 2020Updated 5 years ago
- Collection of Databricks and Jupyter Notebooks☆22Feb 9, 2026Updated 4 months ago
- Spark with Scala example projects☆34Apr 17, 2019Updated 7 years ago
- Apache Spark (PySpark) Practice on Real Data☆270Jan 31, 2020Updated 6 years ago
- Machine Learning and Data Analysis Case Studies using Spark.☆72Mar 22, 2021Updated 5 years ago
- Solved data engineering exercises using PySpark☆17Aug 2, 2021Updated 4 years ago
- Memory consumption estimator for Scala/Java☆27Nov 24, 2014Updated 11 years ago
- Code snippets and tutorials for working with social science data in PySpark☆416Aug 11, 2017Updated 8 years ago
- Databricks - Apache Spark™ - 2X Certified Developer☆265Jul 24, 2020Updated 5 years ago
- Spark Streaming with Kafka and Wikipedia Edits☆12Feb 3, 2017Updated 9 years ago
- APIs written in Flask using a Heroku Postgres database to register a user and log into an account. Deployed on Heroku☆10Dec 8, 2022Updated 3 years ago
- Scala utility to send mail☆14May 4, 2020Updated 6 years ago
- Project for James' Apache Spark with Scala course☆124Jul 6, 2020Updated 5 years ago
- Extract LinkedIn comments from any post and export them to an Excel file☆23Oct 17, 2018Updated 7 years ago
- Every Day Calendar inspired by Simone Giertz's project: https://www.kickstarter.com/projects/simonegiertz/the-every-day-calendar☆16May 28, 2026Updated last month
- Code for my presentation: Using PySpark to Process Boat Loads of Data☆20Oct 20, 2017Updated 8 years ago
- Examples of diagrams using Mermaid: https://mermaid.js.org/intro/☆12Mar 25, 2023Updated 3 years ago
- Code, Examples, Templates and Scripts for DataWorksSummit 2017 Sydney Talk☆17Sep 19, 2017Updated 8 years ago
- HMM Tutorial☆12Apr 15, 2018Updated 8 years ago
- Updated repository☆155Nov 25, 2021Updated 4 years ago
- Create a responsive monthly calendar with events using vanilla JavaScript.☆17May 20, 2021Updated 5 years ago
- Spark app to merge different schemas☆23Dec 21, 2020Updated 5 years ago
- ☆11Apr 15, 2019Updated 7 years ago
- Everything about Apache Hive that is awesome☆13Dec 16, 2020Updated 5 years ago
- My dotfiles.☆12Oct 10, 2025Updated 8 months ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆56May 6, 2023Updated 3 years ago
- Football scouts from Cartola FC at a data lake with data warehouse and dashboard☆19Mar 17, 2022Updated 4 years ago
- A Scalable Data Cleaning Library for PySpark.☆29Apr 4, 2019Updated 7 years ago
- Build a Docker container to build, train and deploy fast.ai based Deep Learning models with Amazon SageMaker☆13Dec 15, 2018Updated 7 years ago
- ☆12Sep 25, 2024Updated last year
- Compare different technologies. No BS and all sources linked.☆14May 4, 2024Updated 2 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- Hands-On-Predictive-Analytics-with-Python☆15Jan 15, 2021Updated 5 years ago
- My GitHub blog: things you might be interested in, and probably not...☆26May 18, 2019Updated 7 years ago
- Scalable CDC Pattern Implemented using PySpark☆18Oct 8, 2025Updated 8 months ago
- ☆13Aug 5, 2024Updated last year
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan.☆14Feb 4, 2025Updated last year
- Sentiment Analysis of a Twitter Topic with Spark Structured Streaming☆55Dec 12, 2018Updated 7 years ago
- Rasa Chatbot using Django backend and Sockets for communication☆12Dec 8, 2022Updated 3 years ago