luisbelloch/data_processing_course

Some class materials for a data processing course using PySpark

☆53

Alternatives and similar repositories for data_processing_course

Users who are interested in data_processing_course are comparing it to the repositories listed below.

arocki7 / terraform-aws-lamp
Create LAMP Stack using terraform with AWS
☆11 · Feb 15, 2023 · Updated 3 years ago
NathanNeff / hadoop-examples
Hadoop Examples
☆10 · Jul 1, 2022 · Updated 4 years ago
dkoepke / cassandra-python-driver
Add gevent support to DataStax Python Driver for Apache Cassandra
☆11 · Jun 10, 2020 · Updated 6 years ago
mbh038 / UCSD-Big-Data
☆11 · Dec 14, 2015 · Updated 10 years ago
okmich / hadoop-training-projects
Projects from my Hadoop training sessions
☆16 · Feb 22, 2018 · Updated 8 years ago
arocki7 / ansible-centos7-lamp
Ansible Playbook to create LAMP in CentOS 7 with Apache, MySQL, PHP.
☆10 · Dec 28, 2018 · Updated 7 years ago
Venkata09 / BigDataCertificationPrep
All Certification and preparation, examples & others
☆11 · Oct 18, 2018 · Updated 7 years ago
elephantscale / learning-scala
☆14 · Aug 24, 2021 · Updated 4 years ago
gautamborad / hdp-ansible
Automated (Ansible) installation of HDP via Ambari Blueprint
☆16 · Mar 10, 2017 · Updated 9 years ago
Kuntal-G / BigData-Analytics
Analytics projects using Big Data eco-systems (Hadoop, Spark, Storm)
☆17 · Dec 27, 2021 · Updated 4 years ago
aaronstone007 / Udacity-Data-Streaming
Projects from Udacity Data Streaming Nanodegree
☆15 · Aug 14, 2023 · Updated 2 years ago
apache-spark-on-k8s / ansible
Ansible playbooks for Apache Spark on kube
☆27 · Jul 20, 2017 · Updated 9 years ago
cartershanklin / hive-testbench
Testbench for experimenting with Apache Hive at any data scale.
☆64 · Jul 10, 2017 · Updated 9 years ago
kradecki / infa
Python API for Informatica PowerCenter (pmrep, pmcmd)
☆21 · Sep 17, 2017 · Updated 8 years ago
indiacloudtv / pyspark_on_google_colab
PySpark Tutorial for Beginners on Google Colab: Hands-On Guide
☆17 · Sep 13, 2020 · Updated 5 years ago
manokhina / leetcode-common-questions
Python and C++ implementation of the problems from Clean Code Handbook - LeetCode 50 Common Interview Questions
☆33 · Feb 6, 2023 · Updated 3 years ago
codingmarket07 / Invoice-Template-Design-20j22
How to create the Invoice Template Design In HTML and CSS
☆11 · Jan 20, 2022 · Updated 4 years ago
RomainClaret / lfs-7.8
Set of Shell scripts to automate Linux from Scratch, based on the book 7.8
☆31 · Jan 10, 2018 · Updated 8 years ago
feluelle / finance-data-builder
Finance 🏦 Data Builder 🛠️ @ postgres 🐘
☆22 · Feb 11, 2021 · Updated 5 years ago
rbmayer / Udacity-Data-Engineering-Nanodegree
Udacity Data Engineering Nanodegree Projects
☆11 · Sep 5, 2019 · Updated 6 years ago
zaratsian / Spark
Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References
☆69 · Jan 21, 2019 · Updated 7 years ago
Sathiyarajan / devops-pipeline
DevOps
☆16 · May 17, 2021 · Updated 5 years ago
jldbc / gutenberg
A content-based recommender system for books using the Project Gutenberg text corpus
☆29 · Feb 20, 2017 · Updated 9 years ago
nadimbahadoor / learn-spark
Examples To Help You Learn Apache Spark
☆76 · Oct 8, 2018 · Updated 7 years ago
vuthanhhai2302 / Applied-Pyspark
My applied big data analytic project with pyspark.
☆10 · Sep 21, 2022 · Updated 3 years ago
hvanderlaan / rhcsa-rhce-lab-environment
Lab environment based on vagrant to learn ex200/ex300 rhcsa/rhce
☆39 · Mar 16, 2017 · Updated 9 years ago
tilakthimmappa / pyraider
Using PyRaider, you can scan installed dependencies for known security vulnerabilities. It uses publicly known exploits, vulnerabilities datab…
☆18 · May 18, 2022 · Updated 4 years ago
DIYBigData / spark-data-analysis-projects
A collection of data analysis projects done using PySpark via Jupyter notebooks.
☆10 · Oct 8, 2022 · Updated 3 years ago
mmcloughlin / geohashbench
Benchmarks to compare golang geohash implementations
☆12 · Aug 6, 2018 · Updated 7 years ago
dgryski / go-stampede
Optimal cache stampede prevention
☆16 · May 11, 2017 · Updated 9 years ago
adornes / spark_python_ml_examples
Spark 2.0 Python Machine Learning examples
☆99 · Oct 7, 2019 · Updated 6 years ago
kaantas / spark-twitter-sentiment-analysis
Sentiment Analysis of a Twitter Topic with Spark Structured Streaming
☆54 · Dec 12, 2018 · Updated 7 years ago
lresende / ansible-spark-cluster
Ansible roles to install a Spark Standalone cluster (HDFS/Spark/Jupyter Notebook) or Ambari based Spark cluster
☆62 · Jan 30, 2024 · Updated 2 years ago
ekampf / PySpark-Boilerplate
A boilerplate for writing PySpark Jobs
☆393 · Jan 21, 2024 · Updated 2 years ago
alvintoh / udemy-hands-on-hadoop
AlvinToh Learning Repository for The Ultimate Hands-On Hadoop - Tame your Big Data!
☆10 · May 23, 2018 · Updated 8 years ago
SunnyMarkLiu / MachineLearning-DeepLearning-Papers
My Machine Learning & Deep Learning Papers Notes.
☆11 · Jul 17, 2018 · Updated 8 years ago
prakashdontaraju / google-cloud-ecommerce
ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin…
☆11 · Mar 9, 2022 · Updated 4 years ago
sbl-sdsc / mmtf-pyspark
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
☆68 · Mar 27, 2023 · Updated 3 years ago
testdrivenio / spark-docker-swarm
running apache spark with docker swarm
☆34 · Feb 25, 2021 · Updated 5 years ago