VinayChaudhari1996/pyspark-dataframe-made-easy

pyspark dataframe made easy

☆16

Alternatives and similar repositories for pyspark-dataframe-made-easy

Users who are interested in pyspark-dataframe-made-easy are comparing it to the libraries listed below.

rkcharlie / Spark_USE_CASE
☆20 · Aug 17, 2019 · Updated 6 years ago
syedhassaanahmed / databricks-notebooks
Collection of Databricks and Jupyter Notebooks
☆22 · Feb 9, 2026 · Updated 5 months ago
antimoz-om / Antimoz
A data engineering pipeline for digital marketers.
☆11 · Dec 21, 2018 · Updated 7 years ago
Alexmhack / Django-Rasa-Sockets
Rasa Chatbot using Django backend and Sockets for communication
☆12 · Dec 8, 2022 · Updated 3 years ago
itversity / retail_db_json
☆14 · Sep 14, 2021 · Updated 4 years ago
LeonardoEmili / stock-price-forecasting
Distributed stock price forecasting system to predict S&P 500 stock prices.
☆11 · Nov 12, 2021 · Updated 4 years ago
RishiSankineni / Machine-Learning-Pipeline-LR-Pyspark
Power Plant ML Pipeline Application - Apache Spark
☆12 · Dec 12, 2016 · Updated 9 years ago
adityajain10 / pyspark-mlib-based-stock-predictor
PredictorFinc is a scalable supervised machine learning model that predicts stock price change through Decision Tree Regressor using data …
☆12 · Sep 5, 2023 · Updated 2 years ago
prakashdontaraju / google-cloud-ecommerce
ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin…
☆11 · Mar 9, 2022 · Updated 4 years ago
minzhang-1 / PointHop-PointHop2_Spark
A fast, low-memory version of PointHop and PointHop++, built upon Apache Spark.
☆10 · Jul 14, 2020 · Updated 6 years ago
mateuspicanco / project-atlas-sao-paulo
A project for the development of rich geospatial data from the city of São Paulo for use in Machine Learning models.
☆12 · Jul 4, 2021 · Updated 5 years ago
AWS-Big-Data-Projects / Run-a-Spark-job-within-Amazon-EMR
Run a Spark job within Amazon EMR
☆12 · Sep 12, 2020 · Updated 5 years ago
camposvinicius / gcp-etl
This is a pipeline of an ETL application in GCP with open airport code data, which you can find here: https://datahub.io/core/airport-cod…
☆15 · Nov 15, 2021 · Updated 4 years ago
francescotescari / noiseprint2
noiseprint2 is a port of noiseprint to TensorFlow 2 and Keras
☆12 · Feb 20, 2021 · Updated 5 years ago
sbl-sdsc / mmtf-proteomics
Methods for mapping proteomics data on 3D protein structure.
☆15 · Jan 18, 2020 · Updated 6 years ago
zekeriyyaa / Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data
Mobile robot data were analyzed with Apache Spark to extract five different statistical results such as travel time, waiting time, average…
☆15 · Apr 5, 2022 · Updated 4 years ago
ibaiGorordo / Tensorflow-Mobile-Generic-Object-Localizer
Python TensorFlow 2 scripts for detecting objects of any class in an image without knowing their label.
☆16 · Sep 18, 2021 · Updated 4 years ago
dcyoagames / sissylifecyoa
Sissy Life CYOA
☆13 · Sep 4, 2015 · Updated 10 years ago
cvilla87 / PySpark-ETL-Telecom
Jupyter Notebook showing how to process Telecom datasets using PySpark (SparkSQL and DataFrames) and plotting the results using Matplotli…
☆17 · Dec 3, 2018 · Updated 7 years ago
chaithanya21 / Sentiment-Analysis-using-Pyspark-on-Multi-Social-Media-Data
In this mini-project I have chosen to do sentiment analysis of social media websites such as Twitter and Reddit to gain insights into the…
☆12 · Mar 5, 2020 · Updated 6 years ago
big-data-lab-team / accident-prediction-montreal
☆12 · Dec 8, 2022 · Updated 3 years ago
rvilla87 / ETL-PySpark
ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)
☆17 · Dec 18, 2018 · Updated 7 years ago
Ceci-Aguilera / habaneras_de_lino_api
Version 1 of Habaneras de Lino is an online ecommerce site. This repo contains the backend API of the website using Django and Django Rest Fram…
☆13 · Dec 16, 2022 · Updated 3 years ago
martandsingh / ApacheSpark
This repository will help you learn Databricks concepts with the help of examples. It will include all the important topics which…
☆105 · Sep 26, 2025 · Updated 9 months ago
absognety / atomic-scala
Atomic Scala Book Solutions - for Beginners and first time Functional Programmers
☆12 · Mar 10, 2020 · Updated 6 years ago
innat / Transfer-Learning-PySpark
Multi-Class Classification | Transfer Learning With PySpark
☆13 · Nov 12, 2019 · Updated 6 years ago
Kuntal-G / BigData-Analytics
Analytics projects using Big Data eco-systems (Hadoop, Spark, Storm)
☆17 · Dec 27, 2021 · Updated 4 years ago
Martialhimanshu / GaanaSuno
GaanaSuno is an application that lets users upload, store and play all of their music from the cloud. Additionally, a user can comment and…
☆12 · Aug 18, 2018 · Updated 7 years ago
MHassaanButt / Flight-Delays-Prediction
In this project, I used Decision Tree Learning Model as the main algorithm to build the model. Due to the large amount of flight data, we i…
☆12 · Dec 21, 2021 · Updated 4 years ago
ehsanmok / sparkling-titanic
Training models with Apache Spark, PySpark for Titanic Kaggle competition
☆14 · Sep 23, 2016 · Updated 9 years ago
san089 / Cloudera_Material
Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collab…
☆42 · Apr 21, 2020 · Updated 6 years ago
aurelienmorgan / abnormal_vibrations_watchdog
Time Series Anomaly detection. The monitored signal is made up of machinery vibration sensor measurements.
☆18 · Dec 7, 2020 · Updated 5 years ago
coin-or / GAMSlinks
Links between GAMS (General Algebraic Modeling System) and solvers
☆13 · Oct 21, 2025 · Updated 8 months ago
DivyaKarade / Deep-learning-classification-based-model-for-screening-compounds-with-hERG-inhibitory-activity
Developing a Deep learning classification-based model for screening pharmaceutical compounds with hERG inhibitory activity (cardiotoxicit…
☆15 · Oct 2, 2024 · Updated last year
heroku-examples / analytics-with-kafka-redshift-metabase
An example system that captures a large stream of product usage data, or events, and provides both real-time data visualization and SQL-b…
☆27 · Jan 11, 2023 · Updated 3 years ago
harshitsaini / Business-Analytics-Data-Mining
Code Snippets & DataSets for Business Analytics & Data Mining/ Machine Learning Algorithms
☆14 · Apr 23, 2018 · Updated 8 years ago
sephib / dagster-graph-project
Repo demonstrating a Dagster pipeline to generate Neo4j Graph
☆22 · May 6, 2021 · Updated 5 years ago
jlopezmalla / Flights
scala and spark examples project
☆14 · Feb 19, 2018 · Updated 8 years ago
dylanzenner / business_closures_de_pipeline
Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database
☆14 · Oct 26, 2021 · Updated 4 years ago