This project introduces PySpark, the Python API for Apache Spark, an open-source framework for distributed data processing. We explore its architecture, components, and applications for real-time data analysis.
☆45 · Sep 26, 2024 · Updated last year
Alternatives and similar repositories for Real-Time-PySpark
Users who are interested in Real-Time-PySpark are comparing it to the libraries listed below.
- Apache Airflow advanced functionalities examples ☆21 · Mar 22, 2024 · Updated 2 years ago
- Build and run Spark Structured Streaming pipelines in Hadoop - project using PySpark. ☆13 · Jun 6, 2019 · Updated 6 years ago
- Simple project using PyFlink, Kafka, and PostgreSQL, containerized using Docker ☆11 · Aug 26, 2024 · Updated last year
- ☆13 · Nov 4, 2020 · Updated 5 years ago
- Capstone project for Dataengineer.io bootcamp Public Repo ☆12 · Feb 20, 2024 · Updated 2 years ago
- Code related to data wrangling ☆12 · Apr 12, 2020 · Updated 6 years ago
- Implementing best practices for PySpark ETL jobs and applications. ☆2,094 · Jan 1, 2023 · Updated 3 years ago
- Building a PD, LGD, and EAD model for financial modeling. ☆15 · Dec 19, 2023 · Updated 2 years ago
- This project demonstrates how to build and automate an ETL pipeline written in Python and schedule it using open source Apache Airflow or… ☆20 · Aug 21, 2025 · Updated 8 months ago
- The Free AWS Certified Cloud Practitioner Study Course ☆14 · Oct 15, 2019 · Updated 6 years ago
- ☆17 · May 26, 2023 · Updated 2 years ago
- My solutions for the Udacity Data Engineering Nanodegree ☆34 · Oct 14, 2019 · Updated 6 years ago
- MCP server that provides hourly weather forecasts using the AccuWeather API ☆33 · Jan 1, 2025 · Updated last year
- This is the first project where we worked on Apache Spark. In this project what we have done is that we downloaded the datasets from KAGG… ☆23 · Oct 14, 2021 · Updated 4 years ago
- Starter application demonstrating how to connect a NestJS API to a PlanetScale MySQL database ☆11 · Apr 12, 2023 · Updated 3 years ago
- TechnoSnag – Documentation, source code, configuration files, and scripts from my YouTube tutorials. Everything you need to follow along … ☆73 · Updated this week
- ☆146 · Jan 31, 2023 · Updated 3 years ago
- ☆21 · Jun 7, 2024 · Updated last year
- TTS utility ☆12 · Aug 2, 2020 · Updated 5 years ago
- An example integration between Flask and the Preact front end library. ☆13 · Jun 20, 2022 · Updated 3 years ago
- Generate OpenAPI 3.x.x using Pydantic ☆11 · Feb 9, 2023 · Updated 3 years ago
- ☆12 · Apr 28, 2020 · Updated 6 years ago
- Deep Learning for NLP ☆12 · Dec 7, 2022 · Updated 3 years ago
- A repo of time-series analysis techniques. Holt-Winters methods, ACF/PACF, MA, AR, ARMA, ARIMA, SARIMA, SARIMAX, VAR, VARMA, RNN Keras, Fa… ☆19 · May 8, 2020 · Updated 5 years ago
- ☆37 · Apr 25, 2025 · Updated last year
- Advanced JavaScript notes covering many topics. Still being worked on and incomplete. ☆23 · Apr 25, 2023 · Updated 3 years ago
- ☆30 · Mar 19, 2024 · Updated 2 years ago
- AWS ETL Pipeline ☆29 · May 16, 2024 · Updated last year
- (Python, PySpark) ☆11 · Nov 15, 2020 · Updated 5 years ago
- A walkthrough and tutorial covering all common techniques used for face detection ☆19 · Jul 12, 2024 · Updated last year
- ☆11 · Sep 6, 2019 · Updated 6 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de… ☆12 · Nov 18, 2023 · Updated 2 years ago
- Open-source repository to create HTML reports using the same structure as creating a Streamlit dashboard ☆15 · Mar 4, 2026 · Updated last month
- Code snippets for analytics sessions ☆34 · May 17, 2022 · Updated 3 years ago
- Haraka SMTP plugin for logging outbound traffic. Useful for storing audit information of delivered/bounced emails. ☆16 · Jan 12, 2023 · Updated 3 years ago
- 🚀 A simple JavaScript template for rapid development of GitHub Actions. ☆17 · Feb 24, 2023 · Updated 3 years ago
- DataStream Schema ☆14 · Updated this week
- Nyc_Taxi_Data_Pipeline - DE Project ☆140 · Oct 21, 2024 · Updated last year
- Sample RAG pattern using Azure SQL DB, LangChain and Chainlit ☆34 · Dec 3, 2024 · Updated last year