pran4ajith/spark-twitter-streaming

A real-time ETL pipeline that streams Twitter data and performs sentiment analysis on it using Apache Kafka, Apache Spark, and Delta Lake.

☆29

Alternatives and similar repositories for spark-twitter-streaming

Users who are interested in spark-twitter-streaming are comparing it to the repositories listed below.

anthonywong611 / Batch-ETL-with-AWS-EMR-and-MWAA
Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac…
☆10 · Jul 12, 2021 · Updated 5 years ago
supratim94336 / DataEngineeringCapstoneProject
😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS
☆51 · Aug 23, 2019 · Updated 6 years ago
guidok91 / spark-movies-etl
Spark data pipeline that processes movie ratings data.
☆31 · Updated this week
sizrailev / life-around-data-code
Code snippets and tools published on the blog at lifearounddata.com
☆12 · Jan 19, 2020 · Updated 6 years ago
Apress / beginning-blockchain
Source Code for 'Beginning Blockchain' by Bikramaditya Singhal, Gautam Dhameja, and Priyansu Sekhar Panda
☆10 · May 17, 2024 · Updated 2 years ago
saeed349 / quant_infra
Explore building an advanced infrastructure for enhancing QuantConnect with Snowflake, Databricks, Airflow & AWS. Learn the basics of qua…
☆15 · Jan 27, 2024 · Updated 2 years ago
vsouza / spark-kinesis-redshift
Example project for consuming AWS Kinesis streaming data and saving it to Amazon Redshift using Apache Spark
☆11 · May 22, 2018 · Updated 8 years ago
luigiselmi / flink-kafka-consumer
A consumer of a Kafka topic based on Flink
☆12 · Oct 5, 2022 · Updated 3 years ago
ably-labs / Realtime-ticket-booking-solution
A simple demo showing how to use Ably and FastAPI to route messages into Kafka for stream processing
☆16 · Oct 12, 2021 · Updated 4 years ago
drisskhattabi6 / Real-Time-Twitter-Sentiment-Analysis
This repo contains a Big Data project; it's about "Real Time Twitter Sentiment Analysis via Kafka, Spark Streaming, MongoDB and Django Dashb…
☆47 · Jul 5, 2026 · Updated 3 weeks ago
vanhoangkha / aws-first-cloud-journey
🚀 Complete AWS learning path for beginners - 45K+ community resource with hands-on labs, workshops, and certification guides
☆16 · Jun 13, 2026 · Updated last month
ibaiGorordo / Tensorflow-Mobile-Generic-Object-Localizer
Python Tensorflow 2 scripts for detecting objects of any class in an image without knowing their label.
☆16 · Sep 18, 2021 · Updated 4 years ago
analyticsdurgesh / StreamCommerce-Lakehouse-360
Production-style real-time e-commerce lakehouse with Kafka, Airflow, Databricks, Medallion architecture, data quality, quarantine, Terraf…
☆31 · May 30, 2026 · Updated last month
jerzygangi / forklift
🚚 ETL for Spark and Airflow
☆25 · Mar 19, 2018 · Updated 8 years ago
xdevplatform / enterprise-scripts-python
Sample Python scripts to help get started with the Twitter Enterprise APIs
☆25 · Feb 8, 2023 · Updated 3 years ago
itversity / mastering-redshift
☆16 · Jul 31, 2022 · Updated 3 years ago
Apress / pro-power-bi-desktop
Source code for 'Pro Power BI Desktop' by Adam Aspin
☆13 · Mar 28, 2017 · Updated 9 years ago
zhuiyuan616124 / ctr-paper-list
☆12 · Nov 2, 2020 · Updated 5 years ago
shravan-kuchkula / dataEngineering
A repo to track data engineering projects
☆14 · Nov 11, 2022 · Updated 3 years ago
chuqiaoshen / Git-Influencer
Insight Data Engineering project: A platform built in HDFS, Spark and Airflow to help you to find social influencers from GitHub Net…
☆16 · May 21, 2024 · Updated 2 years ago
TrivadisPF / dockerfiles
Dockerfiles maintained by Trivadis Platform Factory
☆12 · Mar 13, 2020 · Updated 6 years ago
longNguyen010203 / Youtube-Recommend-Master-ETL-Pipeline
A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc…
☆25 · Nov 19, 2024 · Updated last year
AuFeld / Data_Engineering_Projects
A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin…
☆15 · Apr 29, 2021 · Updated 5 years ago
asatrya / airflow-etl-learn
This is a simple ETL using Airflow. First, we fetch data from API (extract). Then, we drop unused columns, convert to CSV, and validate (…
☆24 · Oct 12, 2019 · Updated 6 years ago
otuoma / dukapoint
Simple, easy to use django-based point of sale system
☆16 · Jan 8, 2026 · Updated 6 months ago
jackgisby / tfl-bikes-data-pipeline
Processing TfL data for bike usage with Google Cloud Platform.
☆45 · Jul 15, 2022 · Updated 4 years ago
kunal333 / E2ESynapseDemo
☆27 · Mar 7, 2022 · Updated 4 years ago
vim89 / datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…
☆56 · May 6, 2023 · Updated 3 years ago
Apress / pro-power-bi-desktop-2018
Source code for 'Pro Power BI Desktop' by Adam Aspin
☆22 · Dec 4, 2017 · Updated 8 years ago
joaocastanheira94 / rl_vrp
Using reinforcement learning to solve VRP
☆10 · Jul 6, 2020 · Updated 6 years ago
Apress / power-query-for-power-bi-excel
Source code for 'Power Query for Power BI and Excel' by Christopher Webb and Crossjoin Consulting Limited
☆19 · Aug 18, 2017 · Updated 8 years ago
mcastellin / yt-docker-tricks-examples
A repository to store example files and projects for my YouTube series **Docker Development Tips & Tricks**
☆13 · Dec 1, 2021 · Updated 4 years ago
arezamoosavi / AcidOnSpark-ETL
Delta-Lake, ETL, Spark, Airflow
☆50 · Oct 9, 2022 · Updated 3 years ago
hyunjoonbok / PySpark
PySpark functions and utilities with examples. Assists ETL process of data modeling
☆103 · Dec 3, 2020 · Updated 5 years ago
jonathanhayes / Tweepy-Twitter-Stream-Example
Tweepy Stream Example
☆19 · Apr 23, 2019 · Updated 7 years ago
PacktPublishing / Data-Engineering-with-AWS-Cookbook
Data Engineering with AWS Cookbook, published by Packt
☆27 · Apr 13, 2026 · Updated 3 months ago
alanchn31 / Loan-Default-Prediction
Loan Default Prediction using PySpark, with jobs scheduled by Apache Airflow and Integration with Spark using Apache Livy
☆22 · Dec 26, 2020 · Updated 5 years ago
kenhanscombe / project-postgres
Udacity data engineering nanodegree project
☆22 · Nov 8, 2019 · Updated 6 years ago
vinniepsychosis / ETL-Apple-Health
This project involves an ETL (Extract, Transform, Load) process to analyze sleep data exported from Apple Health
☆29 · Apr 29, 2023 · Updated 3 years ago