I am using a Confluent Kafka cluster to produce and consume scraped data. This project is a real-time data pipeline that uses Kafka to move scraped data through processing and load it into S3 in JSON format. It follows a producer-consumer architecture and applies minor transformations so that the data is in the right format before it is loaded into S3.
☆29 · May 2, 2023 · Updated 2 years ago
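The "minor transformations" step described above might look roughly like the following on the consumer side. This is a minimal sketch, not the repository's code: the field names (`symbol`, `price`, `timestamp`), the helper names, and the key layout are all assumptions made for illustration, and the Kafka consumer loop and the S3 upload are left out so the example runs with the standard library alone.

```python
import json
from datetime import datetime, timezone


def normalize_record(raw: bytes) -> dict:
    """Decode one message value and coerce its fields to JSON-friendly types.

    The field names are hypothetical; the real scraped schema is not
    described in the source.
    """
    record = json.loads(raw.decode("utf-8"))
    return {
        "symbol": str(record["symbol"]).strip().upper(),
        "price": float(record["price"]),
        # Store the epoch timestamp as an ISO 8601 string in UTC.
        "scraped_at": datetime.fromtimestamp(
            float(record["timestamp"]), tz=timezone.utc
        ).isoformat(),
    }


def to_json_lines(records: list[dict]) -> str:
    """Serialize a batch as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def object_key(prefix: str, first_record: dict) -> str:
    """Build a date-partitioned object key (assumed layout) for a batch."""
    day = first_record["scraped_at"][:10]  # YYYY-MM-DD part of the timestamp
    return f"{prefix}/dt={day}/batch.json"


# Hypothetical message values, as a consumer loop might receive them.
messages = [
    b'{"symbol": " btc ", "price": "27123.5", "timestamp": 1683000000}',
    b'{"symbol": "eth", "price": "1850", "timestamp": 1683000060}',
]
batch = [normalize_record(m) for m in messages]
body = to_json_lines(batch)
key = object_key("crypto", batch[0])
```

In a full pipeline, `body` and `key` would then be handed to whatever S3 client the consumer uses. Keeping the transformation in pure functions like these makes it testable without a running Kafka cluster.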
Alternatives and similar repositories for real-time_crypto_data_pipeline_using_kafka
Users that are interested in real-time_crypto_data_pipeline_using_kafka are comparing it to the libraries listed below.
- This project involves an ETL (Extract, Transform, Load) process to analyze sleep data exported from Apple Health ☆29 · Apr 29, 2023 · Updated 2 years ago
- sql-for-data-engineering-course ☆18 · May 12, 2023 · Updated 2 years ago
- ☆19 · May 27, 2023 · Updated 2 years ago
- In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data fro… ☆25 · May 6, 2023 · Updated 2 years ago
- ☆146 · Jan 31, 2023 · Updated 3 years ago
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc… ☆23 · May 14, 2022 · Updated 3 years ago
- Apartments Data Pipeline using Airflow and Spark. ☆24 · Mar 28, 2022 · Updated 4 years ago
- Running ECS task for ML prediction orchestrated by Airflow ☆14 · May 4, 2023 · Updated 2 years ago
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆52 · Aug 23, 2019 · Updated 6 years ago
- Pipeline that extracts data from the Spotify API to build a more detailed version of Spotify Wrapped ☆49 · Mar 13, 2026 · Updated last month
- A simple cli tool that deletes files matching an extension within a given directory structure. ☆12 · Sep 27, 2023 · Updated 2 years ago
- ☆16 · May 29, 2023 · Updated 2 years ago
- RedditR for Content Engagement and Recommendation ☆18 · Dec 21, 2017 · Updated 8 years ago
- Capstone Project for the IBM Data Engineering Professional Certification. ☆13 · Mar 7, 2022 · Updated 4 years ago
- Monoscope's Golang client SDK. ☆20 · Mar 1, 2026 · Updated last month
- This project focuses on building a robust data pipeline using Apache Airflow to automate the ingestion of weather data from the OpenWeath… ☆22 · Feb 3, 2026 · Updated 2 months ago
- PySpark Spotify ETL ☆17 · Aug 19, 2021 · Updated 4 years ago
- A highly scalable microservice to handle WhatsApp, SMS and email-based notifications. ☆20 · Mar 29, 2021 · Updated 5 years ago
- Data warehouse implementation for an e-commerce website “Infibeam” that sells digital and consumer electronics. ☆23 · Jan 28, 2018 · Updated 8 years ago
- A dead simple Java REST API (without Spring) to transfer money between accounts ☆15 · Aug 29, 2019 · Updated 6 years ago
- A Hindi Image Captioning system made completely with Transformers🤗 ☆10 · Apr 16, 2024 · Updated 2 years ago
- ☆17 · Dec 14, 2021 · Updated 4 years ago
- R toolbox to explore the TRON blockchain ☆10 · Jul 18, 2021 · Updated 4 years ago
- Data Engineering YouTube Analysis Project by Darshil Parmar ☆238 · Dec 8, 2023 · Updated 2 years ago
- This project aims to build a traveling recommendation application using Google Places API and OpenAI LLM. ☆11 · Mar 19, 2024 · Updated 2 years ago
- Deep Learning Projects on TensorFlow and Keras ☆20 · Jun 13, 2024 · Updated last year
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆25 · Aug 11, 2023 · Updated 2 years ago
- Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS ☆17 · Jan 7, 2023 · Updated 3 years ago
- Project on belief embedding ☆22 · Jun 4, 2025 · Updated 10 months ago
- ☆18 · Oct 31, 2020 · Updated 5 years ago
- ☆17 · Feb 9, 2023 · Updated 3 years ago
- ☆24 · May 5, 2022 · Updated 3 years ago
- Deployed on Expo Go ☆23 · May 22, 2022 · Updated 3 years ago
- A real-time Reddit data streaming pipeline for sentiment analysis of various subreddits ☆146 · Aug 23, 2023 · Updated 2 years ago
- ☆340 · Aug 13, 2024 · Updated last year
- Simple examples of serving HuggingFace models with TensorFlow Serving ☆16 · Oct 21, 2023 · Updated 2 years ago
- PROJECT SPECIFICATION: Machine Learning Capstone Analysis Project. This capstone project involves machine learning modeling and… ☆15 · Mar 28, 2018 · Updated 8 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆29 · Aug 8, 2020 · Updated 5 years ago
- Data-Scenario is a repository designed to help professionals and students master data science by solving real-world problems. Each projec… ☆16 · Oct 16, 2025 · Updated 6 months ago