A data engineering project (Twitter monitor app)
☆87Jun 27, 2022Updated 3 years ago
Alternatives and similar repositories for spark_app_twitter
Users who are interested in spark_app_twitter are comparing it to the libraries listed below.
- Pipeline that extracts data from the Spotify API to build a more detailed version of Spotify Wrapped☆49Mar 13, 2026Updated 2 months ago
- This data project can be used as a take-home assignment to learn PySpark and Data Engineering.☆19Feb 19, 2023Updated 3 years ago
- This is the final project completed after participating in the Data Engineering Zoomcamp☆11Apr 4, 2022Updated 4 years ago
- Data Engineering Project to Extract and Process Solana Reddit Data☆40Feb 3, 2024Updated 2 years ago
- This repository is to show my Data Analytics & Engineering skills, share projects, and track my progress.☆66Jun 25, 2023Updated 2 years ago
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc…☆24May 14, 2022Updated 4 years ago
- Big Data Engineering & Analytics Project☆36Nov 6, 2020Updated 5 years ago
- RedditR for Content Engagement and Recommendation☆18Dec 21, 2017Updated 8 years ago
- Pipeline that extracts data from Crinacle's Headphone and InEarMonitor databases and finalizes data for a Metabase Dashboard. The dashboa…☆267Jan 1, 2023Updated 3 years ago
- Fastify and MessagePack, together at last. Uses @msgpack/msgpack by default.☆10May 18, 2021Updated 5 years ago
- A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.☆255Dec 19, 2025Updated 5 months ago
- Data pipeline that scrapes Rust cheater Steam profiles☆54Feb 13, 2022Updated 4 years ago
- My first attempt at a rough ETL pipeline; technologies include spark, GCS, prefect orchestration, and terraform☆14Oct 12, 2022Updated 3 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!☆877Apr 16, 2022Updated 4 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆20Aug 12, 2025Updated 9 months ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆15Jun 26, 2023Updated 2 years ago
- Code Repository for my 3rd Data Project.☆16Jun 13, 2023Updated 2 years ago
- SCIM 2.0 JAVA development kit☆19May 2, 2025Updated last year
- I developed an ETL project on files from different sources (CSV, JSON). Then I used FastAPI to create an API that allows re…☆10Dec 9, 2022Updated 3 years ago
- An end-to-end data engineering pipeline to create a dashboard for the latest content on the r/Stocks subreddit☆20Aug 5, 2022Updated 3 years ago
- Ansible role for deploying k3s cluster☆41Feb 12, 2026Updated 3 months ago
- Data Engineering project using Databricks PySpark & Spark SQL for analysing data from Spotify API and present in form of PowerBI report☆51Nov 26, 2025Updated 6 months ago
- This is a demo project comparing two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster☆15Sep 9, 2021Updated 4 years ago
- Sample project to demonstrate data engineering best practices☆219Feb 24, 2024Updated 2 years ago
- Data engineering project using UK Bus Open Data Service (BODS) to calculate late buses in real-time for any selected region in England. P…☆31Apr 2, 2023Updated 3 years ago
- Twitch Stream Analysis with Apache Spark and Apache Zeppelin☆12Aug 8, 2016Updated 9 years ago
- Social Media Analysis, scalable solution, flexible deployment that analyses social media contents☆10Jul 20, 2023Updated 2 years ago
- ⚡ An Augmented Reality real-world length measuring web application built by the modification of the example being provided by babylonjs -…☆12Sep 24, 2020Updated 5 years ago
- Resources and projects from Udacity Data Engineering with AWS nano degree programme☆29Apr 12, 2023Updated 3 years ago
- Custom Golang proxy inspired by nginx proxy and traefik proxy☆20Nov 17, 2021Updated 4 years ago
- Using Apache Airflow to author, run and monitor complex data pipelines.☆12Oct 24, 2018Updated 7 years ago
- Docktor is a Web App that deploys an easy-to-use kit of analysis and scanning tools.☆14Nov 1, 2023Updated 2 years ago
- Some example projects for Data Engineers to build, end-to-end.☆39Nov 8, 2023Updated 2 years ago
- ☆11Dec 12, 2019Updated 6 years ago
- End to end data engineering project☆59Oct 27, 2022Updated 3 years ago
- Houston orchestration API. callhouston.io☆50Jun 16, 2025Updated 11 months ago
- An end-to-end project on customer segmentation☆22Mar 10, 2022Updated 4 years ago
- Repo that will help you explore how to build a hybrid workflow using Apache Airflow and Amazon ECS Anywhere☆11Jul 12, 2022Updated 3 years ago
- Amazon Bedrock AgentCore – Multi Framework Examples☆49Sep 24, 2025Updated 8 months ago