Insight Data Engineering project: A platform built on HDFS, Spark, and Airflow to help you find social influencers in the GitHub network.
☆16 · May 21, 2024 · Updated last year
Alternatives and similar repositories for Git-Influencer
Users that are interested in Git-Influencer are comparing it to the libraries listed below.
- Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac… ☆10 · Jul 12, 2021 · Updated 4 years ago
- A repo to track data engineering projects ☆13 · Nov 11, 2022 · Updated 3 years ago
- SEJ Article notebooks ☆17 · Nov 12, 2020 · Updated 5 years ago
- Project Search is a Recommendation system for Youtube videos and Amazon products. ☆12 · May 10, 2017 · Updated 8 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆17 · Oct 1, 2019 · Updated 6 years ago
- Tweepy Stream Example ☆19 · Apr 23, 2019 · Updated 6 years ago
- Loan Default Prediction using PySpark, with jobs scheduled by Apache Airflow and Integration with Spark using Apache Livy ☆22 · Dec 26, 2020 · Updated 5 years ago
- Usage examples for byte-genie API ☆12 · Apr 27, 2024 · Updated last year
- This is a capstone project that entails building an end-to-end ETL (Extract-Transform-Load) Data pipeline which extracts UK accident and … ☆18 · Jun 6, 2020 · Updated 5 years ago
- ELT Code for your Data Warehouse ☆26 · Sep 18, 2023 · Updated 2 years ago
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin… ☆25 · Aug 11, 2023 · Updated 2 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority. ☆32 · Aug 14, 2023 · Updated 2 years ago
- JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters. ☆31 · Dec 8, 2022 · Updated 3 years ago
- Spark data pipeline that processes movie ratings data. ☆31 · Mar 1, 2026 · Updated 3 weeks ago
- Interactive Elasticsearch Analyzer ☆13 · Dec 8, 2022 · Updated 3 years ago
- Find mapcodes in a string ☆26 · Jul 24, 2022 · Updated 3 years ago
- This repository contains all the example code to help you build a content aggregator like serverless land. It is split into 2 components:… ☆40 · Sep 23, 2025 · Updated 6 months ago
- COVID-19 Projections Data and Dashboard ☆26 · Dec 8, 2022 · Updated 3 years ago
- Mconf's wiki: https://github.com/mconf/wiki/wiki ☆13 · Apr 30, 2014 · Updated 11 years ago
- This is a Python web scraper implemented using multithreading/multiprocessing/pool for amazon.com ☆28 · Sep 23, 2019 · Updated 6 years ago
- ☆10 · Dec 22, 2018 · Updated 7 years ago
- A simple RFID music player for kids (runs on a Raspberry Pi) ☆11 · Jun 30, 2017 · Updated 8 years ago
- the full stack ☆13 · Jun 16, 2015 · Updated 10 years ago
- Amazon Keyword Suggestion Tool in GoLang. Tool will generate relevant Amazon Product Keywords with the number of active products per each… ☆50 · Jan 3, 2021 · Updated 5 years ago
- Letter avatar is an Angular 2 directive. It will generate an avatar based on given text ☆15 · Oct 31, 2019 · Updated 6 years ago
- A collection of remark plugins used by HashiCorp to process markdown ☆16 · Aug 22, 2025 · Updated 7 months ago
- Annotate your pictures online and save in different formats ☆13 · Oct 4, 2023 · Updated 2 years ago
- Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing reddit comments in realtime. 100-200 level … ☆45 · Apr 20, 2021 · Updated 4 years ago
- Python writable in-memory virtual filesystem for SQLite ☆17 · Jan 6, 2024 · Updated 2 years ago
- This is a version of Li Chen Wang's Palo Alto Tiny BASIC 2.0 for use with the online 8080 emulator and assembler ASM80.com. ☆12 · Oct 10, 2020 · Updated 5 years ago
- jgtextrank: Yet another Python implementation of TextRank ☆13 · Nov 27, 2019 · Updated 6 years ago
- 🥪💾 A sample of data from the `jaffle-shop-generator` that powers the Jaffle Shop spanning one year. ☆15 · Jan 23, 2025 · Updated last year
- ⚡️ A curated list of awesome things related to Infisical ☆23 · Oct 16, 2023 · Updated 2 years ago
- Mine Sweeper with Liveview ☆17 · Jun 5, 2023 · Updated 2 years ago
- Semaphore demo CI/CD pipeline using Docker Compose and Python Flask ☆13 · Jan 26, 2024 · Updated 2 years ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆104 · Dec 3, 2020 · Updated 5 years ago
- This repository contains code associated with an AWS blog which demonstrates how you can accept API keys as a query string parameter in… ☆10 · Feb 18, 2022 · Updated 4 years ago
- 😈Complete End to End ETL Pipeline with Spark, Airflow, & AWS ☆51 · Aug 23, 2019 · Updated 6 years ago
- Implements a gateway that speaks the SparkConnect protocol and drives a backend using Substrait (over ADBC Flight SQL). ☆20 · Feb 10, 2025 · Updated last year