Insight Data Engineering project: A platform built on HDFS, Spark, and Airflow to help you find social influencers in the GitHub network.
☆16May 21, 2024Updated 2 years ago
Alternatives and similar repositories for Git-Influencer
Users who are interested in Git-Influencer are comparing it to the libraries listed below.
- Python script to download hundreds of images from 'Google Images'. It is ready-to-run code!☆14Sep 12, 2021Updated 4 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin…☆15Apr 29, 2021Updated 5 years ago
- SEJ Article notebooks☆16Nov 12, 2020Updated 5 years ago
- Project Search is a recommendation system for YouTube videos and Amazon products.☆12May 10, 2017Updated 9 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆17Oct 1, 2019Updated 6 years ago
- Loan Default Prediction using PySpark, with jobs scheduled by Apache Airflow and Integration with Spark using Apache Livy☆22Dec 26, 2020Updated 5 years ago
- Use an AWS Glue Python Shell Job to connect to your Amazon Redshift cluster and execute a SQL script stored in Amazon S3.☆21Aug 8, 2022Updated 3 years ago
- This is a capstone project that entails building an end-to-end ETL (Extract-Transform-Load) Data pipeline which extracts UK accident and …☆18Jun 6, 2020Updated 6 years ago
- A production-grade data pipeline has been designed to automate the parsing of user search patterns to analyze user engagement. Extract d…☆24Nov 22, 2021Updated 4 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.☆29Aug 14, 2023Updated 2 years ago
- Demonstration of using Apache Spark to build robust ETL pipelines while taking advantage of open source, general purpose cluster computin…☆24Aug 11, 2023Updated 2 years ago
- I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti…☆29May 2, 2023Updated 3 years ago
- JobAnalytics system consumes data from multiple sources and provides valuable information to both job hunters and recruiters.☆30Dec 8, 2022Updated 3 years ago
- Spark data pipeline that processes movie ratings data.☆31May 1, 2026Updated last month
- Final and skeleton code for the clothing similarity walkthrough☆10Jan 20, 2016Updated 10 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D…☆29Aug 8, 2020Updated 5 years ago
- Interactive Elasticsearch Analyzer☆13Dec 8, 2022Updated 3 years ago
- Find mapcodes in a string☆25Jul 24, 2022Updated 3 years ago
- This repository contains all the example code to help you build a content aggregator like serverless land. It is split into 2 components:…☆39Sep 23, 2025Updated 8 months ago
- COVID-19 Projections Data and Dashboard☆26Dec 8, 2022Updated 3 years ago
- Mconf's wiki: https://github.com/mconf/wiki/wiki☆13Apr 30, 2014Updated 12 years ago
- Project files for the post: Running PySpark Applications on Amazon EMR using Apache Airflow: Using the new Amazon Managed Workflows for A…☆41Jul 6, 2022Updated 3 years ago
- This is a Python web scraper for amazon.com, implemented using multithreading/multiprocessing/pool☆28Sep 23, 2019Updated 6 years ago
- A libcluster strategy for Digital Ocean Droplets☆12May 11, 2023Updated 3 years ago
- A simple RFID music player for kids (runs on a Raspberry Pi)☆11Jun 30, 2017Updated 8 years ago
- general-purpose fast, stateless, and deterministic feature extractor written in golang for use in machine learning☆12Mar 17, 2018Updated 8 years ago
- An implementation of the QUIC protocol in Elixir☆13Mar 17, 2019Updated 7 years ago
- Amazon Keyword Suggestion Tool in GoLang. Tool will generate relevant Amazon Product Keywords with the number of active products per each…☆50Jan 3, 2021Updated 5 years ago
- Letter avatar is an Angular 2 directive. It generates an avatar based on the given text☆15Oct 31, 2019Updated 6 years ago
- A Yeoman generator for creating a FeathersJS plugin.☆22Aug 16, 2021Updated 4 years ago
- Pure Elixir implementation of Sha3 and the original Keccak1600-f☆16Jan 20, 2026Updated 4 months ago
- A collection of remark plugins used by HashiCorp to process markdown☆16Aug 22, 2025Updated 9 months ago
- Resources, notebooks, assets for ML for Everyone Twitch stream☆14Jul 8, 2020Updated 5 years ago
- Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing reddit comments in realtime. 100-200 level …☆45Apr 20, 2021Updated 5 years ago
- Jupyter notebook + Code for scraping AngelList data and making an interactive chart of SFBA salaries/equity☆14Jun 1, 2016Updated 10 years ago
- ✋ Stop propagation for everyday events with Angular directives 🎩☆13Feb 4, 2018Updated 8 years ago
- Collection of Jupyter Notebooks in Python to monitor and improve your Watson Assistant workspaces☆10Jul 17, 2019Updated 6 years ago
- Material used during the course "Introducción al Procesamiento Natural con Python" given by the Grupo de Ingeniería Lingüística at UNAM.☆17Apr 11, 2022Updated 4 years ago
- A small Python library for validating data with pandas☆21Jun 13, 2019Updated 7 years ago