yennanliu/NYC_Taxi_Pipeline

Stream/batch system with Hadoop, Spark on NYC taxi data | #DE

☆26

Alternatives and similar repositories for NYC_Taxi_Pipeline

Users who are interested in NYC_Taxi_Pipeline are comparing it to the repositories listed below.

yennanliu / NYC_Taxi_Trip_Duration
Develop ML models to predict taxi trip duration in NYC. Ranked: Top 6% | RMSLE: 0.377 (Kaggle) | #DS
☆17 · Jan 7, 2023 · Updated 3 years ago
rss161030 / ETL-processes-using-Sqoop-Hadoop-Hive-Spark-and-Scala
I implemented various ETL processes, such as loading data from MySQL to HDFS using Sqoop, transforming the data using Spark and Scala, perfo…
☆10 · Oct 20, 2017 · Updated 8 years ago
ismaildawoodjee / aws-data-pipeline
A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc…
☆24 · May 14, 2022 · Updated 4 years ago
cvilla87 / PySpark-ETL-Telecom
Jupyter Notebook showing how to process Telecom datasets using PySpark (SparkSQL and DataFrames) and plotting the results using Matplotli…
☆17 · Dec 3, 2018 · Updated 7 years ago
yennanliu / web_scraping
Collect and process data from various sources: website / JS website / API. Run the scraping pipeline via Celery and a Travis cron task. Du…
☆15 · Jul 24, 2024 · Updated last year
rvilla87 / ETL-PySpark
ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)
☆17 · Dec 18, 2018 · Updated 7 years ago
v1zh3d / Vadoadra-House-Price-Prediction
A simple linear regression machine learning model and its deployment using Flask. Data-set of Vadodara Hou…
☆10 · Jan 8, 2020 · Updated 6 years ago
alvertogit / bigdata_docker
Big Data Docker Data Science Spark Spark4 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook
☆19 · Updated this week
Tirth27 / Real-time-analytics-with-spark-streaming
This project aims to build a streaming application to perform real-time analytics of Covid-19 related tweets and deploy an ML model for r…
☆14 · Jul 15, 2021 · Updated 5 years ago
maniram-yadav / Big_DataHadoop_Projects
Big data projects implemented by Maniram Yadav
☆50 · May 5, 2018 · Updated 8 years ago
priye-1 / airflow_data_pipeline
☆16 · May 29, 2023 · Updated 3 years ago
jacob1421 / RustCheatersDataPipeline
Data pipeline that scrapes Rust cheater Steam profiles
☆53 · Feb 13, 2022 · Updated 4 years ago
stephen29xie / tweet-streaming-data-pipeline
Real-time streaming data pipeline for Twitter Tweets
☆10 · Jan 31, 2022 · Updated 4 years ago
aravindr18 / RedditR--Insight-Data-Engineering-Project
RedditR for Content Engagement and Recommendation
☆18 · Dec 21, 2017 · Updated 8 years ago
PritomDas / Real-Time-Streaming-Data-Pipeline-and-Dashboard
Building a real-time data pipeline using Apache Kafka, Apache Spark, Hadoop, PostgreSQL, Django and Flexmonster on Docker to track status …
☆24 · Dec 29, 2020 · Updated 5 years ago
woonsangcho / contrast_qgen
Code for 'Contrastive Multi-Document Question Generation'
☆11 · Oct 16, 2022 · Updated 3 years ago
starfleetjames / MinXSS_Beacon_Decoder
Beacon decoder for the MinXSS CubeSat in space; MinXSS-2 launch on 2018-11-19
☆12 · May 3, 2019 · Updated 7 years ago
MaxPoon / Leetcode
My solutions to the algorithm questions on LeetCode.
☆14 · May 9, 2019 · Updated 7 years ago
saboye / Data-Modeling-with-Postgres
A project to design a fact and dimension star schema for optimizing queries on a flight booking database using PostgreSQL, a relational d…
☆12 · Aug 15, 2021 · Updated 4 years ago
Datananas / quill-placeholder-autocomplete
Brings autocomplete to the Quill Placeholder module
☆12 · Sep 28, 2018 · Updated 7 years ago
Siddhartha80 / AI-Powered-Predictive-Maintenance-System-for-Vehicles-with-Real-Time-Data-Visualization-and-Analysis
Gradient Boosting Models on Real-Time Sensor Data for AI-Enhanced Vehicle Predictive Maintenance. By using a web-based interface to forec…
☆19 · Nov 17, 2024 · Updated last year
priye-1 / Real_time_End_to_End_Pipeline_using_Kafka
☆19 · May 27, 2023 · Updated 3 years ago
SourabhSinghRana / real-time_crypto_data_pipeline_using_kafka
I am using a Confluent Kafka cluster to produce and consume scraped data. In this project, I've created a real-time data pipeline that uti…
☆29 · May 2, 2023 · Updated 3 years ago
TanuAgrawal123 / 100DaysOfCode
Round 1 of 100 days of code in order to learn "Python with Django Framework"
☆11 · Oct 18, 2021 · Updated 4 years ago
UCREL / science_parse_py_api
Python API for Science Parse
☆13 · Mar 27, 2021 · Updated 5 years ago
thanh-abaii / groq-deep-researcher
☆12 · Feb 3, 2025 · Updated last year
ArasLabs / pwa-sample-app
This project contains a sample Progressive Web App (PWA) that connects to Aras Innovator via RESTful API and OAuth authentication.
☆10 · Oct 23, 2020 · Updated 5 years ago
devery / devery_contracts
Devery Protocol Smart Contracts
☆12 · Dec 22, 2018 · Updated 7 years ago
ArpiteshSrivastava / spotify-data-engineering-project
In this project, we will build an ETL (Extract, Transform, Load) pipeline using the Spotify API on AWS. The pipeline will retrieve data fro…
☆25 · May 6, 2023 · Updated 3 years ago
katiehuangx / Serious-SQL-Apprenticeship
Serious SQL is a Data With Danny virtual data apprenticeship program.
☆22 · Sep 3, 2021 · Updated 4 years ago
stream-processing-with-spark / notebooks
Interactive Notebooks that support the book
☆40 · Nov 5, 2020 · Updated 5 years ago
IncredibleWeb / React-PWA-Boilerplate
A simple boilerplate for a universal progressive web application including some default components built using React + Redux
☆13 · May 2, 2018 · Updated 8 years ago
Dineshkarthik / real-time-IoT-data-streaming
Real-time IoT data streaming from smartphone sensors
☆11 · Aug 26, 2020 · Updated 5 years ago
RoyMachineLearning / IEEE-CIS-Fraud-Detection
Kaggle Competition: IEEE-CIS-Fraud-Detection
☆10 · Jan 18, 2020 · Updated 6 years ago
synw / fluxmap
A reactive map that handles real-time location updates for multiple devices for Flutter
☆12 · Mar 28, 2020 · Updated 6 years ago
protonx-ai-devs-02 / cuongtm-vietnamese-rag-chatbot
Create a chatbot that provides responses in Vietnamese, focusing on the products offered by a flower shop
☆11 · Nov 14, 2024 · Updated last year
she-who-codes / flutter_celebration
Celebrating 60k followers on Twitter using Flutter and Flare!
☆13 · Sep 17, 2019 · Updated 6 years ago
rodydavis / fb_firestore
Firebase Firestore on Web, Mobile and Desktop
☆12 · Sep 27, 2021 · Updated 4 years ago
rohan20 / flutter-hummingbird-animation
First attempt at using Flare from 2Dimensions
☆16 · Dec 29, 2018 · Updated 7 years ago