Example project for consuming AWS Kinesis streams and saving the data to Amazon Redshift using Apache Spark
☆11May 22, 2018Updated 7 years ago
Alternatives and similar repositories for spark-kinesis-redshift
Users that are interested in spark-kinesis-redshift are comparing it to the libraries listed below.
- 🚰 A demo of a Lambda function that reads records from a Kinesis stream and puts them into an SQS queue.☆10Jan 1, 2018Updated 8 years ago
- Docker build for AWS DynamoDB☆14Aug 12, 2018Updated 7 years ago
- A PySpark job that handles upserts, converts data to Parquet, and creates partitions on S3☆27Jul 23, 2020Updated 5 years ago
- my favorite project☆17Jul 3, 2023Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)☆17Dec 18, 2018Updated 7 years ago
- Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac…☆10Jul 12, 2021Updated 4 years ago
- Spark application for analyzing Apache access logs and detecting anomalies, with an accompanying Medium article.☆21Jan 30, 2019Updated 7 years ago
- Airflow AWS ECR integration☆10Feb 25, 2020Updated 6 years ago
- My dotfiles☆33Jan 23, 2019Updated 7 years ago
- A repo to track data engineering projects☆13Nov 11, 2022Updated 3 years ago
- An implementation of JSON Web Tokens in Python Tornado☆30Mar 11, 2016Updated 10 years ago
- ☆17Nov 12, 2022Updated 3 years ago
- A Golang API skeleton with GraphQL☆43Sep 2, 2020Updated 5 years ago
- Repository for the Document streaming capstone projects☆12Nov 17, 2025Updated 5 months ago
- Insight Data Engineering project: A platform built in HDFS, Spark and Airflow to help you to find social influencers from GitHub Net…☆16May 21, 2024Updated last year
- My Vim Setup☆26Feb 24, 2023Updated 3 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- ☆10Apr 21, 2021Updated 4 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- Mock aws-sdk API methods to enable easier testing of apps which use the AWS SDK for JavaScript☆14Aug 30, 2016Updated 9 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin…☆15Apr 29, 2021Updated 4 years ago
- Docker build for AWS Kinesis local☆46Feb 20, 2019Updated 7 years ago
- Rasa Chatbot using Django backend and Sockets for communication☆12Dec 8, 2022Updated 3 years ago
- ☆14Sep 14, 2021Updated 4 years ago
- Copy millions of objects in minutes☆12Oct 21, 2019Updated 6 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices.☆11Nov 12, 2021Updated 4 years ago
- Repository for Apache Spark course at Team Data Science☆17Oct 23, 2020Updated 5 years ago
- Power Plant ML Pipeline Application - Apache Spark☆12Dec 12, 2016Updated 9 years ago
- ecommerce GCP Streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin…☆11Mar 9, 2022Updated 4 years ago
- ☆11Jun 15, 2019Updated 6 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR☆90Jul 17, 2019Updated 6 years ago
- A data engineering pipeline for digital marketers.☆11Dec 21, 2018Updated 7 years ago
- PredictorFinc is a scalable supervised machine learning model that predicts stock price change through Decision Tree Regressor using data …☆12Sep 5, 2023Updated 2 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆17Oct 1, 2019Updated 6 years ago
- Labs and demos for courses in the Data Engineer track of GCP Training (http://cloud.google.com/training).☆16Oct 28, 2019Updated 6 years ago
- Building a scalable predictor with akka technologies☆12Apr 1, 2016Updated 10 years ago
- A fast and low memory requirement version of PointHop and PointHop++, which is built upon Apache Spark.☆10Jul 14, 2020Updated 5 years ago
- Tweepy Stream Example☆19Apr 23, 2019Updated 6 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆56May 6, 2023Updated 2 years ago