Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark
☆11 · May 22, 2018 · Updated 7 years ago
Alternatives and similar repositories for spark-kinesis-redshift
Users who are interested in spark-kinesis-redshift are comparing it to the libraries listed below.
- A demo of a Lambda function that reads records from a Kinesis stream and puts them into an SQS queue. ☆10 · Jan 1, 2018 · Updated 8 years ago
- Docker build for AWS DynamoDB ☆14 · Aug 12, 2018 · Updated 7 years ago
- A PySpark job to handle upserts, convert to Parquet, and create partitions on S3 ☆28 · Jul 23, 2020 · Updated 5 years ago
- my favorite project ☆17 · Jul 3, 2023 · Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆17 · Dec 18, 2018 · Updated 7 years ago
- Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed airflow: extrac… ☆10 · Jul 12, 2021 · Updated 4 years ago
- Spark application for analyzing Apache access logs and detecting anomalies, along with a Medium article. ☆21 · Jan 30, 2019 · Updated 7 years ago
- Airflow AWS ECR integration ☆10 · Feb 25, 2020 · Updated 6 years ago
- My dotfiles ☆33 · Jan 23, 2019 · Updated 7 years ago
- A repo to track data engineering projects ☆13 · Nov 11, 2022 · Updated 3 years ago
- An implementation of JSON Web Tokens in Python Tornado ☆30 · Mar 11, 2016 · Updated 10 years ago
- ☆17 · Nov 12, 2022 · Updated 3 years ago
- A Golang API skeleton with GraphQL ☆43 · Sep 2, 2020 · Updated 5 years ago
- Repository for the Document streaming capstone projects ☆12 · Nov 17, 2025 · Updated 4 months ago
- Insight Data Engineering project: A platform built in HDFS, Spark and Airflow to help you to find social influencers from GitHub Net… ☆16 · May 21, 2024 · Updated last year
- My Vim Setup ☆26 · Feb 24, 2023 · Updated 3 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- ☆10 · Apr 21, 2021 · Updated 4 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks. ☆10 · Oct 8, 2022 · Updated 3 years ago
- Mock aws-sdk API methods to enable easier testing of apps which use the AWS SDK for JavaScript ☆14 · Aug 30, 2016 · Updated 9 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 · Apr 29, 2021 · Updated 4 years ago
- Rasa Chatbot using Django backend and Sockets for communication ☆12 · Dec 8, 2022 · Updated 3 years ago
- Docker build for AWS Kinesis local ☆46 · Feb 20, 2019 · Updated 7 years ago
- ☆14 · Sep 14, 2021 · Updated 4 years ago
- Copy millions of objects in minutes ☆12 · Oct 21, 2019 · Updated 6 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices. ☆11 · Nov 12, 2021 · Updated 4 years ago
- Power Plant ML Pipeline Application - Apache Spark ☆12 · Dec 12, 2016 · Updated 9 years ago
- ecommerce GCP Streaming pipeline - Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin… ☆11 · Mar 9, 2022 · Updated 4 years ago
- Repository for Apache Spark course at Team Data Science ☆17 · Oct 23, 2020 · Updated 5 years ago
- ☆11 · Jun 15, 2019 · Updated 6 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆89 · Jul 17, 2019 · Updated 6 years ago
- PredictorFinc is a scalable supervised machine learning model that predicts stock price change through Decision Tree Regressor using data … ☆12 · Sep 5, 2023 · Updated 2 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆17 · Oct 1, 2019 · Updated 6 years ago
- Labs and demos for courses in the Data Engineer track of GCP Training (http://cloud.google.com/training). ☆16 · Oct 28, 2019 · Updated 6 years ago
- Building a scalable predictor with akka technologies ☆12 · Apr 1, 2016 · Updated 9 years ago
- A fast and low memory requirement version of PointHop and PointHop++, which is built upon Apache Spark. ☆10 · Jul 14, 2020 · Updated 5 years ago
- Tweepy Stream Example ☆19 · Apr 23, 2019 · Updated 6 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆56 · May 6, 2023 · Updated 2 years ago
- PySpark functions and utilities with examples. Assists ETL process of data modeling ☆104 · Dec 3, 2020 · Updated 5 years ago