Example project for consuming AWS Kinesis streaming data and saving it to Amazon Redshift using Apache Spark
☆11 · May 22, 2018 · Updated 7 years ago
Alternatives and similar repositories for spark-kinesis-redshift
Users who are interested in spark-kinesis-redshift are comparing it to the libraries listed below.
- A demo of a Lambda function that reads records from a Kinesis stream and puts them into an SQS queue. ☆10 · Jan 1, 2018 · Updated 8 years ago
- Docker build for AWS DynamoDB ☆14 · Aug 12, 2018 · Updated 7 years ago
- A PySpark job to handle upserts, conversion to Parquet, and partition creation on S3 ☆27 · Jul 23, 2020 · Updated 5 years ago
- my favorite project ☆17 · Jul 3, 2023 · Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS) ☆17 · Dec 18, 2018 · Updated 7 years ago
- Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed Airflow: extrac… ☆10 · Jul 12, 2021 · Updated 4 years ago
- Spark application that analyzes Apache access logs and detects anomalies, with an accompanying Medium article. ☆21 · Jan 30, 2019 · Updated 7 years ago
- Airflow AWS ECR integration ☆10 · Feb 25, 2020 · Updated 6 years ago
- My dotfiles ☆33 · Jan 23, 2019 · Updated 7 years ago
- A repo to track data engineering projects ☆13 · Nov 11, 2022 · Updated 3 years ago
- An implementation of JSON Web Tokens in Python Tornado ☆30 · Mar 11, 2016 · Updated 10 years ago
- ☆17 · Nov 12, 2022 · Updated 3 years ago
- A Golang API skeleton with GraphQL ☆43 · Sep 2, 2020 · Updated 5 years ago
- Repository for the Document streaming capstone projects ☆12 · Nov 17, 2025 · Updated 5 months ago
- Insight Data Engineering project: A platform built on HDFS, Spark and Airflow to help you find social influencers from GitHub Net… ☆16 · May 21, 2024 · Updated last year
- My Vim Setup ☆26 · Feb 24, 2023 · Updated 3 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- ☆10 · Apr 21, 2021 · Updated 4 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks. ☆10 · Oct 8, 2022 · Updated 3 years ago
- Mock aws-sdk API methods to enable easier testing of apps which use the AWS SDK for JavaScript ☆14 · Aug 30, 2016 · Updated 9 years ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 · Apr 29, 2021 · Updated 4 years ago
- Docker build for AWS Kinesis local ☆46 · Feb 20, 2019 · Updated 7 years ago
- Rasa Chatbot using Django backend and Sockets for communication ☆12 · Dec 8, 2022 · Updated 3 years ago
- ☆14 · Sep 14, 2021 · Updated 4 years ago
- Copy millions of objects in minutes ☆12 · Oct 21, 2019 · Updated 6 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices. ☆11 · Nov 12, 2021 · Updated 4 years ago
- Repository for Apache Spark course at Team Data Science ☆17 · Oct 23, 2020 · Updated 5 years ago
- Power Plant ML Pipeline Application - Apache Spark ☆12 · Dec 12, 2016 · Updated 9 years ago
- E-commerce GCP streaming pipeline: Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery and Tableau; GCP Batch pipelin… ☆11 · Mar 9, 2022 · Updated 4 years ago
- ☆11 · Jun 15, 2019 · Updated 6 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆91 · Jul 17, 2019 · Updated 6 years ago
- A data engineering pipeline for digital marketers. ☆11 · Dec 21, 2018 · Updated 7 years ago
- PredictorFinc is a scalable supervised machine learning model that predicts stock price changes with a Decision Tree Regressor using data … ☆12 · Sep 5, 2023 · Updated 2 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as … ☆17 · Oct 1, 2019 · Updated 6 years ago
- Labs and demos for courses in the Data Engineer track of GCP Training (http://cloud.google.com/training). ☆16 · Oct 28, 2019 · Updated 6 years ago
- Building a scalable predictor with Akka technologies ☆12 · Apr 1, 2016 · Updated 10 years ago
- A fast and low memory requirement version of PointHop and PointHop++, which is built upon Apache Spark. ☆10 · Jul 14, 2020 · Updated 5 years ago
- Tweepy Stream Example ☆19 · Apr 23, 2019 · Updated 6 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆56 · May 6, 2023 · Updated 2 years ago