An example project that combines Spark Streaming, Kafka, and Parquet to transform JSON objects streamed over Kafka into Parquet files in S3.
☆19 · Jun 22, 2021 · Updated 4 years ago
Alternatives and similar repositories for spark-kafka-parquet-example
Users that are interested in spark-kafka-parquet-example are comparing it to the libraries listed below.
- ☆14 · Nov 3, 2016 · Updated 9 years ago
- Some popular algorithms (DBSCAN, kNN, FM, etc.) on Spark ☆32 · May 29, 2018 · Updated 8 years ago
- Play-ParSeq is a Play module which seamlessly integrates ParSeq with Play Framework ☆17 · May 20, 2023 · Updated 3 years ago
- A search index specialised for LaTeX equations. Developed for latexsearch.com. ☆17 · Jul 15, 2011 · Updated 14 years ago
- An example project using Spark Streaming with Kafka messages and Avro serialization ☆12 · Aug 21, 2015 · Updated 10 years ago
- Demo of a service-oriented application hosted on a Hadoop YARN cluster for HA and scheduling ☆23 · Apr 2, 2018 · Updated 8 years ago
- Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are at-least-once. So … ☆37 · Apr 19, 2017 · Updated 9 years ago
- Problems can be found at https://www.hackerrank.com/domains/shell/bash/ ☆13 · Jan 20, 2015 · Updated 11 years ago
- Working example of consuming Avro data from Kafka with Spark Streaming ☆12 · Feb 21, 2016 · Updated 10 years ago
- An analysis of the Aadhaar dataset using MapReduce and Spark ☆14 · Feb 28, 2018 · Updated 8 years ago
- These are a select few projects related to Big Data Analytics and Management. The projects listed are a combination of both small and big… ☆11 · Oct 11, 2019 · Updated 6 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark ☆15 · Oct 6, 2017 · Updated 8 years ago
- ☆13 · Oct 16, 2020 · Updated 5 years ago
- GraphX example ☆24 · Jan 23, 2016 · Updated 10 years ago
- End-to-end Machine Learning Pipeline demo using Delta Lake, MLflow and AzureML in Azure Databricks ☆18 · Nov 9, 2019 · Updated 6 years ago
- Examples and exercises for object-oriented design principles ☆18 · Jun 29, 2021 · Updated 4 years ago
- Example of running the Flume log4j appender using CDH4 Flume ☆15 · Jan 17, 2013 · Updated 13 years ago
- Reusable code for Hive ☆16 · Aug 19, 2014 · Updated 11 years ago
- Hive Web Interface ☆30 · Apr 29, 2014 · Updated 12 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Jun 28, 2020 · Updated 5 years ago
- Tool to dump all GPS traces collected by/for the OpenStreetMap project. ☆25 · Mar 6, 2019 · Updated 7 years ago
- Official Apache Solr reference manual ☆14 · Sep 16, 2015 · Updated 10 years ago
- A Spark metrics sink that pushes to InfluxDB ☆51 · Jan 14, 2021 · Updated 5 years ago
- Spark structured streaming with Kafka data source and writing to Cassandra