justinrmiller/spark-kafka-parquet-example

An example project that combines Spark Streaming, Kafka, and Parquet to transform JSON objects streamed over Kafka into Parquet files in S3.

☆19

Alternatives and similar repositories for spark-kafka-parquet-example

Users who are interested in spark-kafka-parquet-example are comparing it to the repositories listed below.

endymecy / AlgorithmsOnSpark
Some popular algorithms (DBSCAN, kNN, FM, etc.) on Spark
☆32 · May 29, 2018 · Updated 8 years ago
neoremind / app-on-yarn-demo
Demo of a service-oriented application hosted on a Hadoop YARN cluster for HA and scheduling
☆23 · Apr 2, 2018 · Updated 8 years ago
devmindset / sparkscalainterview
Contains solutions to interview questions
☆12 · May 18, 2018 · Updated 8 years ago
seanpquig / confluent-platform-spark-streaming
Working example of consuming Avro data from Kafka with Spark Streaming
☆12 · Feb 21, 2016 · Updated 10 years ago
simplesourcing / simplesource-examples
Simple Sourcing example applications
☆13 · Dec 8, 2022 · Updated 3 years ago
vineetpandey / HackerRank---The-Linux-Shell-Problems_Solutions
Problems can be found at https://www.hackerrank.com/domains/shell/bash/
☆13 · Jan 20, 2015 · Updated 11 years ago
stdatalabs / aadhaar-dataset-analysis
An analysis of the Aadhaar dataset using MapReduce and Spark
☆14 · Feb 28, 2018 · Updated 8 years ago
superaghu / LeetCodeLocally
☆13 · Oct 16, 2020 · Updated 5 years ago
caroljmcdonald / sparkgraphxexample
GraphX example
☆24 · Jan 23, 2016 · Updated 10 years ago
mdrakiburrahman / azure-databricks-malware-prediction
End-to-end Machine Learning Pipeline demo using Delta Lake, MLflow and AzureML in Azure Databricks
☆18 · Nov 9, 2019 · Updated 6 years ago
mdrakiburrahman / databricks-certification
My study guide used to pass the CRT020 Spark Certification exam
☆34 · Jan 6, 2020 · Updated 6 years ago
ayjindal / SOLID
Examples and exercises for object-oriented design principles
☆18 · Jun 29, 2021 · Updated 5 years ago
Dax1n / flinkdevelop
Demo for learning Apache Flink
☆10 · Jun 21, 2017 · Updated 9 years ago
krallistic / druid-kubernetes
☆35 · Dec 2, 2016 · Updated 9 years ago
ansrivas / spark-structured-streaming
Spark Structured Streaming with a Kafka data source, writing to Cassandra
☆62 · Dec 5, 2019 · Updated 6 years ago
bartosz25 / spark-scala-playground
Sample processing code using Spark 2.1+ and Scala
☆51 · Jun 28, 2020 · Updated 6 years ago
yintaoxue / solr-ref-guide-zh
Official Apache Solr reference guide
☆14 · Sep 16, 2015 · Updated 10 years ago
one-leaf / tensorflow
Some machine learning practice exercises
☆11 · Jun 29, 2022 · Updated 4 years ago
palantir / spark-influx-sink
A Spark metrics sink that pushes to InfluxDB
☆51 · Jan 14, 2021 · Updated 5 years ago
Angel-ML / sona
Spark On Angel, arming Spark with a powerful Parameter Server, which enables Spark to train very large models
☆85 · Jan 2, 2023 · Updated 3 years ago
alibaba-archive / aliyun-oss-hadoop-fs
Hadoop filesystem implementation for Aliyun OSS
☆13 · Feb 14, 2016 · Updated 10 years ago
caroljmcdonald / mapr-sparkml-streaming-uber
☆20 · Feb 28, 2018 · Updated 8 years ago
xcodebuild / localapp
Rust CLI that converts a webpage into a desktop app under 3 MB using Tauri
☆13 · Jun 16, 2022 · Updated 4 years ago
phatak-dev / flink-examples
Flink Examples
☆39 · Apr 27, 2016 · Updated 10 years ago
leno1001 / spark_monitor
Queries the Spark REST API for applications, jobs, stages, executors, RDDs, streaming, environment, and other information to provide monitoring and alerting services
☆11 · Nov 22, 2018 · Updated 7 years ago
mkt-Do / replicated-clickhouse
☆10 · Feb 12, 2020 · Updated 6 years ago
HeartSaVioR / iot-trucking-app-flink
IoT Trucking App with Flink (with Table API & SQL)
☆14 · Jul 4, 2018 · Updated 8 years ago
mapr-demos / SparkStreamingHBaseExample
Spark Streaming HBase Example
☆22 · May 20, 2026 · Updated 2 months ago
jerryygit / ZRDSL
Converts JSON or SQL into Flink or Spark streaming/batch jobs
☆12 · Jun 21, 2022 · Updated 4 years ago
axetroy / wxapp-dev-tool-for-linux
WeChat Mini Program developer tool for Linux. The source code matches the official version.
☆16 · May 9, 2017 · Updated 9 years ago
kinoplan / utils
A set of tools that make working with the Scala ecosystem even better.
☆13 · Jul 21, 2026 · Updated last week
ELC / cookiecutter-python-fullstack
Generate a Full Stack Python Web App - Choose the framework you want Vue, React, Angular - Can be run in a single container or without Do…
☆13 · Aug 22, 2021 · Updated 4 years ago
rohgar / scala-parallel-programming-3
☆21 · Feb 9, 2017 · Updated 9 years ago
zhangjr-gaoyou / spring-boot-spark-demo
A sample using spring-boot-spark
☆11 · Aug 3, 2018 · Updated 7 years ago
cmoore-sp / plsql-markdown-2-html
PL/SQL package that converts Markdown to HTML
☆13 · May 5, 2017 · Updated 9 years ago
xuchunyang / one.el
Take a peek at HN/知乎日报/V2EX/SBBS within Emacs
☆13 · Jun 7, 2015 · Updated 11 years ago
lucasbak / kafka-spark-streaming
Project for reading data from Kafka and writing to Kafka and HBase with Kerberos
☆24 · Dec 8, 2016 · Updated 9 years ago
turinglabsorg / ipdb
Interplanetary Database: a database built on top of IPFS and made immutable using the Ethereum blockchain.
☆10 · Sep 19, 2022 · Updated 3 years ago
datahappy1 / csv_to_parquet_converter
CSV-to-Parquet (and vice versa) file converter based on pandas, written in Python 3
☆10 · Mar 23, 2021 · Updated 5 years ago