Spark stream from Kafka (JSON) to S3 (Parquet)
☆15Nov 8, 2018Updated 7 years ago
Alternatives and similar repositories for jaquet
Users who are interested in jaquet are comparing it to the repositories listed below
- Spark self-study handbook covering Spark Core, Spark SQL, Spark Streaming, spark-kafka, and delta-lake, along with basic Scala exercises and source-code analyses (e.g. master, shuffle), summaries, and translations.☆18Jul 19, 2023Updated 2 years ago
- How to use Parquet in Flink☆32May 2, 2017Updated 8 years ago
- I'll munch some data here☆12Jun 18, 2021Updated 4 years ago
- ☆10Feb 12, 2020Updated 6 years ago
- An AI spreadsheet-processing tool built on FastAPI + LangChain + OpenAI API + Vue for intelligently processing and analyzing tabular data.☆17Jul 14, 2025Updated 7 months ago
- ☆11Mar 27, 2024Updated last year
- A Scala library for locality sensitive hashing☆14Aug 1, 2018Updated 7 years ago
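- Converts JSON or SQL into Flink or Spark streaming/batch jobs☆12Jun 21, 2022Updated 3 years ago
- Queries the Spark REST API for applications, jobs, stages, executors, RDDs, streaming, environment, and other information to provide monitoring and alerting services☆11Nov 22, 2018Updated 7 years ago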
- Unix tee, but for Kinesis streams☆12Oct 19, 2021Updated 4 years ago
- A holding place for examples/proofs of concepts/etc for static site generators.☆12Feb 24, 2017Updated 9 years ago
- Big data fundamentals with Apache Hadoop☆13Jul 1, 2022Updated 3 years ago
- Translation of the QuickCheck properties in the paper "How to specify it!" by John Hughes into clojure test.check☆10Jul 19, 2019Updated 6 years ago
- Movie Recommendation System Using Spark ML, Akka and Cassandra☆12Oct 4, 2019Updated 6 years ago
- A Python + NLTK Text Mining Open Course☆11Feb 3, 2018Updated 8 years ago
- A tutorial that explains how to build a simple distributed fault-tolerant framework on top of Mesos☆47Oct 4, 2022Updated 3 years ago
- Code and architecture diagrams for performance testing a few API approaches on AWS☆10Apr 20, 2019Updated 6 years ago
- Skeleton project for Apache Airflow training participants to work on.☆17Jul 9, 2020Updated 5 years ago
- Demo for learning Apache Flink☆10Jun 21, 2017Updated 8 years ago
- ☆12Mar 15, 2022Updated 3 years ago
- Dataset for binary classification☆11Oct 24, 2015Updated 10 years ago
- IoT Trucking App with Flink (with Table API & SQL)☆14Jul 4, 2018Updated 7 years ago
- ☆18Sep 7, 2014Updated 11 years ago
- A-Frame pipe for interop with Angular☆11Apr 18, 2018Updated 7 years ago
- A minimal Apache Hive server in a Docker image☆13Dec 24, 2020Updated 5 years ago
- Simulation of job offers and CVs with real-time processing, classification, and analytics using Kafka, Ray, Spark, and Databricks. Includ…☆14Dec 25, 2024Updated last year
- Sanic application fully integrated with Motor + UMongo☆10Aug 6, 2022Updated 3 years ago
- ☆14Nov 3, 2016Updated 9 years ago
- IntelliJ IDEA in a Docker container☆11Dec 9, 2014Updated 11 years ago
- A sample using spring-boot-spark☆11Aug 3, 2018Updated 7 years ago
- My HackerRank Solutions : https://www.hackerrank.com/RohanKhude☆12Jul 13, 2016Updated 9 years ago
- Workshop for Spark and Databricks☆54Dec 6, 2019Updated 6 years ago
- Simple riemann query tool written in Go.☆21Dec 2, 2016Updated 9 years ago
- UDF, GenericUDF, UDTF, UDAF☆12Jul 1, 2022Updated 3 years ago
- MOAI, an Open Access Server Platform for Institutional Repositories☆15Apr 21, 2023Updated 2 years ago
- Reading notes | Stream Processing with Apache Flink | Statistical Learning Methods☆12May 29, 2020Updated 5 years ago
- Hexagonal Binning for Qlik Sense, based on hexbin.js☆12Jun 16, 2016Updated 9 years ago
- A Spark datasource for the HadoopCryptoLedger library☆13Sep 29, 2025Updated 5 months ago
- Helium hotspot stats & leaderboards for your Discord server☆13Jan 4, 2022Updated 4 years ago