Bucketing and partitioning system for Parquet
☆30 · May 22, 2018 · Updated 7 years ago
Alternatives and similar repositories for pucket
Users that are interested in pucket are comparing it to the libraries listed below
- gitkv is a server for using git as a key value store for text files ☆14 · Jul 17, 2023 · Updated 2 years ago
- Html Content / Article Extractor in Scala ☆18 · May 23, 2018 · Updated 7 years ago
- Set of micro libraries in scala. ☆12 · Dec 18, 2024 · Updated last year
- 🥧 Generates classic British dishes ☆13 · Dec 27, 2022 · Updated 3 years ago
- Temporal Random Indexing ☆14 · Oct 3, 2024 · Updated last year
- Hadoop InputFormat for http://druid.io/ ☆10 · Oct 26, 2016 · Updated 9 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Aug 17, 2015 · Updated 10 years ago
- A web-latency SQL spout for Hadoop. ☆50 · Jan 25, 2021 · Updated 5 years ago
- A NiFi client library for JVM languages ☆13 · Mar 18, 2016 · Updated 9 years ago
- ☆17 · Jan 25, 2017 · Updated 9 years ago
- A way to convert Thrift Services and Functions into Human Readable JSON ☆18 · Feb 4, 2018 · Updated 8 years ago
- ☆19 · Sep 8, 2017 · Updated 8 years ago
- ☆21 · Mar 17, 2023 · Updated 2 years ago
- scripts to quickly measure system baseline performance ☆23 · Aug 1, 2025 · Updated 7 months ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- Hadoop Profiler, or hprofiler, is a tool which is able to analyze on- and off-CPU workloads on distributed computing environments. ☆24 · Jul 7, 2016 · Updated 9 years ago
- Spark DataFrames for earth observation data ☆19 · May 1, 2018 · Updated 7 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23 · Apr 18, 2017 · Updated 8 years ago
- Swagger Support For Finatra ☆29 · Dec 20, 2024 · Updated last year
- Spark job to perform massive Point in Polygon (PiP) operations ☆32 · Mar 19, 2017 · Updated 8 years ago
- A comprehensive Message Control Protocol (MCP) server for Kafka Schema Registry. ☆31 · Updated this week
- Cybersecurity Psychology Framework ☆19 · Feb 22, 2026 · Updated last week
- Streaming Analytics platform, built with Apache Flink and Kafka ☆36 · Oct 6, 2023 · Updated 2 years ago
- Ambari service for Apache Zeppelin notebook ☆70 · Aug 29, 2017 · Updated 8 years ago
- Kubernetes Operator for the Ververica Platform ☆35 · Jan 19, 2023 · Updated 3 years ago
- dregex is a Java library that implements a regular expression engine using deterministic finite automata (DFA). It supports some Perl-sty… ☆49 · Feb 16, 2026 · Updated last week
- Repository for publishing the materials from the JSAI2019 tutorial lecture "Semantic Technologies Based on Ontology Engineering" ☆12 · Jun 7, 2019 · Updated 6 years ago
- A self-contained morphological analyzer (including dictionary data). ☆33 · Jul 30, 2015 · Updated 10 years ago
- Code demos for _Intro to Shiny_, a video course with O'Reilly Media. ☆11 · Sep 20, 2016 · Updated 9 years ago
- ☆13 · Aug 21, 2021 · Updated 4 years ago
- Package provides java implementation of the latent dirichlet allocation (LDA) for topic modelling ☆10 · May 18, 2017 · Updated 8 years ago
- Relational database data generator. ☆40 · Dec 23, 2022 · Updated 3 years ago
- ☆13 · Aug 11, 2025 · Updated 6 months ago
- A TensorFlow 2.0 .whl file compiled with an old processor/computer ☆11 · Dec 12, 2020 · Updated 5 years ago
- phData Pulse application log aggregation and monitoring ☆13 · Apr 13, 2020 · Updated 5 years ago
- Exercises for Functional Programming in Scala ☆41 · Oct 18, 2020 · Updated 5 years ago
- Build, configure, and track workflows with Jarvis. ☆14 · Apr 17, 2018 · Updated 7 years ago
- Spark Custom Stream Source and Sink ☆12 · Jan 19, 2019 · Updated 7 years ago
- Java code for Apache Nifi processors ☆11 · Jun 5, 2017 · Updated 8 years ago