A software library of stochastic streaming algorithms, a.k.a. sketches.
☆105 Jan 20, 2026 Updated last month
Alternatives and similar repositories for datasketches
Users who are interested in datasketches are comparing it to the libraries listed below.
- Core C++ Sketch Library ☆252 Feb 15, 2026 Updated last week
- A software library of stochastic streaming algorithms, a.k.a. sketches. ☆947 Feb 18, 2026 Updated last week
- Apache Pinot Golang Client managed by StarTree ☆33 Feb 21, 2026 Updated last week
- Website for DataSketches. ☆108 Feb 20, 2026 Updated last week
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 Jan 21, 2020 Updated 6 years ago
- Management and automation platform for Stateful Distributed Systems ☆111 Updated this week
- Tool to identify domains containing Pinyin language ☆12 Oct 18, 2014 Updated 11 years ago
- A testing framework for Trino ☆27 Mar 19, 2025 Updated 11 months ago
- A Persistent Key-Value Store designed for Streaming processing ☆120 Jan 13, 2026 Updated last month
- High performance native memory access for Java. ☆128 Feb 20, 2026 Updated last week
- Yuvi is an in-memory storage engine for recent time series metrics data. ☆48 Dec 12, 2017 Updated 8 years ago
- A time series database prototype with multiple backends ☆23 Feb 13, 2020 Updated 6 years ago
- Automatically exported from code.google.com/p/selenium-profiler ☆10 Mar 16, 2015 Updated 10 years ago
- HDFS based on Java implementation as a remote ObjectStore for DataFusion ☆10 Feb 13, 2024 Updated 2 years ago
- A collection of utilities for working with Druid queries ☆23 Nov 27, 2025 Updated 3 months ago
- VectorDB is a free analytics DBMS for IoT & Big Data, compatible with ClickHouse. ☆68 Oct 16, 2021 Updated 4 years ago
- ☆11 Oct 5, 2022 Updated 3 years ago
- Implements several concurrent data structures as well as some other thread-safe constructs. Supported targets include Neko and CPP so far… ☆31 Apr 2, 2015 Updated 10 years ago
- ☆10 Jul 9, 2023 Updated 2 years ago
- Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators th… ☆30 Feb 21, 2022 Updated 4 years ago
- RemoteStorageManager for Apache Kafka® Tiered Storage ☆223 Updated this week
- This project enables you to use spring inside of a spark application. ☆11 May 6, 2015 Updated 10 years ago
- Uniffle is a high performance, general purpose Remote Shuffle Service. ☆446 Feb 12, 2026 Updated 2 weeks ago
- ☆16 Mar 3, 2021 Updated 4 years ago
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap… ☆303 Oct 30, 2025 Updated 3 months ago
- ☆34 Oct 29, 2025 Updated 3 months ago
- A cross platform way to express data transformation, relational algebra, standardized record expression and plans. ☆1,476 Updated this week
- Apache datasketches ☆40 Jan 20, 2026 Updated last month
- ☆15 Feb 16, 2026 Updated last week
- GKE cluster using Litmus Chaos Engine to validate Zebrium's unsupervised Machine Learning incident detection platform ☆18 Jun 2, 2023 Updated 2 years ago
- This project contains a couple of tools to analyze data around the Apache Flink community. ☆18 May 22, 2024 Updated last year
- Code for blog posts on string search algorithms. ☆17 Mar 4, 2020 Updated 5 years ago
- BSON codecs for Java 8 Date and Time API (JSR-310). ☆20 Sep 2, 2019 Updated 6 years ago
- Apache DataFusion Comet Spark Accelerator ☆1,142 Feb 20, 2026 Updated last week
- A modular query optimizer for big data ☆19 Aug 30, 2023 Updated 2 years ago
- Vimscript library of common functions. ☆16 May 19, 2017 Updated 8 years ago
- Clickhouse sink for akka-streams ☆16 Jan 3, 2022 Updated 4 years ago
- Fork of code.google.com/p/go.net ☆19 Jan 14, 2026 Updated last month
- Duckdb extension to read pcap files ☆48 Sep 23, 2025 Updated 5 months ago