apache/datasketches-java

apache / datasketches-java

A software library of stochastic streaming algorithms, a.k.a. sketches.

☆957

Alternatives and similar repositories for datasketches-java

Users who are interested in datasketches-java are comparing it to the libraries listed below.

apache / datasketches-memory
High performance native memory access for Java.
☆134 · Jul 13, 2026 · Updated last week
apache / datasketches-cpp
Core C++ Sketch Library
☆267 · Jul 11, 2026 · Updated last week
apache / datasketches-hive
Sketch adaptors for Hive.
☆51 · May 15, 2026 · Updated 2 months ago
apache / datasketches-website
Website for DataSketches.
☆109 · Jul 4, 2026 · Updated 2 weeks ago
addthis / stream-lib
Stream summarizer and cardinality estimator.
☆2,265 · Nov 28, 2019 · Updated 6 years ago
apache / datasketches
A software library of stochastic streaming algorithms, a.k.a. sketches.
☆116 · May 15, 2026 · Updated 2 months ago
apache / datasketches-pig
Sketch adaptors for Pig.
☆10 · May 15, 2026 · Updated 2 months ago
apache / datasketches-vector
Sketch Library for vector-based models
☆15 · May 15, 2026 · Updated 2 months ago
apache / pinot
Apache Pinot - A realtime distributed OLAP datastore
☆6,116 · Updated this week
RoaringBitmap / RoaringBitmap
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others
☆3,896 · Jul 10, 2026 · Updated last week
yahoo / Oak
A Scalable Concurrent Key-Value Map for Big Data Analytics
☆275 · Jan 18, 2024 · Updated 2 years ago
apache / incubator-heron
Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
☆3,629 · Mar 1, 2023 · Updated 3 years ago
apache / druid
Apache Druid: a high performance real-time analytics database.
☆14,033 · Updated this week
YahooArchive / anthelion
Anthelion is a plugin for Apache Nutch to crawl semantic annotations within HTML pages.
☆2,830 · Dec 17, 2015 · Updated 10 years ago
apache / calcite
Apache Calcite
☆5,157 · Updated this week
filodb / FiloDB
Distributed Prometheus time series database
☆1,468 · Updated this week
sameeragarwal / blinkdb
BlinkDB: Sub-Second Approximate Queries on Very Large Data.
☆660 · Feb 6, 2014 · Updated 12 years ago
apache / gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…
☆2,270 · Jun 24, 2026 · Updated 3 weeks ago
apache / carbondata
High performance data store solution
☆1,448 · Jul 4, 2026 · Updated 2 weeks ago
pravega / pravega
Pravega - Streaming as a new software defined storage primitive
☆1,998 · Mar 2, 2025 · Updated last year
apache / datasketches-postgresql
PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp
☆94 · May 15, 2026 · Updated 2 months ago
TIBCOSoftware / snappydata
Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in…
☆1,032 · Nov 21, 2022 · Updated 3 years ago
srikalyc / Sql4D
Sql interface to druid.
☆78 · Dec 14, 2015 · Updated 10 years ago
linkedin / PalDB
An embeddable write-once key-value store written in Java
☆937 · Dec 2, 2019 · Updated 6 years ago
hbutani / spark-druid-olap
Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform(http://bit…
☆281 · Aug 3, 2018 · Updated 7 years ago
linkedin / ambry
Distributed object store
☆1,787 · Updated this week
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
airlift / slice
Java library for efficiently working with flat heap memory
☆518 · Updated this week
atomix / atomix
A Kubernetes toolkit for building distributed applications using cloud native principles
☆2,363 · Jun 23, 2024 · Updated 2 years ago
apache / ratis
Open source Java implementation for Raft consensus protocol.
☆1,464 · Updated this week
vigna / fastutil
fastutil extends the Java™ Collections Framework by providing type-specific maps, sets, lists and queues.
☆2,207 · Dec 2, 2025 · Updated 7 months ago
fast-pack / JavaFastPFOR
A low-level integer compression library in Java
☆569 · Jun 24, 2026 · Updated 3 weeks ago
bullet-db / bullet-storm
The Apache Storm implementation of the Bullet backend
☆41 · Apr 17, 2023 · Updated 3 years ago
janino-compiler / janino
Janino is a super-small, super-fast Java™ compiler.
☆1,325 · Jul 4, 2026 · Updated 2 weeks ago
apache / flink-statefun
Apache Flink Stateful Functions
☆535 · May 15, 2026 · Updated 2 months ago
tdunning / t-digest
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
☆2,162 · Feb 17, 2025 · Updated last year
prestodb / presto
The official home of the Presto distributed SQL query engine for big data
☆16,719 · Updated this week
apache / hawq
Apache HAWQ
☆696 · May 16, 2024 · Updated 2 years ago
aggregateknowledge / java-hll
Java library for the HyperLogLog algorithm
☆318 · Feb 7, 2018 · Updated 8 years ago