apache / datasketches-vector
Sketch Library for vector-based models
☆13 Updated last year
Alternatives and similar repositories for datasketches-vector:
Users who are interested in datasketches-vector are comparing it to the libraries listed below.
- A framework for scalable graph computing. ☆147 Updated 6 years ago
- Graphulo: Accumulo library of matrix math primitives and graph algorithms ☆78 Updated 9 months ago
- Automatic offload of user-written Spark kernels to accelerators ☆18 Updated 8 years ago
- Bloofi: A Java implementation of multidimensional Bloom filters ☆79 Updated 8 years ago
- This project describes the D4M 2.0 Schema used in many Accumulo systems. ☆21 Updated 4 years ago
- Dynamic Distributed Dimensional Data Model ☆41 Updated 9 months ago
- Mirror of Apache Clerezza ☆36 Updated 2 years ago
- ☆29 Updated 3 years ago
- Java Matrix Benchmark is a tool for evaluating Java linear algebra libraries for speed, stability, and memory usage. ☆59 Updated last year
- ByteBuffer collection classes for Java and JVM-based languages. ☆33 Updated 6 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- Mirror of Apache MRQL (Incubating) ☆17 Updated 7 years ago
- Kiwi is a minimalist and extendable Constraint Programming (CP) solver. ☆50 Updated 5 years ago
- DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees. ☆117 Updated 8 months ago
- Bucketing and partitioning system for Parquet ☆30 Updated 6 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆92 Updated 9 years ago
- Milan is a Scala API and runtime infrastructure for building data-oriented systems, built on top of Apache Flink. ☆40 Updated last year
- Routines and data structures for using isarn-sketches idiomatically in Apache Spark ☆29 Updated 8 months ago
- Moments Sketch Code ☆40 Updated 6 years ago
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka ☆29 Updated 8 years ago
- Library for organizing batch processing pipelines in Apache Spark ☆41 Updated 8 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆29 Updated 9 years ago
- ☆57 Updated 2 years ago
- SIMD Intrinsics in the JVM ☆48 Updated 7 years ago
- Spark GPU and SIMD Support ☆61 Updated 4 years ago
- The Musketeer workflow manager. ☆41 Updated 6 years ago
- Java numerics library for optimization, polynomial root finding, sorting, robust model fitting, and more. ☆51 Updated 2 months ago
- Fast, memory-efficient, minimal-serialization, binary data vectors for Scala and other languages ☆67 Updated 6 years ago
- Calliope is a library integrating Cassandra and Spark framework. ☆28 Updated 9 years ago
- Apache datasketches ☆94 Updated 2 years ago