opensearch-project / opensearch-spark
Spark Accelerator framework; it enables secondary indices on remote data stores.
☆37Updated 3 weeks ago
Alternatives and similar repositories for opensearch-spark
Users that are interested in opensearch-spark are comparing it to the libraries listed below
- Query your data using familiar SQL or intuitive Piped Processing Language (PPL)☆150Updated this week
- OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch☆130Updated last week
- Spark* shuffle plugin to support shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local-dis…☆21Updated last year
- Apache datasketches☆99Updated 2 years ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆135Updated last year
- ☆38Updated last week
- Identify atypical data and receive automatic notifications☆80Updated this week
- Search Request Processor: pipeline for transformation of queries and results inline with a search request.☆26Updated 8 months ago
- Spline agent for Apache Spark☆197Updated last month
- Multi-hop declarative data pipelines☆120Updated 2 weeks ago
- ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related …☆129Updated this week
- Apache Wayang (incubating) is the first cross-platform data processing system.☆234Updated this week
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark…☆139Updated last month
- Idempotent query executor☆53Updated 5 months ago
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to …☆242Updated last week
- ☆92Updated this week
- ☆70Updated 9 months ago
- Port of TPC-DS dsdgen to Java☆52Updated last year
- ☆30Updated 4 months ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer☆26Updated 9 months ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis.☆106Updated 5 months ago
- Website for DataSketches.☆104Updated 2 weeks ago
- ☆80Updated 5 months ago
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…☆302Updated this week
- Java bindings for https://github.com/facebookincubator/velox☆33Updated last week
- 🆕 Find the k-nearest neighbors (k-NN) for your vector data☆200Updated this week
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆100Updated 2 years ago
- ☆228Updated last week
- ☆25Updated last year
- Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time.☆98Updated this week