opensearch-project / opensearch-spark
Spark Accelerator framework; it enables secondary indices to remote data stores.
☆37 Updated last month
Alternatives and similar repositories for opensearch-spark
Users who are interested in opensearch-spark are comparing it to the libraries listed below.
- Query your data using familiar SQL or intuitive Piped Processing Language (PPL) ☆150 Updated this week
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆139 Updated 2 months ago
- OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch ☆131 Updated last week
- Spline agent for Apache Spark ☆199 Updated this week
- ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related … ☆130 Updated this week
- ☆38 Updated last week
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆288 Updated this week
- Search Request Processor: pipeline for transformation of queries and results inline with a search request. ☆26 Updated last week
- Apache datasketches ☆101 Updated 2 years ago
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type… ☆60 Updated this week
- Apache flink ☆22 Updated 3 months ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 Updated 2 years ago
- This project provides fully automated one-click experience to create Cloud and Kubernetes environment to run Data Analytics workload like… ☆55 Updated 2 years ago
- ☆25 Updated last year
- Multi-hop declarative data pipelines ☆122 Updated this week
- ☆31 Updated this week
- Apache Iceberg Documentation Site ☆42 Updated last year
- Idempotent query executor ☆53 Updated 6 months ago
- ☆80 Updated 6 months ago
- Apache flink ☆73 Updated 3 months ago
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆225 Updated 7 months ago
- Analytics Accelerator Library for Amazon S3 is an open source library that accelerates data access from client applications to Amazon S3. ☆56 Updated this week
- Apache Wayang (incubating) is the first cross-platform data processing system. ☆234 Updated last week
- A library that brings useful functions from various modern database management systems to Apache Spark ☆60 Updated 2 years ago
- Trino connectors for accessing APIs with an OpenAPI spec ☆38 Updated 3 weeks ago
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆233 Updated 9 months ago
- 🆕 Find the k-nearest neighbors (k-NN) for your vector data ☆201 Updated this week
- Storage connector for Trino ☆116 Updated last week
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De… ☆84 Updated 2 weeks ago
- Mirror of Apache Ranger ☆15 Updated last year