opensearch-project / opensearch-spark
Spark Accelerator framework; it enables secondary indices to remote data stores.
☆39 Updated 2 weeks ago
Alternatives and similar repositories for opensearch-spark
Users who are interested in opensearch-spark are comparing it to the libraries listed below.
- Query your data using familiar SQL or intuitive Piped Processing Language (PPL) ☆157 Updated this week
- ☆42 Updated this week
- OpenSearch Benchmark - a community driven, open source project to run performance tests for OpenSearch ☆134 Updated last month
- Search Request Processor: pipeline for transformation of queries and results inline with a search request. ☆26 Updated 2 months ago
- Analytics Accelerator Library for Amazon S3 is an open source library that accelerates data access from client applications to Amazon S3. ☆65 Updated last month
- ☆25 Updated last year
- Amundsen Gremlin ☆21 Updated 3 years ago
- Apache datasketches ☆102 Updated 2 weeks ago
- Spline agent for Apache Spark ☆200 Updated 2 weeks ago
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆144 Updated 4 months ago
- Multi-hop declarative data pipelines ☆122 Updated last week
- Best practices and recommendations for getting started with Amazon EMR on EKS. ☆67 Updated 6 months ago
- Unity Catalog UI ☆43 Updated last year
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆292 Updated this week
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- Point-in-Time optimizations for Apache Spark ☆30 Updated last year
- Enables synchronizing metadata changes (Create/Drop table/partition) from Hive Metastore to AWS Glue Data Catalog ☆35 Updated 2 years ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆136 Updated 2 years ago
- Open Control Plane for Tables in Data Lakehouse ☆376 Updated this week
- The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog a… ☆226 Updated 9 months ago
- a curated list of awesome lakehouse frameworks, applications, etc ☆37 Updated last month
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you… ☆261 Updated this week
- ☆81 Updated 8 months ago
- ☆40 Updated 2 years ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆123 Updated 3 weeks ago
- Official workloads used by OpenSearch Benchmark (OSB) ☆28 Updated last week
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆235 Updated 11 months ago
- ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related … ☆136 Updated this week
- Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. ☆104 Updated this week
- Website for DataSketches. ☆107 Updated this week