opensearch-project / opensearch-spark
Spark Accelerator framework; it enables secondary indices on remote data stores.
☆38Updated this week
Alternatives and similar repositories for opensearch-spark
Users who are interested in opensearch-spark are comparing it to the libraries listed below.
- Query your data using familiar SQL or intuitive Piped Processing Language (PPL)☆155Updated this week
- OpenSearch Benchmark - a community-driven, open source project to run performance tests for OpenSearch☆132Updated last week
- ☆40Updated last week
- Apache Flink☆23Updated 4 months ago
- Multi-hop declarative data pipelines☆122Updated 2 weeks ago
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark…☆142Updated 3 months ago
- Spline agent for Apache Spark☆200Updated last week
- ☆81Updated 7 months ago
- ml-commons provides a set of common machine learning algorithms, e.g. k-means, or linear regression, to help developers build ML related …☆133Updated this week
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆136Updated 2 years ago
- Apache Flink☆74Updated 4 months ago
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!☆235Updated 10 months ago
- Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time.☆102Updated this week
- Tutorial on how to set up Trino and Apache Ranger using Docker☆41Updated last year
- Apache Flink Stateful Functions Playground☆133Updated 2 years ago
- Apache Iceberg Documentation Site☆42Updated last year
- Analytics Accelerator Library for Amazon S3 is an open source library that accelerates data access from client applications to Amazon S3.☆64Updated 3 weeks ago
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆290Updated this week
- Identify atypical data and receive automatic notifications☆84Updated this week
- Apache DataSketches☆101Updated 2 years ago
- ☆237Updated last week
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type…☆63Updated 2 weeks ago
- ☆32Updated last week
- DynoYARN is a framework to run simulated YARN clusters and workloads for YARN scale testing.☆61Updated 2 years ago
- A testing framework for Trino☆26Updated 8 months ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful …☆145Updated last year
- 🆕 Find the k-nearest neighbors (k-NN) for your vector data☆203Updated this week
- LST-Bench is a framework that allows users to run benchmarks specifically designed for evaluating Log-Structured Tables (LSTs) such as De…☆87Updated last month
- User tools for Spark RAPIDS☆65Updated this week
- Smart Automation Tool for building modern Data Lakes and Data Pipelines☆123Updated this week