Point-in-Time optimizations for Apache Spark
☆30Jan 18, 2024Updated 2 years ago
Alternatives and similar repositories for spark-pit
Users that are interested in spark-pit are comparing it to the libraries listed below.
- A Generic Resource-Aware Hyperparameter Tuning Execution Engine☆15Jan 8, 2022Updated 4 years ago
- Asynchronous actions for PySpark☆48Dec 2, 2021Updated 4 years ago
- A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver.☆25Sep 8, 2023Updated 2 years ago
- The Modern Data Stack in a (Smaller) Box☆12Jan 28, 2023Updated 3 years ago
- Ultra-high-performance local IPC framework with Zipkin tracing to conduct a beautiful symphony of (brotherhood) build tooling.☆10Jan 8, 2021Updated 5 years ago
- Kompics - A message-passing component model for building distributed systems☆67Oct 4, 2022Updated 3 years ago
- A dbt adapter for Decodable☆12Sep 4, 2025Updated 8 months ago
- something to help you spark☆65Oct 23, 2018Updated 7 years ago
- Reproducing Distributed Systems and Experiments on Cloud☆40Sep 11, 2023Updated 2 years ago
- C++ coroutine protocol library.☆12Sep 2, 2025Updated 8 months ago
- Shapley value calculation in Java☆13Dec 14, 2021Updated 4 years ago
- A repository for all code generated at our Datadive events☆36May 12, 2012Updated 13 years ago
- A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.☆15May 9, 2023Updated 3 years ago
- λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)☆14Apr 2, 2025Updated last year
- FAEST reference implementation☆19Apr 20, 2026Updated 2 weeks ago
- ☆19Updated this week
- ☆30Dec 4, 2024Updated last year
- Documentation for Hopsworks and Hops☆10Jan 30, 2022Updated 4 years ago
- Python SDK to interact with the Hopsworks API☆14Updated this week
- A Singer.io target for DuckDB☆19Feb 11, 2026Updated 2 months ago
- [SIGMOD'25] We show the data chunk compaction problem in vectorized execution, and propose practical compaction solutions.☆14Oct 10, 2025Updated 6 months ago
- A cloud native data mesh implementation☆12Jan 15, 2021Updated 5 years ago
- This repo provides C++ implementation of FHE-based unbalanced private set union (PSU).☆13Jun 21, 2024Updated last year
- A web application that predicts the nationality of a person's name☆10Dec 23, 2024Updated last year
- Trafiklabs website☆18Apr 20, 2026Updated 2 weeks ago
- A kubernetes operator for managing nvidia MIG instances.☆16Aug 26, 2020Updated 5 years ago
- Mahout vector encoding for pig☆53Nov 20, 2022Updated 3 years ago
- Examples of inference pipelines implemented using https://github.com/SeldonIO/seldon-core☆14Feb 1, 2023Updated 3 years ago
- ☆21Jan 25, 2023Updated 3 years ago
- CUDA kernel and JNI code which is called by Apache Spark's MLlib.☆19Jun 18, 2016Updated 9 years ago
- Use pyarrow with Azure Data Lake gen2☆28Jun 27, 2024Updated last year
- Lahinch surf predictions with Hopsworks☆15May 21, 2025Updated 11 months ago
- An analysis of adverse drug event data using Hadoop, R, and Gephi☆44Jan 28, 2016Updated 10 years ago
- ☆20Feb 2, 2024Updated 2 years ago
- a curated list of awesome lakehouse frameworks, applications, etc☆45Mar 9, 2026Updated 2 months ago
- Distributed solver library for large-scale structured output prediction, based on Spark. Project website:☆17Mar 3, 2016Updated 10 years ago
- Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen…☆18May 23, 2023Updated 2 years ago
- A Time Series Library for Apache Spark☆1,023Jul 3, 2020Updated 5 years ago
- ☆109Jul 5, 2023Updated 2 years ago