Ackuq/spark-pit

Point-in-Time optimizations for Apache Spark

☆30

Alternatives and similar repositories for spark-pit

Users who are interested in spark-pit are comparing it to the libraries listed below.

logicalclocks / feature-store-api
Python - Java/Scala API for the Hopsworks feature store
☆55 • Sep 24, 2025 • Updated 10 months ago
SymbioticLab / Fluid
A Generic Resource-Aware Hyperparameter Tuning Execution Engine
☆15 • Jan 8, 2022 • Updated 4 years ago
crflynn / sqlalchemy-databricks
SQLAlchemy dialect for Databricks
☆20 • May 15, 2023 • Updated 3 years ago
logicalclocks / maggy
Distribution transparent Machine Learning experiments on Apache Spark
☆91 • Feb 21, 2024 • Updated 2 years ago
zero323 / pyspark-asyncactions
Asynchronous actions for PySpark
☆47 • Dec 2, 2021 • Updated 4 years ago
databrickslabs / tempo
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc), AS OF joins…
☆342 • Jul 10, 2026 • Updated 2 weeks ago
voltrondata / superset-sqlalchemy-adbc-flight-sql-poc
A proof-of-concept repo that attempts to use Apache Superset with a custom ADBC to Arrow Flight SQL SQLAlchemy driver.
☆25 • Sep 8, 2023 • Updated 2 years ago
Kayrnt / duckdb_mysql_scanner
DuckDB extension for MySQL
☆15 • Mar 17, 2024 • Updated 2 years ago
kompics / kompics
Kompics - A message-passing component model for building distributed systems
☆67 • Oct 4, 2022 • Updated 3 years ago
cosmicexplorer / upc
Ultra-high-performance local IPC framework with Zipkin tracing to conduct a beautiful symphony of (brotherhood) build tooling.
☆10 • Jan 8, 2021 • Updated 5 years ago
jwills / nba_monte_carlo
The Modern Data Stack in a (Smaller) Box
☆12 • Jan 28, 2023 • Updated 3 years ago
decodableco / dbt-decodable
A dbt adapter for Decodable
☆12 • Sep 4, 2025 • Updated 10 months ago
alibaba / feathub
FeatHub - A stream-batch unified feature store for real-time machine learning
☆349 • May 27, 2024 • Updated 2 years ago
radanalyticsio / silex
something to help you spark
☆65 • Oct 23, 2018 • Updated 7 years ago
microsoft / lightgbm-benchmark
Benchmark tools for LightGBM
☆15 • Jul 28, 2023 • Updated 3 years ago
dbt-labs / dbt_faker
☆20 • Dec 4, 2024 • Updated last year
Daniel-Liu-c0deb0t / simple-saca
Hardware go brrr bounded context suffix array construction algorithm
☆19 • Nov 1, 2023 • Updated 2 years ago
WinVector / WVLPSolver
Experimental pure Java revised simplex linear program solver (Apache 2.0 license)
☆15 • Jun 22, 2020 • Updated 6 years ago
banksean / sand
local development sandbox containers
☆19 • Updated this week
EKarton / Names-To-Nationality-Predicter
A web application that predicts the nationality of a person's name
☆10 • Dec 23, 2024 • Updated last year
redis-applied-ai / redis-feast-gcp
A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.
☆15 • May 9, 2023 • Updated 3 years ago
ds2-lab / LambdaFS
λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23)
☆14 • Apr 2, 2025 • Updated last year
Upsolver / iceberg-diag
☆30 • Dec 4, 2024 • Updated last year
trafiklab / trafiklab.se
Trafiklabs website
☆17 • Jun 24, 2026 • Updated last month
fredrikhgrelland / data-mesh
A cloud native data mesh implementation
☆12 • Jan 15, 2021 • Updated 5 years ago
banzuzi-carioni / cross-border-electricity-flow-prediction
Serverless ML system to predict the direction and volume of electricity flows to and from the Netherlands and its energy transmission par…
☆11 • Apr 7, 2025 • Updated last year
dcatkth / readinggroup
☆12 • Apr 10, 2020 • Updated 6 years ago
aboisvert / skiis
Skiis: Streaming + Parallel collection for Scala
☆17 • Mar 18, 2025 • Updated last year
fabianzeiher / transitmap
Transitmap is an interactive realtime visualisation of all public transport in Sweden.
☆11 • Jun 6, 2025 • Updated last year
kaaveland / pyarrowfs-adlgen2
Use pyarrow with Azure Data Lake gen2
☆29 • Jun 27, 2024 • Updated 2 years ago
aerugo / kolada-mcp
An MCP server for Kolada.
☆16 • Nov 30, 2025 • Updated 7 months ago
saeid93 / seldon-inference-pipelines
Examples of inference pipelines implemented using https://github.com/SeldonIO/seldon-core
☆14 • Feb 1, 2023 • Updated 3 years ago
keras-team / keras-autodoc
Documentation autogeneration utilities.
☆30 • Sep 13, 2022 • Updated 3 years ago
mag1cfrog / timeseries-table-format
Rust-native time-series table format with gap/overlap tracking and SQL queries
☆16 • Mar 18, 2026 • Updated 4 months ago
composable-logs / composable-logs
Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen…
☆18 • May 23, 2023 • Updated 3 years ago
netj / 3x
3X — a Workbench for eXecutable eXploratory eXperiments
☆21 • Dec 8, 2015 • Updated 10 years ago
fehmicansaglam / es-repl
Elasticsearch REPL built on top of Jest
☆23 • May 12, 2015 • Updated 11 years ago
aws-samples / apache-xtable-on-aws-samples
☆11 • Updated this week
twosigma / flint
A Time Series Library for Apache Spark
☆1,173 • Jul 3, 2020 • Updated 6 years ago