Salmon-Brain / dead-salmon-brain
Apache Spark-based framework for analyzing A/B experiments
★15 · Updated 11 months ago
Alternatives and similar repositories for dead-salmon-brain
Users who are interested in dead-salmon-brain are comparing it to the libraries listed below.
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ★34 · Updated 3 years ago
- CLI for reporting events to Faros platform ★14 · Updated last month
- Data quality control tool built on Spark and Deequ ★25 · Updated 6 months ago
- TensorFlow Processor for Spring Cloud Dataflow ★24 · Updated 8 years ago
- Open source task scheduler with dependency management ★15 · Updated 7 years ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ★57 · Updated 3 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ★62 · Updated last year
- Chatlytics is a data query and visualization platform for chat! ★13 · Updated 8 years ago
- Preliminary Solr DQ / Data Quality experiments and prototype, and SolrJ wrapper utilities ★26 · Updated 8 months ago
- spark-drools tutorials ★16 · Updated last year
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ★16 · Updated last year
- Friendly ML feature store ★45 · Updated 3 years ago
- A Model Context Protocol (MCP) server for discovering data products and requesting access in Data Mesh Manager, and executing queries on … ★38 · Updated last month
- ★20 · Updated 2 years ago
- Standalone alternatives to Kafka Connect Connectors ★45 · Updated last week
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ★41 · Updated 2 years ago
- ★10 · Updated 3 years ago
- ★38 · Updated last year
- Examples for my book "Power Java" ★21 · Updated 2 years ago
- A single source of truth for data definitions ★11 · Updated 2 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ★29 · Updated 9 months ago
- Data Catalog for Databases and Data Warehouses ★36 · Updated last year
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ★52 · Updated 3 months ago
- pysh-db - The Data Science Toolkit (DSK) ★13 · Updated 6 years ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ★43 · Updated last week
- Package to extend Airflow functionality with CWL v1.0 support ★12 · Updated 6 years ago
- Scala API for Apache Spark SQL high-order functions ★14 · Updated 2 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ★40 · Updated 4 years ago
- ★30 · Updated 2 years ago
- Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API. ★112 · Updated 5 years ago