Salmon-Brain / dead-salmon-brain
Apache Spark-based framework for analyzing A/B experiments
☆15 · Updated last year
Alternatives and similar repositories for dead-salmon-brain
Users who are interested in dead-salmon-brain are comparing it to the libraries listed below.
- Data quality control tool built on spark and deequ ☆25 · Updated this week
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 · Updated 3 years ago
- spark-drools tutorials ☆16 · Updated last year
- 💻 CLI for reporting events to Faros platform ☆14 · Updated 3 months ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆62 · Updated last year
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆37 · Updated 3 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆42 · Updated 2 years ago
- Friendly ML feature store ☆45 · Updated 3 years ago
- Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline ☆76 · Updated 2 years ago
- ☆30 · Updated 2 years ago
- OpenDMP - An Open-Source Data Management Platform ☆33 · Updated 2 years ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 · Updated 10 years ago
- Open source task scheduler with dependency management ☆15 · Updated 7 years ago
- ☆38 · Updated last year
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 · Updated 11 months ago
- Data Catalog for Databases and Data Warehouses ☆35 · Updated last year
- This project is created to promote and advocate the use of FOSS machine learning. ☆47 · Updated 6 months ago
- Chatlytics is a data query and visualization platform for chat! ☆13 · Updated 8 years ago
- A single source of truth for data definitions ☆11 · Updated 2 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 · Updated 5 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆52 · Updated 5 months ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆19 · Updated 6 years ago
- pysh-db - The Data Science Toolkit (DSK) ☆13 · Updated 6 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆39 · Updated 4 years ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 · Updated 3 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆29 · Updated 5 years ago
- Code that was used as an example during the Data+AI Summit 2020 ☆15 · Updated 4 years ago
- Data abstraction, storage, discovery, and serving system ☆33 · Updated last month
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 · Updated 2 years ago
- A systematic Benchmarking on the performance of Spark-SQL for processing Vast RDF datasets ☆14 · Updated 3 years ago