Salmon-Brain / dead-salmon-brain
Apache Spark-based framework for analyzing A/B experiments
☆15 Updated 9 months ago
Alternatives and similar repositories for dead-salmon-brain
Users who are interested in dead-salmon-brain are comparing it to the libraries listed below
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ☆33 Updated 3 years ago
- Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ☆35 Updated last year
- Data quality control tool built on spark and deequ ☆25 Updated 5 months ago
- Preliminary Solr DQ / Data Quality experiments and prototype, and SolrJ wrapper utilities ☆26 Updated 6 months ago
- This project is created to promote and advocate the use of FOSS machine learning. ☆46 Updated 3 months ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- A Model Context Protocol (MCP) server for discovering data products and requesting access in Data Mesh Manager, and executing queries on … ☆36 Updated last week
- This repository contains code to build an MVP search engine with a Google-like interface. ☆15 Updated 2 months ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 Updated 9 years ago
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 5 years ago
- spark-drools tutorials ☆16 Updated last year
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- Chatlytics is a data query and visualization platform for chat! ☆13 Updated 8 years ago
- Process Automation, Data Management, Message Learning, AI Ops, and Quantum Ops ☆11 Updated this week
- Getting Great Expectations set up to run on Databricks with Spark DataFrames. ☆13 Updated 3 years ago
- CLI for reporting events to Faros platform ☆14 Updated 3 months ago
- ☆22 Updated this week
- pysh-db - The Data Science Toolkit (DSK) ☆13 Updated 6 years ago
- Provide an easy way with Python to protect your data sources by searching its metadata. ☆17 Updated this week
- Marquez Web UI ☆21 Updated 4 years ago
- ☆10 Updated 3 years ago
- Open source task scheduler with dependency management ☆15 Updated 7 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 Updated 11 months ago
- A database schema conversion tool ☆28 Updated 5 years ago
- ☆14 Updated 2 years ago
- SQLAlchemy dialect for Dremio ☆16 Updated 2 years ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 Updated 2 years ago
- A systematic Benchmarking on the performance of Spark-SQL for processing Vast RDF datasets ☆14 Updated 3 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 8 months ago