Data quality control tool built on spark and deequ
☆25May 9, 2026Updated last month
Alternatives and similar repositories for data-flare
Users that are interested in data-flare are comparing it to the libraries listed below.
- Some Avro operations in Scala☆10Jun 21, 2026Updated last week
- A collection of “cookbook-style” scripts for simplifying data engineering and machine learning in Apache Spark.☆13Oct 27, 2021Updated 4 years ago
- Deriving Spark DataFrame schemas from case classes☆44Jun 24, 2024Updated 2 years ago
- Azure AI Camp - 2 day workshop on Databricks and Azure ML☆20Jul 23, 2023Updated 2 years ago
- Cloud based Data Platform based on Apache Spark☆28May 21, 2026Updated last month
- Flink jobs collection☆17Oct 13, 2020Updated 5 years ago
- Some random how-to examples relating to Databricks.☆15Nov 3, 2021Updated 4 years ago
- ☆14Feb 10, 2026Updated 4 months ago
- OpenTelemetry agent for Scala applications☆72Updated this week
- An example of building kubernetes operator (Flink) using Abstract operator's framework☆26Jul 12, 2019Updated 6 years ago
- Drop in replacement for golang/crypto/ed25519 with additional functionality☆15Feb 28, 2023Updated 3 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆63Sep 6, 2024Updated last year
- Command line tool for converting images to ASCII art☆20Jun 4, 2026Updated 3 weeks ago
- Library to create portal like UIs☆13Mar 9, 2016Updated 10 years ago
- OptaPlanner part of the Red Hat Summit 2019 Keynote Demo☆13Aug 1, 2024Updated last year
- Bulletproof Apache Spark jobs with fast root cause analysis of failures.☆73Mar 14, 2021Updated 5 years ago
- For popular software systems, the number of daily submitted bug reports is high. Triaging these incoming bugs is a time consuming task. M…☆11Jan 8, 2016Updated 10 years ago
- DataQuality for BigData☆149Dec 15, 2023Updated 2 years ago
- Data Science Research Architecture, Data Center OS☆21May 12, 2016Updated 10 years ago
- My Study guide used to pass the CRT020 Spark Certification exam☆34Jan 6, 2020Updated 6 years ago
- A testlab built with Nomad and Consul to analyze the behavior of p2p networks at scale☆22Jul 26, 2019Updated 6 years ago
- The Stormlicht browser engine.☆15Aug 13, 2024Updated last year
- Push "button deploy" literally☆18Feb 15, 2016Updated 10 years ago
- pdfChain: (experimental) blockchain for the masses☆16Feb 14, 2026Updated 4 months ago
- A bridge for capturing JMX data with JDK Flight Recorder☆18Aug 19, 2020Updated 5 years ago
- Prototype for a Coupon-Engine driven by Easy Rules and Spring Expression Language☆15Oct 3, 2024Updated last year
- Website for the vetiver 🏺 framework☆12May 28, 2025Updated last year
- I'll munch some data here☆12Jun 18, 2021Updated 5 years ago
- Apache Amaterasu☆56Oct 18, 2019Updated 6 years ago
- Sample applications using Dozer☆16Feb 3, 2024Updated 2 years ago
- On-demand port forwarding to k8s.☆26Apr 10, 2026Updated 2 months ago
- Stacktrace Clustering Library☆15May 20, 2012Updated 14 years ago
- This is a simple Python daemon to monitor your Impala nodes.☆10Apr 13, 2021Updated 5 years ago
- Extensible streaming ingestion pipeline on top of Apache Spark☆47Jul 17, 2025Updated 11 months ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer☆25Dec 31, 2024Updated last year
- A Scala library for locality sensitive hashing☆14Aug 1, 2018Updated 7 years ago
- Effective Kafka☆59Apr 15, 2022Updated 4 years ago
- Repository that showcases problems with Kafka rebalancing and explains how to fix them. Please visit our blog article to learn what Kafka…☆12Aug 21, 2020Updated 5 years ago
- ☆39Jun 17, 2026Updated last week