Spark fires is an anti-pattern playground where we deliberately break Spark applications in various ways, so you can observe what happens and recognise the same issues when you come across them in your day-to-day development and support work.
☆42 · Nov 18, 2024 · Updated last year
Alternatives and similar repositories for spark-fires
Users that are interested in spark-fires are comparing it to the libraries listed below.
- Release notes for Apache Spark based Runtime for Azure Synapse Analytics and Microsoft Fabric ☆36 · Updated this week
- Tools for Microsoft Fabric ☆25 · Jul 17, 2025 · Updated 10 months ago
- How to run DBT on AWS Fargate ☆13 · Oct 15, 2019 · Updated 6 years ago
- Genie Framework improves Spark Pool utilization by executing multiple Synapse notebooks on the same spark pool instance ☆28 · Dec 19, 2023 · Updated 2 years ago
- Delta Lake helper methods in PySpark ☆329 · Jan 19, 2026 · Updated 4 months ago
- Fast data quality framework for modern data infrastructure ☆29 · Apr 2, 2026 · Updated last month
- Collect and aggregate on spark events for profitz ☆10 · Apr 22, 2022 · Updated 4 years ago
- Type-annotate your spark dataframes and validate them ☆14 · Feb 5, 2026 · Updated 3 months ago
- Python Package for ducklake ☆20 · Jun 5, 2025 · Updated 11 months ago
- command launcher organised in a tree structure with autocompletion ☆13 · May 4, 2022 · Updated 4 years ago
- Data Engineering framework written in Python, based on Polars. ☆14 · May 1, 2024 · Updated 2 years ago
- Volunteer guide, and other materials for DATA RESCUE PDX ☆29 · Mar 4, 2017 · Updated 9 years ago
- Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP" ☆12 · May 9, 2023 · Updated 3 years ago
- Delta Lake examples ☆239 · Oct 8, 2024 · Updated last year
- A project to design a fact and dimension star schema for optimizing queries on a flight booking database using PostgreSQL, a relational d… ☆12 · Aug 15, 2021 · Updated 4 years ago
- Keyboard-first dotfiles for terminal-centric development with tmux, Neovim, and coding agents. ☆25 · May 11, 2026 · Updated last week
- High performance async Mssql library for Python. ☆22 · Updated this week
- My personal dotfiles with automated macOS setup. Features smart installation scripts, Bats testing (bash), performance monitoring, and 2… ☆11 · Apr 24, 2026 · Updated 3 weeks ago
- Advanced Ocean Simulation for Unreal Engine 5 using the Niagara system and C++. Designed to enhance FPS with high-performance mesh and wi… ☆11 · Aug 19, 2024 · Updated last year
- A low-overhead sampling profiler for PySpark, that outputs Flame Graphs ☆16 · Dec 17, 2020 · Updated 5 years ago
- ACID and BASE transactions explained ☆15 · May 18, 2025 · Updated last year
- The SEAL-CPU backend is a Reference backend engine for HEBench which is a shared library that implements the required functions specified… ☆11 · Mar 3, 2023 · Updated 3 years ago
- A Particle System implemented in Android, handling collisions, optimized for performance ☆10 · Dec 18, 2023 · Updated 2 years ago
- An SBT Plugin that acts as a light wrapper around Buf. ☆10 · Oct 29, 2024 · Updated last year
- Power BI External Tool to run automated checks in a report ☆21 · May 23, 2023 · Updated 2 years ago
- Data Lineage for Spark components and PowerBI/AAS showing up in Azure Purview ☆20 · Jun 11, 2024 · Updated last year
- Discover Netflix's Open Connect Appliance (OCA) assigned to your connection. This tool fetches and displays detailed connectivity and hos… ☆19 · Jul 22, 2025 · Updated 9 months ago
- Optimizing loading training data from cloud bucket storage for cloud-based distributed deep learning. Official repository for Quantifying… ☆11 · Jan 1, 2022 · Updated 4 years ago
- Portfolio and blog for my online brand. Performance Optimized Single Page React App. ☆11 · Nov 7, 2016 · Updated 9 years ago
- A declarative PySpark framework for row- and aggregate-level data quality validation. ☆73 · Jan 1, 2026 · Updated 4 months ago
- Data quality tools for Big Data ☆19 · Oct 10, 2019 · Updated 6 years ago
- Turning PySpark Into a Universal DataFrame API ☆506 · May 12, 2026 · Updated last week
- A survey app written in Flask ☆13 · Apr 16, 2018 · Updated 8 years ago
- Portable Neovim configuration built with Nix. ☆18 · May 1, 2026 · Updated 2 weeks ago
- Private-AI is an innovative AI project designed for asking questions about your documents using powerful Large Language Models (LLMs). Th… ☆24 · Feb 26, 2024 · Updated 2 years ago
- Yet Another (Spark) ETL Framework ☆21 · Oct 21, 2023 · Updated 2 years ago