A flake8 plugin that detects usage of withColumn in a loop or inside reduce
☆28Jun 20, 2025Updated 11 months ago
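The PySpark documentation notes that calling withColumn repeatedly, for instance in a loop, can generate large query plans, which is the pattern this plugin targets. The sketch below shows one way such a check could be written with Python's standard ast module. It is an illustration only, not the plugin's actual implementation: the function name find_with_column_in_loops and the sample source are hypothetical, and matching on the bare attribute name withColumn is a simplifying assumption.

```python
import ast

LOOP_NODES = (ast.For, ast.While)


def _is_reduce_call(node: ast.AST) -> bool:
    """True for calls spelled `reduce(...)` or `something.reduce(...)`."""
    if not isinstance(node, ast.Call):
        return False
    func = node.func
    return (isinstance(func, ast.Name) and func.id == "reduce") or (
        isinstance(func, ast.Attribute) and func.attr == "reduce"
    )


def _is_with_column_call(node: ast.AST) -> bool:
    """True for method calls whose attribute name is `withColumn`."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "withColumn"
    )


def find_with_column_in_loops(source: str) -> list[int]:
    """Return sorted line numbers of withColumn calls inside a loop or a reduce call."""
    tree = ast.parse(source)
    hits = set()
    for scope in ast.walk(tree):
        # Only loops and reduce(...) calls count as "repeated" scopes.
        if not (isinstance(scope, LOOP_NODES) or _is_reduce_call(scope)):
            continue
        for node in ast.walk(scope):
            if _is_with_column_call(node):
                hits.add(node.lineno)
    return sorted(hits)


# Hypothetical input: the source is only parsed, never executed,
# so pyspark does not need to be installed.
SAMPLE = '''\
for name in names:
    df = df.withColumn(name, F.lit(0))
df = reduce(lambda acc, c: acc.withColumn(c, F.lit(0)), names, df)
df = df.withColumn("once", F.lit(1))
'''

print(find_with_column_in_loops(SAMPLE))
```

The single withColumn call on the last sample line is not reported, because it sits outside any loop or reduce call. The PySpark documentation suggests select with multiple columns at once as the alternative to the flagged pattern.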
Alternatives and similar repositories for flake8-pyspark-with-column
Users that are interested in flake8-pyspark-with-column are comparing it to the libraries listed below.
- SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine.☆26Feb 22, 2025Updated last year
- PySpark schema generator☆44Feb 23, 2023Updated 3 years ago
- Incan: a modern, Pythonic language that compiles to Rust! Type-safe, async-friendly, with fixtures, testing, and web/inter-op built in.☆29Updated this week
- ScaleDP is an Open-Source extension of Apache Spark for Document Processing☆18Dec 2, 2025Updated 6 months ago
- Visits sessionization pipeline used for the talk☆13May 28, 2024Updated 2 years ago
- Delta Lake helper methods in PySpark☆329Jan 19, 2026Updated 4 months ago
- Write-Audit-Publish on the lakehouse in pure Python with bauplan and DBOS☆13Jan 8, 2025Updated last year
- Repository for the course "Modern Storages and Data Warehousing", Software Engineering program, HSE University (НИУ ВШЭ), 2024☆14Apr 13, 2025Updated last year
- csv and flat-file sniffer built in Rust.☆45Jan 26, 2024Updated 2 years ago
- In this repository, we show how to get started with data lineage on AWS using OpenLineage. This is an AWS Cloud Development Kit project (…☆13Jul 25, 2024Updated last year
- ☆19Jul 8, 2024Updated last year
- Find your pause - by Hanoa Studio☆91Apr 6, 2026Updated 2 months ago
- ☆11Sep 23, 2019Updated 6 years ago
- Python wrapper for lsm1 extension for sqlite4☆15Feb 27, 2025Updated last year
- Dynamic tiles from CMR queries☆25Jun 10, 2026Updated last week
- Demonstrating the capabilities of DuckDB as a transformation engine for data lakes☆35Oct 8, 2024Updated last year
- 🤖 An autonomous AI agent system that collaboratively designs, implements, and manages Apache Airflow DAGs through natural language inter…☆28Aug 6, 2025Updated 10 months ago
- Complete Guide To Mastering Databricks☆47Feb 28, 2026Updated 3 months ago
- A write-audit-publish implementation on a data lake without the JVM☆45Aug 12, 2024Updated last year
- An SBT Plugin that acts as a light wrapper around Buf.☆10Oct 29, 2024Updated last year
- A dbt package with a POC implementation of an interface to query activity streams that adhere to the Activity Schema 2.0 spec.☆17May 28, 2026Updated 3 weeks ago
- Reels is a library for analyzing sequences of events from transactional data to predict when related target events may occur in the futur…☆15Feb 17, 2026Updated 4 months ago
- GitHub Actions Pipeline with a FastAPI Application built, tested and deployed to DockerHub.☆18Sep 9, 2023Updated 2 years ago
- Geospatial python toolkit: common functions, easy CLI creation, dataframes streams☆19May 16, 2024Updated 2 years ago
- ☆10Aug 23, 2023Updated 2 years ago
- Flowchart for debugging Spark applications☆104Sep 25, 2024Updated last year
- ✨ A Pydantic to PySpark schema library☆127May 24, 2026Updated 3 weeks ago
- Simple demo using "behave" and "pyspark" libraries to test data transformations in a human-readable way☆10Apr 5, 2019Updated 7 years ago
- Lightweight REST API for DuckDB with HTTP/2 streaming support.☆59Jun 12, 2026Updated last week
- Oozie Samples☆51Jan 11, 2014Updated 12 years ago
- Example of project using Databricks Asset Bundle☆45Aug 6, 2024Updated last year
- Generate CSV timesheet from your git repositories☆19Mar 11, 2025Updated last year
- PySpark test helper methods with beautiful error messages☆769May 20, 2026Updated 3 weeks ago
- A DuckDB extension to choose file interactively using native file open dialogs☆15May 27, 2026Updated 3 weeks ago
- Browse GitHub repos without cloning☆62Updated this week
- Convert MUSE from TensorFlow to PyTorch and ONNX☆11May 22, 2024Updated 2 years ago
- Lazily initialized ASGI apps☆12Jan 21, 2025Updated last year
- Implementation of core-expansion algorithm☆11Jan 26, 2026Updated 4 months ago
- Deploying a simple FastAPI app to Fly.io >> https://fly-fastapi.fly.dev/docs <<☆14Oct 2, 2023Updated 2 years ago