SemyonSinchenko / flake8-pyspark-with-column
A flake8 plugin that detects usage of withColumn in a loop or inside reduce
☆28Jun 20, 2025Updated 7 months ago
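To illustrate the kind of pattern such a check looks for, here is a minimal sketch that uses Python's standard `ast` module to find `.withColumn(...)` calls nested inside `for` or `while` loops. This is not the plugin's actual implementation: the function name `find_with_column_in_loops` and the `SOURCE` snippet are hypothetical, and the sketch ignores the `reduce` case entirely.

```python
import ast

# Hypothetical snippet to analyze; it is only parsed, never executed,
# so the undefined names (df, names, lit) are harmless.
SOURCE = '''
for name in names:
    df = df.withColumn(name, lit(0))
'''


def find_with_column_in_loops(source: str) -> list[int]:
    """Return sorted line numbers of `.withColumn(...)` calls inside loops."""
    tree = ast.parse(source)
    hits = set()
    for loop in ast.walk(tree):
        if not isinstance(loop, (ast.For, ast.While)):
            continue
        # Walk everything under the loop node, including nested blocks.
        for node in ast.walk(loop):
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "withColumn"
            ):
                hits.add(node.lineno)
    return sorted(hits)


print(find_with_column_in_loops(SOURCE))
```

The pattern is usually flagged because each `withColumn` call adds another projection to the query plan; building all the columns in a single `select` or `withColumns` call is the usual alternative.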
Alternatives and similar repositories for flake8-pyspark-with-column
Users that are interested in flake8-pyspark-with-column are comparing it to the libraries listed below
- SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine.☆26Feb 22, 2025Updated 11 months ago
- ScaleDP is an Open-Source extension of Apache Spark for Document Processing☆17Dec 2, 2025Updated 2 months ago
- ☆16Oct 17, 2024Updated last year
- PySpark schema generator☆44Feb 23, 2023Updated 2 years ago
- Tools for Microsoft Fabric☆24Jul 17, 2025Updated 6 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow.☆27Mar 25, 2024Updated last year
- PDF DataSource for Apache Spark; allows reading PDF files directly into a DataFrame and running OCR on them☆78Apr 27, 2025Updated 9 months ago
- Delta Lake helper methods in PySpark☆327Jan 19, 2026Updated 3 weeks ago
- Implementation of core-expansion algorithm☆11Jan 26, 2026Updated 3 weeks ago
- Parent repository for the MOJ Analytics Platform☆14Nov 16, 2021Updated 4 years ago
- See Apache Kylin Website for a complete description☆30May 28, 2018Updated 7 years ago
- An SBT Plugin that acts as a light wrapper around Buf.☆10Oct 29, 2024Updated last year
- Reproducible Analytical Pipeline of the Hospital Standardised Mortality Ratio (HSMR) quarterly publication☆11Jun 21, 2024Updated last year
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques.☆15May 22, 2024Updated last year
- Examples of Selenium in Python☆11Jun 11, 2018Updated 7 years ago
- A Python wrapper for the Iterable API☆12Jan 7, 2026Updated last month
- ☆11Jan 28, 2019Updated 7 years ago
- CLI app / pre-commit hook to clean Jupyter Notebooks metadata, execution_count and optionally output.☆11Mar 3, 2025Updated 11 months ago
- Single node Cloudera environment in docker☆10Jan 16, 2016Updated 10 years ago
- Code to implement the network histogram (Olhede and Wolfe, arXiv:1312.5306)☆11Sep 23, 2014Updated 11 years ago
- TBD☆10Oct 30, 2015Updated 10 years ago
- ☆12Feb 21, 2022Updated 3 years ago
- Run greatexpectations.io on ANY SQL Engine using REST API. Supported by FastAPI, Pydantic and SQLAlchemy as best data quality tool☆14Dec 12, 2025Updated 2 months ago
- Helpful user defined functions / table generating functions for Hive☆101May 2, 2016Updated 9 years ago
- Advanced parsing of structured data using Python's new match statement☆13Jan 15, 2025Updated last year
- Plugin to support XQuery + MarkLogic debugging in Intellij Idea☆12May 4, 2022Updated 3 years ago
- ☆10Sep 14, 2018Updated 7 years ago
- ☆11Mar 1, 2024Updated last year
- Records the execution of .NET programs, to create scenarios in AppMap files☆12Jul 25, 2024Updated last year
- Simple examples showing how to use ADBC with various databases, query engines, and data platforms☆37Updated this week
- TPC-H Benchmark on Cloudera Impala☆19Apr 25, 2013Updated 12 years ago
- ☆10Jul 21, 2022Updated 3 years ago
- Basic Statistics (基礎統計學), powered by Jupyter Book☆11Aug 15, 2021Updated 4 years ago
- Embedded Linux☆11Jul 11, 2024Updated last year
- Convert MUSE from TensorFlow to PyTorch and ONNX☆11May 22, 2024Updated last year
- freakin' simple yo api wrapper for nodejs☆17Dec 18, 2014Updated 11 years ago
- csv and flat-file sniffer built in Rust.☆45Jan 26, 2024Updated 2 years ago
- ✨ A Pydantic to PySpark schema library☆119Updated this week
- IBM Spectrum LSF - IBM Cloud☆16Sep 30, 2024Updated last year