mrpowers-io/tsumugi-spark

SparkConnect Server plugin and protobuf messages for the Amazon Deequ Data Quality Engine.

☆26

Alternatives and similar repositories for tsumugi-spark

Users that are interested in tsumugi-spark are comparing it to the libraries listed below.

SemyonSinchenko / feature-generation-benchmark
A database-like benchmark of feature generation from time-series data
☆13 · Nov 27, 2024 · Updated last year
Yorko / huggingface_text2image_yorko
HuggingFace entry exercise by Yury Kashnitsky
☆14 · Aug 25, 2023 · Updated 2 years ago
ayasyrev / nbmetaclean
CLI app / pre-commit hook to clean Jupyter Notebooks metadata, execution_count and optionally output.
☆11 · Mar 3, 2025 · Updated last year
mrpowers-io / jodie
Delta lake and filesystem helper methods
☆51 · Feb 29, 2024 · Updated 2 years ago
assafmendelson / DataSourceV2
☆23 · Oct 8, 2018 · Updated 7 years ago
vetka925 / llms-lora-8bit-4bit-finetuning-lit
☆15 · Apr 11, 2023 · Updated 3 years ago
SemyonSinchenko / flake8-pyspark-with-column
A flake8 plugin that detects usage of withColumn in a loop or inside reduce
☆28 · Jun 20, 2025 · Updated last year
spetlr-org / spetlr
A python SPark ETL libRary (SPETLR) for Databricks. https://discord.gg/p9bzqGybVW
☆24 · Mar 3, 2026 · Updated 4 months ago
Keen-Technologies / physical-atari-rlc
Code for the RLC paper "Physical Atari: A Robust and Accessible Platform for Real-time Reinforcement Learning on Robots"
☆17 · Jun 14, 2026 · Updated last month
dialogue-evaluation / Russian-News-Clustering-and-Headline-Generation
☆18 · Jun 18, 2021 · Updated 5 years ago
harupy / dbvim
Enable Vim on Databricks
☆17 · Jan 7, 2023 · Updated 3 years ago
nsphung / pyspark-template
A Python PySpark Project with Poetry
☆31 · May 2, 2026 · Updated 2 months ago
MrPowers / mack
Delta Lake helper methods in PySpark
☆328 · Jan 19, 2026 · Updated 6 months ago
U-Company / python-private-service-layout
python template private service
☆17 · Oct 20, 2020 · Updated 5 years ago
slgero / receipt_parser
Allow parsing Russian receipts
☆54 · Aug 14, 2020 · Updated 5 years ago
AlexAndorra / football-modeling
Bayesian experiments for football insights
☆16 · Jun 11, 2026 · Updated last month
Thelin90 / datapains-trino-k8s
Trino On K8S Via Helm & Metastore Workshop Querying Delta Tables
☆12 · Jan 27, 2025 · Updated last year
yutannihilation / duckdb-ext-file-dialog
A DuckDB extension to choose file interactively using native file open dialogs
☆15 · Jun 22, 2026 · Updated 3 weeks ago
buriy / nlp_workshop
nlp workshop at datafest siberia 2019
☆22 · Dec 8, 2022 · Updated 3 years ago
andrewRowlinson / optasoccer
optasoccer is a Python library for reading opta soccer data
☆11 · Mar 14, 2024 · Updated 2 years ago
mshtelma / databricks-llm-fine-tuning
☆17 · Oct 12, 2023 · Updated 2 years ago
ketchbrookanalytics / shiny_arrow
Example code representing a real-life use case for using {arrow} to improve a Shiny application
☆17 · Jul 6, 2021 · Updated 5 years ago
kkudin / core-expansion
Implementation of core-expansion algorithm
☆12 · Jun 30, 2026 · Updated 3 weeks ago
atomix / atomix-jepsen
Atomix Jepsen tests
☆14 · Feb 7, 2017 · Updated 9 years ago
bigdatacup / Big-Data-Cup-2025
☆15 · Dec 17, 2024 · Updated last year
datamindedbe / eu-data-platform
Spin up a minimalistic Data Analytics Platform on a European cloud provider
☆19 · Apr 22, 2026 · Updated 2 months ago
jonathanbp / stardust
A simple Java thread dump visualisation and analysis tool.
☆12 · Jan 29, 2016 · Updated 10 years ago
s71m / opentelemetry-loguru-telegram
Custom otelcol-contrib with exporter to telegram. And handler for loguru
☆14 · Apr 8, 2025 · Updated last year
terror / edmv
Bulk rename files with your favourite editor
☆15 · Nov 12, 2025 · Updated 8 months ago
Data-drone / DAIS2022-Scaling-Deep-Learning-Talk
☆10 · Jul 1, 2022 · Updated 4 years ago
MatteoGuadrini / nosqlapi
nosqlapi is a library for building standard NOSQL python libraries.
☆12 · Apr 5, 2022 · Updated 4 years ago
strikingly / redash-kylin
Redash plugin for Apache Kylin integration
☆12 · Mar 21, 2018 · Updated 8 years ago
StabRise / ScaleDP
ScaleDP is an Open-Source extension of Apache Spark for Document Processing
☆18 · Dec 2, 2025 · Updated 7 months ago
nguyendinhson-kaist / MMSports23-Seg-AutoID
Our public repo ranked 1st 🏆🏆 at MMSports2023 challenge on segmentation task
☆16 · Oct 31, 2023 · Updated 2 years ago
Nike-Inc / spark-expectations
A Python Library to support running data quality rules while the spark job is running⚡
☆201 · Jul 14, 2026 · Updated last week
UnravelSports / common-data-format-validator
JSON Schema Validation for the Soccer Common Data Format
☆16 · Mar 19, 2026 · Updated 4 months ago
tresata / spark-columnar
☆15 · Mar 4, 2015 · Updated 11 years ago
jricheimer / keras-metric-learning
Metric Learning Library for Keras
☆10 · Apr 24, 2019 · Updated 7 years ago
bartosz25 / data-ai-summit-2024
Visits sessionization pipeline used for the talk
☆13 · May 28, 2024 · Updated 2 years ago