Example repo for full end-to-end PySpark testing via docker-compose
☆31 · Feb 6, 2023 · Updated 3 years ago
Alternatives and similar repositories for pyspark-testing-env
Users that are interested in pyspark-testing-env are comparing it to the libraries listed below.
- Implementation of Boundary Attributions for Normal (Vector) Explanations ☆11 · Aug 13, 2021 · Updated 4 years ago
- In this article, you will learn how to set up a real-time data processing and analytics environment using Docker, MySQL, Redpanda, MinIO,… ☆11 · Jun 27, 2023 · Updated 2 years ago
- A production-ready PySpark project template with medallion architecture, Python packaging, unit tests, integration tests, CI/CD automatio… ☆72 · Updated this week
- Easily import a module and mock its dependencies in an isolated way. ☆13 · May 19, 2022 · Updated 4 years ago
- Example project for building scalable data pipelines with Kedro and Ibis. ☆14 · Dec 10, 2025 · Updated 5 months ago
- A pyproject.toml conversion tool for Poetry to uv migration ☆20 · Dec 28, 2024 · Updated last year
- A cloud data platform product to accelerate time to insights. Our open-source framework is designed for the real world. Stripping away th… ☆25 · May 29, 2026 · Updated last week
- A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation an… ☆24 · Nov 21, 2023 · Updated 2 years ago
- quadipy is a python package to help transform structured data into RDF graph format ☆19 · Apr 14, 2023 · Updated 3 years ago
- HIVE: Evaluating the Human Interpretability of Visual Explanations (ECCV 2022) ☆22 · Jan 19, 2023 · Updated 3 years ago
- ☆17 · May 26, 2025 · Updated last year
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆29 · Nov 24, 2022 · Updated 3 years ago
- Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and mo… ☆26 · May 27, 2021 · Updated 5 years ago
- Validates that all require statements in a project point to an existing path and are correctly cased. ☆20 · Mar 30, 2014 · Updated 12 years ago
- A fast data generator that produces CSV files from generated relational data ☆45 · Aug 15, 2025 · Updated 9 months ago
- phy-mer ☆11 · Oct 12, 2017 · Updated 8 years ago
- Code for the MSB publication: Exploring amino acid functions and positional subtypes in a deep mutational landscape ☆10 · Jun 11, 2022 · Updated 3 years ago
- A Terraform module to create and manage Identity and Access Management (IAM) Users on Amazon Web Services (AWS). https://aws.amazon.com/i… ☆20 · Apr 6, 2022 · Updated 4 years ago
- Base-pair resolution detection of transcription factor binding site by deep deconvolutional network ☆10 · Sep 12, 2017 · Updated 8 years ago
- ☆17 · May 22, 2024 · Updated 2 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆288 · Updated this week
- Making Databricks easy to use for R developers. ☆26 · Oct 6, 2022 · Updated 3 years ago
- Making Time Speak! 🎙️ ☆29 · May 30, 2026 · Updated last week
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 · Jan 2, 2022 · Updated 4 years ago
- Streamlit template for building SMART on FHIR apps in the Cerner ecosystem. ☆11 · Sep 22, 2023 · Updated 2 years ago
- Discussions on solving the 4Clojure Code challenges ☆19 · Apr 12, 2025 · Updated last year
- The missing workspace tool for clojure tools.deps projects ☆34 · May 25, 2026 · Updated 2 weeks ago
- Utility functions to support analytics over FHIR in BigQuery or Apache Spark ☆15 · Jan 8, 2024 · Updated 2 years ago
- Code repository for the paper "A Deep Adversarial Framework for Visually Explainable Periocular Recognition" - CVPR 2021 Biometrics Works… ☆16 · Feb 7, 2025 · Updated last year
- Prometheus Mailgun Exporter ☆14 · Updated this week
- For Udemy students: the official repository of Rock the JVM's Spark Streaming course ☆26 · Jan 5, 2023 · Updated 3 years ago
- Demo converting streamlit uber nyc rides to use duckdb ☆30 · Apr 9, 2023 · Updated 3 years ago
- ☆13 · Jun 18, 2018 · Updated 7 years ago
- ☆20 · Nov 17, 2024 · Updated last year
- The official repository for the Rock the JVM Spark Optimization 2 course ☆43 · May 31, 2026 · Updated last week
- Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database ☆14 · Oct 26, 2021 · Updated 4 years ago
- Security Manager for the Astronomer Airflow distribution ☆11 · Jun 25, 2024 · Updated last year
- Playing with different packages of the Apache Spark ☆29 · Feb 8, 2026 · Updated 4 months ago
- ☆13 · Apr 8, 2023 · Updated 3 years ago