Example repo for full end-to-end PySpark testing via docker-compose
☆31 · Feb 6, 2023 · Updated 3 years ago
Alternatives and similar repositories for pyspark-testing-env
Users who are interested in pyspark-testing-env are comparing it to the libraries listed below.
- DLL and SO from Bruker. ☆12 · Feb 25, 2026 · Updated 2 months ago
- Implementation of Boundary Attributions for Normal (Vector) Explanations ☆11 · Aug 13, 2021 · Updated 4 years ago
- Data Engineer Roadmaps as Projects Funnel ☆12 · Aug 10, 2022 · Updated 3 years ago
- Turn browser clicks into reproducible scraping code. ☆11 · Oct 27, 2024 · Updated last year
- Data pipeline project using Data Factory, Databricks and Cosmosdb Graph, deployed using Azure DevOps, secured using firewalls and Azure A… ☆11 · Dec 14, 2022 · Updated 3 years ago
- Deploy a scikit model using heroku and Flask ☆15 · May 1, 2023 · Updated 2 years ago
- A dumb utility to help you mirror your GitLab and GitHub contributions. ☆14 · Apr 3, 2025 · Updated last year
- NaRnEA (Nonparametric analytical Rank-based Enrichment Analysis) ☆11 · Mar 2, 2023 · Updated 3 years ago
- Samples for fabric user data functions ☆27 · Mar 16, 2026 · Updated last month
- Match your fig size and font to conference formats. ☆11 · Aug 16, 2021 · Updated 4 years ago
- Example project for building scalable data pipelines with Kedro and Ibis. ☆14 · Dec 10, 2025 · Updated 4 months ago
- A cloud data platform product to accelerate time to insights. Our open-source framework is designed for the real world. Stripping away th… ☆24 · Apr 21, 2026 · Updated last week
- Feedzai's theme for Altair charts. ☆15 · Feb 2, 2026 · Updated 2 months ago
- A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation an… ☆23 · Nov 21, 2023 · Updated 2 years ago
- A simple script designed to run and use i2p and i2pd on tails os along with the tor network! ☆21 · May 19, 2025 · Updated 11 months ago
- quadipy is a python package to help transform structured data into RDF graph format ☆19 · Apr 14, 2023 · Updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆62 · Aug 6, 2024 · Updated last year
- A PoC script for adding dummy GitHub contributions to past dates ☆11 · Nov 27, 2024 · Updated last year
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆29 · Nov 24, 2022 · Updated 3 years ago
- A fast data generator that produces CSV files from generated relational data ☆43 · Aug 15, 2025 · Updated 8 months ago
- Your AI-powered terminal sidekick. Delegate commands, dodge syntax Googling, and let your terminal intern handle the grunt work. Smart, s… ☆12 · Mar 10, 2025 · Updated last year
- A Terraform module to create and manage Identity and Access Management (IAM) Users on Amazon Web Services (AWS). https://aws.amazon.com/i… ☆20 · Apr 6, 2022 · Updated 4 years ago
- ☆17 · May 22, 2024 · Updated last year
- Nomad launcher/executor for Dagster ☆22 · Oct 2, 2025 · Updated 6 months ago
- BPNet manuscript code. ☆12 · Dec 1, 2020 · Updated 5 years ago
- functional genomic data integration ☆10 · Sep 22, 2019 · Updated 6 years ago
- The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for sever… ☆288 · Mar 4, 2026 · Updated last month
- Streamlit template for building SMART on FHIR apps in the Cerner ecosystem. ☆11 · Sep 22, 2023 · Updated 2 years ago
- Rust + WebAssembly port of SymbolicRegression.jl ☆36 · Mar 2, 2026 · Updated last month
- Utility functions to support analytics over FHIR in BigQuery or Apache Spark ☆15 · Jan 8, 2024 · Updated 2 years ago
- For Udemy students: the official repository of Rock the JVM's Spark Streaming course ☆26 · Jan 5, 2023 · Updated 3 years ago
- Demo converting streamlit uber nyc rides to use duckdb ☆30 · Apr 9, 2023 · Updated 3 years ago
- Data Engineering pipeline hosted entirely in the AWS ecosystem utilizing DocumentDB as the database ☆14 · Oct 26, 2021 · Updated 4 years ago
- ☆13 · Apr 8, 2023 · Updated 3 years ago
- Chapter 8 of the AWS Cookbook ☆12 · Apr 20, 2023 · Updated 3 years ago
- ☆12 · Feb 23, 2024 · Updated 2 years ago
- Sample repository to demonstrate Terraform module versioning using semantic-release. ☆18 · Jul 5, 2021 · Updated 4 years ago
- Homeworks repository for the Big Data Analysis with Scala and Spark Coursera course ☆15 · Jun 30, 2024 · Updated last year
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · May 12, 2023 · Updated 2 years ago