RealImpactAnalytics / trumania
Trumania is a scenario-based random dataset generator library for Python 3.
☆112 Updated 3 years ago
Alternatives and similar repositories for trumania:
Users who are interested in trumania are comparing it to the libraries listed below:
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights into your Spark DataFrames. ☆103 Updated 5 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 Updated 3 years ago
- Utilities for creating ETL pipelines with mara ☆37 Updated 2 years ago
- Dockerfiles for images used as part of the Orbyter toolset ☆44 Updated 10 months ago
- A series of workshop modules introducing Feast feature store. ☆19 Updated 2 years ago
- A hands-on tutorial showing how to use Python to do anonymisation with synthetic data ☆78 Updated 2 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 Updated 5 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- Spark implementation of computing Shapley values using Monte Carlo approximation ☆74 Updated 2 years ago
- The easiest way to integrate Kedro and Great Expectations ☆53 Updated 2 years ago
- Primrose modeling framework for simple production models ☆33 Updated last year
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 Updated 3 years ago
- PySpark phonetic and string matching algorithms ☆39 Updated last year
- Read Delta tables without any Spark ☆47 Updated last year
- Python library for automated dataset normalization ☆113 Updated last year
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆123 Updated 10 months ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 Updated 4 years ago
- Predict whether or not a patient will show up to their next appointment using automated feature engineering ☆29 Updated 4 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 Updated 3 years ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses and visualizing results in a web browser ☆33 Updated last year
- ☆106 Updated 2 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 5 years ago
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- Machine Flow enables visual execution and tracking of machine learning workflows. Users dynamically create dependency graphs, with each n… ☆62 Updated 6 years ago
- Reference package for unit tests ☆49 Updated 6 years ago
- MLOps simplified. One platform, all the functionality you need. Swiss made ☆98 Updated this week
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆107 Updated last week
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆164 Updated last month
- A web frontend for scheduling Jupyter notebook reports ☆252 Updated 3 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated 11 months ago