RealImpactAnalytics / trumania
Trumania is a scenario-based random dataset generator library in Python 3
★112 · Updated 3 years ago
Alternatives and similar repositories for trumania
Users who are interested in trumania are comparing it to the libraries listed below.
- Dockerfiles for images used as part of the Orbyter toolset ★44 · Updated last year
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ★141 · Updated last year
- Build your feature store with macros right within your dbt repository ★38 · Updated 2 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ★31 · Updated 3 years ago
- Common data science and data engineering utilities to help us perform analytics. Our toolbox for data scientists, licensed under Apache-2… ★30 · Updated 6 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ★109 · Updated last week
- Utilities for creating ETL pipelines with mara ★36 · Updated 2 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ★103 · Updated 5 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ★21 · Updated 4 years ago
- Predict whether or not a patient will show up to their next appointment using automated feature engineering ★29 · Updated 4 years ago
- A Scalable Data Cleaning Library for PySpark. ★27 · Updated 6 years ago
- ★110 · Updated 4 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ★125 · Updated 3 years ago
- Automated Data Science and Machine Learning library to optimize workflow. ★104 · Updated 2 years ago
- The easiest way to integrate Kedro and Great Expectations ★53 · Updated 2 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ★33 · Updated 8 years ago
- Record matching and entity resolution at scale in Spark ★34 · Updated last year
- An Ubuntu Vagrant Virtual Machine (VM) with Airflow, a data workflow management system from Airbnb ★8 · Updated 4 years ago
- A Python package to build predictive linear and logistic regression models focused on performance and interpretation ★30 · Updated last year
- A series of workshop modules introducing Feast feature store. ★19 · Updated 2 years ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypothesis and visualizing results in a web browser ★33 · Updated 2 years ago
- PySpark phonetic and string matching algorithms ★39 · Updated last year
- Interactive notebooks containing demonstration code of the splink library ★38 · Updated last year
- Automated Exploratory Data Analysis. Simplifying Data Exploration ★35 · Updated 4 years ago
- Capturing model drift and handling its response - Example webinar ★108 · Updated 5 years ago
- Type System for Data Analysis in Python ★212 · Updated 3 months ago
- Primrose modeling framework for simple production models ★32 · Updated last year
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ★45 · Updated 3 years ago
- Create HTML profiling reports from Apache Spark DataFrames ★196 · Updated 5 years ago
- Sample configuration to deploy a modern data platform. ★88 · Updated 3 years ago