RealImpactAnalytics / trumania
Trumania is a scenario-based random dataset generator library in Python 3
☆111 Updated 2 years ago
Alternatives and similar repositories for trumania:
Users who are interested in trumania are comparing it to the libraries listed below:
- Dockerfiles for images used as part of the Orbyter toolset ☆44 Updated 8 months ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆103 Updated 5 years ago
- Automated Data Science and Machine Learning library to optimize workflow. ☆104 Updated last year
- Predict whether or not a patient will show up to their next appointment using automated feature engineering ☆29 Updated 4 years ago
- A Scalable Data Cleaning Library for PySpark. ☆26 Updated 5 years ago
- Python library for automated dataset normalization ☆113 Updated last year
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 Updated 3 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 Updated 4 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 Updated 4 years ago
- A hands-on tutorial showing how to use Python to do anonymisation with synthetic data ☆78 Updated 2 years ago
- Machine Flow enables visual execution and tracking of machine learning workflows. Users dynamically create dependency graphs, with each n… ☆62 Updated 6 years ago
- Making Machine Learning Simple and Scalable with Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka and KSQL ☆94 Updated 6 years ago
- Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbation… ☆165 Updated 5 months ago
- Primrose modeling framework for simple production models ☆33 Updated 10 months ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 5 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- Predict the poverty of households in Costa Rica using automated feature engineering. ☆23 Updated 4 years ago
- Scaffold of Apache Airflow executing Docker containers ☆85 Updated 2 years ago
- Build your feature store with macros right within your dbt repository ☆38 Updated 2 years ago
- Public repository made for Automated Feature Engineering workshop (Summer Data Conf, Odessa, 2018-07-21) ☆19 Updated 6 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 Updated 3 years ago
- Python for people data ☆67 Updated 8 months ago
- A simple example of a Python API for real-time machine learning, using scikit-learn, Flask and Docker ☆133 Updated last year
- Asynchronous actions for PySpark ☆47 Updated 3 years ago
- Tutorial code and data for the entity resolution workshops. ☆44 Updated 9 years ago
- OptimalFlow is an omni-ensemble and scalable automated machine learning Python toolkit, which uses Pipeline Cluster Traversal Experiments… ☆27 Updated last year
- Record matching and entity resolution at scale in Spark ☆32 Updated last year
- Accelerate data science ☆117 Updated 3 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossmann Sales data from Kaggle ☆33 Updated 8 years ago
- Model drift detection ☆11 Updated last year