xdssio / big_data_benchmarks
Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning.
☆65 · Updated 5 years ago
Alternatives and similar repositories for big_data_benchmarks
Users who are interested in big_data_benchmarks are comparing it to the libraries listed below.
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆86 · Updated 2 years ago
- Automated Data Science and Machine Learning library to optimize workflow. ☆105 · Updated 2 years ago
- General Interpretability Package ☆58 · Updated 3 years ago
- Talks about vaex ☆36 · Updated 3 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆52 · Updated 4 years ago
- Hypergol is a Data Science/Machine Learning productivity toolkit to accelerate any projects into production with autogenerated code, stan… ☆53 · Updated 2 years ago
- Interactive visualization of machine learning model evaluation metrics ☆66 · Updated 6 years ago
- A Pythonic introduction to methods for scaling your data science and machine learning work to larger datasets and larger models, using th… ☆120 · Updated 3 years ago
- Python library for automated dataset normalization ☆117 · Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆36 · Updated 5 years ago
- Data Analysis Baseline Library ☆133 · Updated last year
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆107 · Updated 2 years ago
- A machine learning testing framework for sklearn and pandas. The goal is to help folks assess whether things have changed over time. ☆104 · Updated 2 weeks ago
- A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn fr… ☆58 · Updated 4 years ago
- JupyterLab extension to create GitHub commits & pull requests ☆119 · Updated last year
- Tutorial for a new versioning Machine Learning pipeline ☆80 · Updated 4 years ago
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 · Updated 3 years ago
- Pre-modelling analysis of the data through various exploratory data analyses and statistical tests. ☆51 · Updated 2 years ago
- ☆78 · Updated 4 years ago
- A bit of extra usability for SQLAlchemy v2. ☆78 · Updated last year
- vtreat is a data frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. Distr… ☆120 · Updated last year
- Makes interactive chart widgets, cleans raw data, runs baseline models, and supports interactive hyperparameter tuning & tracking ☆55 · Updated 4 years ago
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆106 · Updated last week
- Using Kafka-Python to illustrate a ML production pipeline ☆112 · Updated 3 years ago
- Deploy Dask on YARN clusters ☆69 · Updated last year
- First class datasets in JupyterLab ☆177 · Updated 2 years ago
- Anovos - An open source library for scalable feature engineering using Apache Spark ☆74 · Updated 2 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆88 · Updated 3 years ago
- A web frontend for scheduling Jupyter notebook reports ☆254 · Updated last year
- A scikit-learn compatible estimator based on business rules, with an interactive dashboard included ☆28 · Updated 4 years ago