xdssio / big_data_benchmarks
Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning.
☆65 Updated 4 years ago
Alternatives and similar repositories for big_data_benchmarks
Users who are interested in big_data_benchmarks are comparing it to the libraries listed below.
- Automated Data Science and Machine Learning library to optimize workflow. ☆104 Updated 2 years ago
- General Interpretability Package ☆58 Updated 2 years ago
- Talks about vaex ☆36 Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆35 Updated 4 years ago
- python library for automated dataset normalization ☆115 Updated last year
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking ☆55 Updated 3 years ago
- A package for data science practitioners. This library implements a number of helpful, common data transformations with a scikit-learn fr… ☆57 Updated 3 years ago
- Deploy dask on YARN clusters ☆69 Updated 9 months ago
- Hypergol is a Data Science/Machine Learning productivity toolkit to accelerate any projects into production with autogenerated code, stan… ☆53 Updated 2 years ago
- Interactive visualization of machine learning model evaluation metrics ☆63 Updated 5 years ago
- A web frontend for scheduling Jupyter notebook reports ☆252 Updated 5 months ago
- NitroML is a modular, portable, and scalable model-quality benchmarking framework for Machine Learning and Automated Machine Learning (Au… ☆43 Updated 4 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 Updated 5 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- A bit of extra usability for sqlalchemy v2. ☆77 Updated 11 months ago
- An abstraction layer for parameter tuning ☆35 Updated 8 months ago
- A series of workshop modules introducing Feast feature store. ☆19 Updated 2 years ago
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 Updated 2 years ago
- This project is created to promote and advocate the use of FOSS machine learning. ☆45 Updated last week
- Comparing Polars to Pandas and a small introduction ☆43 Updated 3 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆106 Updated last year
- ☆30 Updated 3 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟 ☆53 Updated 3 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆86 Updated 2 years ago
- JupyterLab extension to create GitHub commits & pull requests ☆119 Updated 10 months ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 Updated 3 years ago
- ☕⛵ WIP PySpark dependency management ☆22 Updated 6 years ago
- Repo for PyData 2019 Tutorial - New Trends in Estimation and Inference ☆25 Updated 5 years ago
- Pandas helper functions ☆30 Updated 2 years ago