xdssio / big_data_benchmarks
Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning.
☆65 Updated 4 years ago
Alternatives and similar repositories for big_data_benchmarks:
Users who are interested in big_data_benchmarks are comparing it to the libraries listed below.
- Automated Data Science and Machine Learning library to optimize workflow. ☆104 Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 Updated 4 years ago
- Python library for automated dataset normalization ☆114 Updated last year
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆84 Updated last year
- Hypergol is a Data Science/Machine Learning productivity toolkit to accelerate any projects into production with autogenerated code, stan… ☆53 Updated 2 years ago
- Deploy dask on YARN clusters ☆69 Updated 8 months ago
- General Interpretability Package ☆58 Updated 2 years ago
- Makes Interactive Chart Widget, Cleans raw data, Runs baseline models, Interactive hyperparameter tuning & tracking ☆55 Updated 3 years ago
- Talks about vaex ☆36 Updated 2 years ago
- A bit of extra usability for sqlalchemy v2. ☆77 Updated 10 months ago
- Interactive visualization of machine learning model evaluation metrics ☆63 Updated 5 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 Updated 5 years ago
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- NitroML is a modular, portable, and scalable model-quality benchmarking framework for Machine Learning and Automated Machine Learning (Au… ☆43 Updated 4 years ago
- Building an API with the FastAPI framework to serve a scikit-learn model. ☆18 Updated 6 years ago
- 🎛 Distributed machine learning made simple. ☆49 Updated 2 years ago
- A machine learning testing framework for sklearn and pandas. The goal is to help folks assess whether things have changed over time. ☆102 Updated 3 years ago
- asv benchmarks for dask projects ☆18 Updated 2 years ago
- Public repository made for Automated Feature Engineering workshop (Summer Data Conf, Odessa, 2018-07-21) ☆19 Updated 6 years ago
- Repository for the research and implementation of categorical encoding into a Featuretools-compatible Python library ☆51 Updated 2 years ago
- The fast.ai data ethics course ☆15 Updated 2 years ago
- A frictionless integrated platform for notebook ☆85 Updated 2 years ago
- ☆29 Updated last year
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- Data Analysis Baseline Library ☆131 Updated 6 months ago
- ☆15 Updated 2 years ago
- An abstraction layer for parameter tuning ☆35 Updated 7 months ago
- Pandas helper functions ☆30 Updated 2 years ago
- Predict the poverty of households in Costa Rica using automated feature engineering. ☆23 Updated 4 years ago