zaratsian / Datasets
Interesting Public Datasets
☆12 Updated last year
Alternatives and similar repositories for Datasets:
Users who are interested in Datasets are comparing it to the repositories listed below:
- How to do data science with Optimus, Spark and Python. ☆19 Updated 5 years ago
- ☆19 Updated 4 years ago
- Public Repo of my machine learning project to predict home prices ☆12 Updated 5 years ago
- TensorFlow implementations of several deep learning models (e.g. variational autoencoder, RNN, ...) ☆37 Updated 6 years ago
- Work for Mastering Large Datasets with Python ☆19 Updated 2 years ago
- Python library for efficient multi-threaded data processing, with the support for out-of-memory datasets. ☆27 Updated 5 years ago
- Predict whether a student will correctly answer a problem based on past performance using automated feature engineering ☆32 Updated 4 years ago
- Repository for medium article ☆22 Updated last year
- Pyspark in Google Colab: A simple machine learning (Linear Regression) model ☆36 Updated 6 years ago
- Test LightGBM's Dask integration on different cluster types ☆12 Updated 3 months ago
- E-Commerce Website A/B testing: Recommend which of two landing pages to keep based on A/B testing ☆23 Updated 7 years ago
- A few end to end examples that use data-describe ☆16 Updated last year
- Examples of how Python can speed up tasks that are cumbersome in Excel ☆13 Updated 8 years ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 5 years ago
- Data analysis using numpy, pandas, matplotlib, seaborn, sqlite3, data wrangling ☆31 Updated 5 years ago
- Few tutorials on pandas, matplotlib and seaborn ☆26 Updated 8 years ago
- Materials for Machine Learning with H2O Open Platform at ODSC Masterclass Summit 2017 ☆12 Updated 8 years ago
- Work related to the Airbus Ship Detection Challenge https://www.kaggle.com/c/airbus-ship-detection ☆13 Updated 6 years ago
- The goal of this repository is to detect the outliers for a dataset & see the impact of these outliers on predictive models ☆23 Updated 6 years ago
- Experimental library for sampling and validating scikit-learn parameters ☆10 Updated 6 years ago
- ☆40 Updated 7 years ago
- Reddit Data Science Project Ideas ☆10 Updated 5 years ago
- Deployment of PyCaret pipeline and Streamlit app on GCP Kubernetes ☆15 Updated 4 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- Data Scientist code test ☆19 Updated 4 years ago
- Data Science for Good Projects ☆49 Updated 6 years ago
- Project template for highly effective data science workflows ☆29 Updated last year
- Spark NLP for Streamlit ☆15 Updated 3 years ago
- ☆13 Updated 9 months ago
- ☆26 Updated 5 years ago