drivendataorg / cookiecutter-data-science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
☆8,186 Updated last month
Related projects:
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. ☆12,385 Updated last week
- Missing data visualization module for Python. ☆3,896 Updated 4 months ago
- Visual analysis and diagnostic tools to facilitate machine learning model selection. ☆4,264 Updated last year
- Modin: Scale your Pandas workflows by changing a single line of code ☆9,747 Updated this week
- Declarative statistical visualization library for Python ☆9,227 Updated this week
- 📚 Parameterize, execute, and analyze notebooks ☆5,789 Updated 3 weeks ago
- A game theoretic approach to explain the output of any machine learning model. ☆22,506 Updated this week
- Create delightful software with Jupyter Notebooks ☆4,883 Updated this week
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ☆9,837 Updated this week
- Statsmodels: statistical modeling and econometrics in Python ☆9,978 Updated this week
- Build and manage real-life ML, AI, and data science projects with ease! ☆8,046 Updated this week
- A library of extension and helper modules for Python's data analysis and machine learning libraries. ☆4,862 Updated 2 months ago
- An open source python library for automated feature engineering ☆7,196 Updated this week
- Voilà turns Jupyter notebooks into standalone web applications ☆5,394 Updated 2 weeks ago
- 🦉 ML Experiments and Data Management with Git ☆13,608 Updated this week
- A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. ☆9,655 Updated last month
- A curated list of references for MLOps ☆12,465 Updated 3 months ago
- Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts ☆6,588 Updated 2 weeks ago
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s… ☆8,257 Updated last week
- Bayesian Modeling and Probabilistic Programming in Python ☆8,633 Updated this week
- Statistical data visualization in Python ☆12,401 Updated last month
- Parallel computing with task scheduling ☆12,405 Updated this week
- An open-source, low-code machine learning library in Python ☆8,818 Updated 2 weeks ago
- A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai) ☆7,214 Updated 3 months ago
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning ☆17,319 Updated this week
- A simple and efficient tool to parallelize Pandas operations on all available CPUs ☆3,634 Updated 2 months ago
- Automatic extraction of relevant features from time series ☆8,361 Updated last month
- aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-firs… ☆26,674 Updated 2 months ago
- A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning ☆6,805 Updated 3 months ago
- Lime: Explaining the predictions of any machine learning classifier ☆11,516 Updated last month