drivendataorg / cookiecutter-data-science
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
★ 8,539, updated 3 weeks ago
Alternatives and similar repositories for cookiecutter-data-science:
Users who are interested in cookiecutter-data-science compare it to the libraries listed below.
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. (★ 12,687, updated this week)
- Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts (★ 6,728, updated last month)
- Parameterize, execute, and analyze notebooks (★ 6,070, updated last month)
- Modin: Scale your Pandas workflows by changing a single line of code (★ 10,005, updated this week)
- A library of extension and helper modules for Python's data analysis and machine learning libraries. (★ 4,960, updated last week)
- Data science interview questions and answers (★ 9,157, updated 5 months ago)
- An open-source, low-code machine learning library in Python (★ 9,105, updated this week)
- Voilà turns Jupyter notebooks into standalone web applications (★ 5,555, updated this week)
- Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce,… (★ 27,819, updated 10 months ago)
- Statsmodels: statistical modeling and econometrics in Python (★ 10,413, updated last week)
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… (★ 10,143, updated this week)
- Create delightful software with Jupyter Notebooks (★ 5,008, updated last week)
- Missing data visualization module for Python. (★ 4,019, updated 8 months ago)
- Declarative visualization library for Python (★ 9,555, updated this week)
- Visual analysis and diagnostic tools to facilitate machine learning model selection. (★ 4,313, updated 4 months ago)
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s… (★ 8,333, updated 3 months ago)
- A collection of various notebook extensions for Jupyter (★ 5,256, updated 7 months ago)
- Automatically visualize your pandas dataframe via a single print! (★ 5,244, updated 10 months ago)
- A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. (★ 9,832, updated 6 months ago)
- A helpful 5-page machine learning cheatsheet to assist with exam reviews, interview prep, and anything in-between. (★ 5,067, updated last year)
- Feature engineering package with sklearn-like functionality (★ 1,982, updated this week)
- An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani, 2013): Python code (★ 4,278, updated 2 years ago)
- A light-weight, flexible, and expressive statistical data testing library (★ 3,598, updated 2 weeks ago)
- An open source python library for automated feature engineering (★ 7,354, updated this week)
- Parallel computing with task scheduling (★ 12,906, updated this week)
- Source code for my collection of articles on using pandas. (★ 1,544, updated 2 years ago)
- A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai) (★ 7,293, updated 4 months ago)
- STUMPY is a powerful and scalable Python library for modern time series analysis (★ 3,772, updated this week)
- Automatic extraction of relevant features from time series: