leeper / data-versioning
Collecting thoughts about data versioning
☆108Updated 5 years ago
Related projects
Alternatives and complementary repositories for data-versioning
- A Python library for working with Data Packages.☆191Updated 8 months ago
- Google Container Engine, JupyterHub, and Jupyter for classroom scenarios☆59Updated 7 years ago
- A simple command line interface to the datamade/dedupe library.☆42Updated last year
- Repeatable analysis plugin for Jupyter notebook☆260Updated 2 years ago
- Multidimensional data explorer and visualization tool.☆52Updated 7 years ago
- A Jupyter Lab extension for rendering tabular data☆35Updated 6 years ago
- A kernel to support Python dataflows in the Jupyter Notebook environment☆119Updated last week
- Tools, wrappers, etc... for data science with a concentration on text processing☆206Updated 2 years ago
- Quick informal survey at the Los Angeles Machine learning meetup about tools used for machine learning.☆51Updated 9 years ago
- functional data manipulation for pandas☆200Updated 9 years ago
- A Binder-compatible repo with a requirements.txt file☆26Updated 7 years ago
- E. Tufte slope graph implementation in Python☆139Updated 9 years ago
- ☆85Updated 6 years ago
- SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Py…☆151Updated 2 years ago
- Open source Flotilla☆192Updated last week
- Data Science box: Spark, Jupyter, R+RStudio, Zeppelin, Python 2 & 3, Java, Scala.☆39Updated 6 years ago
- A framework (command line tool + libraries) for creating flexible compute pipelines☆56Updated 3 years ago
- Material for some talks I have given☆62Updated 2 months ago
- Framework for processing data packages in pipelines of modular components.☆119Updated last year
- Sample repo for luigi tasks & config☆36Updated 8 years ago
- Git Wrapper for Dataset Management☆15Updated last year
- beer recommendation engine project for Metis☆18Updated 2 years ago
- Tools for massively parallel and multi-variate data exploration☆39Updated 6 months ago
- DIT4C is a platform for hosting data analysis tools "in the cloud" using containers.☆40Updated 7 years ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.☆62Updated 7 years ago
- ggplot2 syntax in python. Actually a wrapper around Wickham's ggplot2 in R☆73Updated 3 years ago
- An analysis of all 1.3 million public Jupyter Notebooks on Github in July 2017☆72Updated 6 years ago
- T4 is now in production as Quilt 3☆64Updated 5 years ago
- Python solver for mixed-effects models☆98Updated 6 years ago
- NOTE: a magic we developed at The Data Incubator from this basis: https://github.com/thedataincubator/ihtml☆88Updated 8 years ago