gigisr / data_etl
☆10 Updated 4 years ago
Alternatives and similar repositories for data_etl:
Users who are interested in data_etl are comparing it to the libraries listed below
- Set-oriented Operations in Pandas ☆24 Updated 4 years ago
- Search 'from' and 'to' strings to learn a text cleaning mapping ☆17 Updated 9 years ago
- This repo contains the draft, images, and code for the Medium blog post on Altair themes. ☆12 Updated 6 years ago
- Today I Learned Some Computer Stuff ☆39 Updated 6 years ago
- ☆16 Updated 8 months ago
- A browser user interface for manual labeling of record pairs. ☆47 Updated last year
- A maximum-strength name parser for record linkage. ☆37 Updated last week
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- ☆29 Updated last year
- ☆15 Updated 6 years ago
- Dask tutorial for PyData DC 2016 ☆11 Updated 8 years ago
- Comparing Polars to Pandas and a small introduction ☆43 Updated 3 years ago
- An analysis of all 1.3 million public Jupyter Notebooks on GitHub in July 2017 ☆73 Updated 7 years ago
- Public repository for versioning machine learning data ☆42 Updated 3 years ago
- Compilation of Vega-Lite & Altair Tutorials ☆23 Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆35 Updated 4 years ago
- All kinds of survival analysis distributions and methods to optimize how long to wait for them. ☆39 Updated 4 years ago
- View a list of JSON-serializable dictionaries or a 2-D array, in HandsOnTable, in Jupyter Notebook. ☆13 Updated 6 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated 2 years ago
- Multidimensional data explorer and visualization tool. ☆56 Updated 7 years ago
- Simple validator for submissions to DrivenData competitions ☆19 Updated 5 years ago
- Pipeline Explorer - Explore and analyze millions of pipelines learned using MLBlocks and MLPrimitives. ☆17 Updated last year
- Utility to help search within a set of Jupyter notebooks ☆16 Updated 5 years ago
- A pedagogical implementation of panel apps served up on a remote machine. ☆14 Updated 3 years ago
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 Updated 3 years ago
- Primrose modeling framework for simple production models ☆32 Updated last year
- A utility for labeling clusters of text data. ☆28 Updated 3 years ago
- 🧬 A VS Code extension for annotating data with Prodigy ☆30 Updated 3 years ago
- Scalable String Similarity Joins in Python ☆39 Updated 9 months ago
- A simple tool to help build stacking models. ☆18 Updated last week