dylan-profiler / compressio
Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same data.
☆29 Updated 2 years ago
Alternatives and similar repositories for compressio
Users who are interested in compressio are comparing it to the libraries listed below.
- Pipeline components that support partial_fit. ☆46 Updated 10 months ago
- The easiest way to integrate Kedro and Great Expectations ☆52 Updated 2 years ago
- captures logs and makes cron more fun ☆76 Updated 8 months ago
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 Updated 3 years ago
- Type System for Data Analysis in Python ☆212 Updated 3 months ago
- dagster scikit-learn pipeline example. ☆43 Updated 2 years ago
- Fuzzy joins for python pandas - easily join different datasets ☆59 Updated 4 years ago
- Python package for deduplication/entity resolution using active learning ☆79 Updated 9 months ago
- Automated Jupyter notebook testing. 📙 ☆41 Updated last year
- SciKIt-learn Pipeline in PAndas ☆42 Updated last year
- An abstraction layer for parameter tuning ☆35 Updated 8 months ago
- kedro cli plugin for generating a static kedro viz site (html, css, js) that can be deployed on many serverless tools. ☆27 Updated 2 years ago
- Comparing Polars to Pandas and a small introduction ☆44 Updated 4 years ago
- A simple converter from SpaCy Entities (Spans) to Huggingface BILOU formatted data (tokens and ner_tags) ☆14 Updated 8 months ago
- Tools for making Prefect work better for typical data science workflows ☆18 Updated 3 years ago
- Primrose modeling framework for simple production models ☆32 Updated last year
- Set-oriented Operations in Pandas ☆24 Updated 5 years ago
- Build your feature store with macros right within your dbt repository ☆38 Updated 2 years ago
- Function dependencies resolution and execution ☆70 Updated 4 years ago
- Fast approximate joins on string columns for polars dataframes. ☆12 Updated 7 months ago
- Record matching and entity resolution at scale in Spark ☆34 Updated last year
- Dash component for Vega-Altair charts ☆45 Updated 9 months ago
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning ☆43 Updated this week
- A collection of python utility functions ☆11 Updated 11 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆35 Updated last month
- Kedro-Accelerator speeds up pipelines by parallelizing I/O in the background. ☆35 Updated 3 years ago
- Kedro Wings automatically creates catalog entries to simplify Kedro pipeline writing. See the video here: https://www.youtube.com/watch?v… ☆23 Updated 2 years ago
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included ☆28 Updated 3 years ago
- Bag of, not words, but tricks! ☆68 Updated last year
- ☆10 Updated 4 years ago