felipeam86 / cachesql
Fast, resilient and reproducible data analysis with cached SQL queries
☆30 · Updated last year
Related projects
Alternatives and complementary repositories for cachesql
- PdpCLI is a pandas DataFrame processing CLI tool which enables you to build a pandas pipeline from a configuration file. ☆15 · Updated last year
- ☆29 · Updated 11 months ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆26 · Updated this week
- ☆21 · Updated 3 months ago
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆73 · Updated 9 months ago
- Automated Jupyter notebook testing. ☆41 · Updated 9 months ago
- SciKIt-learn Pipeline in PAndas ☆42 · Updated last year
- Comparing Polars to Pandas and a small introduction ☆43 · Updated 3 years ago
- Decorators that log stats. ☆105 · Updated last year
- WhyProfiler is a CPU profiler for Jupyter notebook that not only identifies hotspots but can suggest faster alternatives. ☆44 · Updated 2 years ago
- Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics. ☆65 · Updated 3 years ago
- A small python library that can clump lists of data together. ☆147 · Updated 2 years ago
- simple, flexible, offline capable, cloud storage with a Python path-like interface ☆172 · Updated 5 months ago
- Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM ☆82 · Updated 10 months ago
- Feature engineering library that helps you keep track of feature dependencies, documentation and schema ☆28 · Updated 2 years ago
- Set-oriented Operations in Pandas ☆24 · Updated 4 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 · Updated 4 years ago
- Marshmallow Schema generator for Pandas DataFrames ☆24 · Updated 4 years ago
- Simple Python code metering library ☆31 · Updated 2 years ago
- Fuzzy joins for python pandas - easily join different datasets ☆59 · Updated 4 years ago
- ForML - A development framework and MLOps platform for the lifecycle management of data science projects ☆104 · Updated last year
- An abstraction layer for parameter tuning ☆36 · Updated 2 months ago
- Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow. ☆45 · Updated 2 months ago
- File processing pipelines ☆86 · Updated 2 years ago
- Deployment tool for online machine learning models ☆97 · Updated 2 years ago
- FuturePool is a package that introduces the known concept of a multiprocessing Pool to the async/await world. It allows for easy translation fro… ☆12 · Updated last week
- Declarative layer for your database. ☆38 · Updated last year
- captures logs and makes cron more fun ☆71 · Updated 2 months ago
- Pandas helper functions ☆29 · Updated last year
- A collection of python utility functions ☆12 · Updated 4 months ago