danielbeach / PolarsVsPySpark
Can Polars crunch 27 GB of data faster than PySpark?
☆13 Updated 2 years ago
Alternatives and similar repositories for PolarsVsPySpark
Users who are interested in PolarsVsPySpark are comparing it to the repositories listed below.
- Code and materials for Effective Polars book ☆83 Updated last year
- This repository contains coding interviews that I have encountered in company interviews ☆12 Updated 5 years ago
- Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide ☆294 Updated last month
- A repository of runnable examples using ibis ☆46 Updated last year
- Intro to Polars Tutorial ☆22 Updated 2 years ago
- A FastMCP tool to search and retrieve Polars API documentation. ☆71 Updated 7 months ago
- csv and flat-file sniffer built in Rust. ☆44 Updated last year
- Book documentation of the Polars DataFrame library ☆189 Updated 2 years ago
- Data Analysis with Polars, Published by Packt ☆32 Updated last year
- Polars Time Series Extension ☆32 Updated 2 months ago
- Polars Cookbook, Published by Packt ☆356 Updated 3 weeks ago
- ☆23 Updated last year
- Code and data for the Modern Polars book ☆230 Updated last year
- Sentiment and language detection for text analytics. ☆17 Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 Updated 2 years ago
- Fake Pandas / PySpark DataFrame creator ☆48 Updated last year
- end-to-end data engineering project to get insights from PyPi using python, duckdb, MotherDuck & Evidence ☆230 Updated last month
- Fast and easy echarts with polars backend for wrangling and a simple API ☆31 Updated 3 weeks ago
- Code for my "Efficient Data Processing in SQL" book. ☆60 Updated last year
- ☆30 Updated last year
- Polars Tutorial ☆53 Updated 2 years ago
- Cost Efficient Data Pipelines with DuckDB ☆60 Updated 7 months ago
- DuckDB with Dashboarding tools demo evidence, streamlit and rill ☆21 Updated 2 years ago
- The official repository of the book Data Storytelling with Python Altair and Generative AI ☆29 Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 Updated 2 months ago
- A portable Datamart and Business Intelligence suite built with Docker, sqlmesh + dbtcore, DuckDB and Superset ☆55 Updated 2 months ago
- Blog post on ETL pipelines with Airflow ☆24 Updated 4 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆119 Updated 5 months ago
- This is a code repository for the course Data Engineering with Data Build Tool (DBT). ☆70 Updated last year
- Sample projects using Ploomber. ☆86 Updated last year