shauryashaurya / learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
☆48Updated 3 weeks ago
Alternatives and similar repositories for learn-data-munging
Users who are interested in learn-data-munging are comparing it to the libraries listed below.
- Code and materials for Effective Polars book☆82Updated last year
- Cost Efficient Data Pipelines with DuckDB☆56Updated 3 months ago
- Code for my "Efficient Data Processing in SQL" book.☆58Updated last year
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy☆42Updated 2 years ago
- Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.☆223Updated 2 years ago
- ☆28Updated 3 years ago
- Syllabus for Artificial Intelligence for Product Innovation Master of Engineering: https://ai.meng.duke.edu/degree☆32Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆113Updated last year
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.☆131Updated last year
- Demo on how to use Prefect 2 in an ML project☆41Updated 2 years ago
- A FastMCP tool to search and retrieve Polars API documentation.☆66Updated 2 months ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆83Updated last year
- Intro to Polars Tutorial☆23Updated 2 years ago
- Polars Cookbook, Published by Packt☆340Updated 2 months ago
- ☆29Updated last year
- This repo is meant to make it really easy to analyze the interplays between health and social media use.☆44Updated 3 years ago
- Some example projects for Data Engineers to build, end-to-end.☆33Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python.☆50Updated last year
- Fetch, transform and plot real-time OHLC data from Coinbase using Bytewax, Bokeh and Streamlit☆129Updated last year
- Repository for the book Simplifying Machine Learning with PyCaret.☆66Updated 2 years ago
- It's all in the name☆81Updated 2 years ago
- An example MLFlow project☆48Updated 7 months ago
- Code samples for the Effective Data Science Infrastructure book☆115Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up☆55Updated 4 years ago
- Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide☆254Updated 3 months ago
- Duke MIDS: Data Engineering and DataOps Course☆67Updated 7 months ago
- Recohut - Learn data engineering, data science☆99Updated 2 years ago
- csv and flat-file sniffer built in Rust.☆42Updated last year
- This is your friendly fitness assistant, a RAG application built as part of LLM Zoomcamp☆72Updated 11 months ago
- Possibly the fastest DataFrame-agnostic quality check library in town.☆202Updated last week