shauryashaurya / learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
☆47Updated last week
Alternatives and similar repositories for learn-data-munging
Users that are interested in learn-data-munging are comparing it to the libraries listed below
- Code and materials for Effective Polars book☆81Updated last year
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy☆42Updated last year
- Cost Efficient Data Pipelines with DuckDB☆53Updated 3 weeks ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11Updated last year
- Maternal Health Risk prediction MLOps pipeline☆43Updated 2 years ago
- Pandas Training © MetaSnake 2022, CC BY-NC☆18Updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book.☆56Updated 10 months ago
- ☆27Updated 3 years ago
- Intro to Polars Tutorial☆22Updated 2 years ago
- A FastMCP tool to search and retrieve Polars API documentation.☆60Updated last week
- A Series of Notebooks on how to start with Kafka and Python☆154Updated 3 months ago
- A collection of my favorite tech-related blog posts.☆10Updated last week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆113Updated last year
- Deploy A/B testing infrastructure in a containerized microservice architecture for Machine Learning applications.☆40Updated 4 months ago
- ☆30Updated 11 months ago
- Demo of Streamlit application with Databricks SQL Endpoint☆35Updated 2 years ago
- Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟☆53Updated 3 years ago
- Repository for the book Simplifying Machine Learning with PyCaret.☆66Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up☆53Updated 4 years ago
- Example repo to kickstart integration with mlflow recipes.☆44Updated 3 months ago
- An end-to-end project on customer segmentation☆81Updated 2 years ago
- This is your friendly fitness assistant, a RAG application built as part of LLM Zoomcamp☆65Updated 8 months ago
- A simple and easy to use Data Quality (DQ) tool built with Python.☆50Updated last year
- Scaling Machine Learning in Three Week course, in collaboration with O'Reilly, following the guidance of Adi Polak's book - Scaling Machi…☆23Updated 2 years ago
- ☆22Updated 2 years ago
- Demo on how to use Prefect with Docker☆25Updated 2 years ago
- ☆46Updated this week
- ☆34Updated 4 months ago
- The project completed for MLops Engineering Lab #1 by Team #1. See our wiki for more info☆16Updated 4 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to set up a single-node Standalone Spark Cluster along with material in t…☆30Updated last year