shauryashaurya / learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
☆47 Updated last month
Alternatives and similar repositories for learn-data-munging
Users who are interested in learn-data-munging are comparing it to the repositories listed below.
- Code and materials for Effective Polars book ☆82 Updated last year
- Pandas Training © MetaSnake 2022, CC BY-NC ☆18 Updated 3 years ago
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy ☆42 Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- Demo on how to use Prefect with Docker ☆25 Updated 2 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 Updated last year
- ☆28 Updated 3 years ago
- Example FastAPI app deployed to AWS with CDK. ☆16 Updated 2 years ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … ☆20 Updated 3 years ago
- Cost Efficient Data Pipelines with DuckDB ☆54 Updated last month
- Example repo to kickstart integration with mlflow recipes. ☆44 Updated 4 months ago
- An example MLFlow project ☆48 Updated 5 months ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 10 months ago
- ☆26 Updated 3 years ago
- Datasets for ML, Analysis, etc ☆62 Updated 2 months ago
- ☆11 Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up ☆53 Updated 4 years ago
- The getting started notebook for the DTC Zoomcamp Q&A challenge ☆29 Updated last year
- ☆30 Updated 11 months ago
- Syllabus for Artificial Intelligence for Product Innovation Master of Engineering: https://ai.meng.duke.edu/degree ☆32 Updated 2 years ago
- A FastMCP tool to search and retrieve Polars API documentation. ☆61 Updated last month
- Essential PySpark for Scalable Data Analytics, published by Packt ☆45 Updated 2 years ago
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer. ☆130 Updated last year
- Example project with a CNN to train a Pokémon type classifier, adapted for DTC workshop ☆35 Updated last year
- A repository of runnable examples using ibis ☆44 Updated 11 months ago
- Some example projects for Data Engineers to build, end-to-end. ☆30 Updated last year
- Minimalistic text search engine that uses sklearn and pandas ☆25 Updated last week
- csv and flat-file sniffer built in Rust. ☆42 Updated last year
- Intro to Polars Tutorial ☆22 Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year