shauryashaurya / learn-data-munging
Notes on Data Engineering with Pandas, PySpark, Dask, Ray, Arrow DataFusion, Polars etc.
☆50Updated 3 months ago
Alternatives and similar repositories for learn-data-munging
Users who are interested in learn-data-munging are comparing it to the libraries listed below
- Code and materials for Effective Polars book☆83Updated last year
- Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.☆133Updated last year
- Cost Efficient Data Pipelines with DuckDB☆58Updated 5 months ago
- Example repo to kickstart integration with mlflow recipes.☆45Updated 3 months ago
- Public notebooks and datasets to accompany the Data Analysis with Polars course on Udemy☆45Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated last week
- ☆28Updated 3 years ago
- A FastMCP tool to search and retrieve Polars API documentation.☆71Updated 5 months ago
- Duke MIDS: Data Engineering and DataOps Course☆67Updated 10 months ago
- Code samples for the Effective Data Science Infrastructure book☆115Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book.☆60Updated last year
- Intro to Polars Tutorial☆22Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up☆57Updated 4 years ago
- Syllabus for Artificial Intelligence for Product Innovation Master of Engineering: https://ai.meng.duke.edu/degree☆32Updated 2 years ago
- Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide☆274Updated 2 months ago
- ☆29Updated last year
- An example MLFlow project☆49Updated 10 months ago
- Construct a modern data stack and orchestrate the workflows to create high quality data for analytics and ML applications.☆228Updated 3 years ago
- Reference code base for ML Engineering, Manning Publications☆131Updated 4 years ago
- Tutorials on creating a reproducible and maintainable data science project☆149Updated 3 years ago
- Slides for "Feature engineering for time series forecasting" talk☆62Updated 3 years ago
- Deploy A/B testing infrastructure in a containerized microservice architecture for Machine Learning applications.☆40Updated 10 months ago
- This is your friendly fitness assistant, a RAG application built as part of LLM Zoomcamp☆77Updated last year
- Demo for CI/CD in a machine learning project☆114Updated 2 years ago
- Notes from our NLP reading club!☆17Updated 4 years ago
- Polars Cookbook, Published by Packt☆353Updated 2 months ago
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo…☆86Updated 2 years ago
- MLOps maturity assessment☆61Updated 2 years ago
- Official code repo for the O'Reilly Book - Machine Learning for High-Risk Applications☆103Updated 2 years ago
- Repository for the book Simplifying Machine Learning with PyCaret.☆67Updated 2 years ago