cal-data-eng / sp21
Data Engineering Course Website
☆14, updated last year
Alternatives and similar repositories for sp21
Users who are interested in sp21 are comparing it to the repositories listed below.
- Convert monolithic Jupyter notebooks into maintainable Ploomber pipelines. (☆79, updated last year)
- Materials for my 2021 NYU class on NLP and ML Systems (Master of Engineering). (☆96, updated 2 years ago)
- Smart Arguments Suite (smart-arg) is a slim and handy python lib that helps one work safely and conveniently with command line arguments. (☆23, updated 3 years ago)
- Lazy Profiler is a simple utility to collect CPU, GPU, RAM and GPU Memory stats while the program is running. (☆35, updated 4 years ago)
- Generate beautiful, testable documentation with Jupyter Notebooks (☆21, updated 3 years ago)
- Data science and ML with Dask (☆14, updated 4 years ago)
- Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics. (☆65, updated 4 years ago)
- Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow. (☆45, updated 6 months ago)
- Data pipelines from re-usable components (☆107, updated 2 years ago)
- Comparing Polars to Pandas and a small introduction (☆44, updated 4 years ago)
- Public repository for the Search Fundamentals course taught by Daniel Tunkelang and Grant Ingersoll. Available at https://corise.com/cour… (☆46, updated last year)
- A data wrangling and modeling tool. (☆63, updated 2 years ago)
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … (☆20, updated 3 years ago)
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. (☆42, updated 2 years ago)
- NitroML is a modular, portable, and scalable model-quality benchmarking framework for Machine Learning and Automated Machine Learning (Au… (☆43, updated 4 years ago)
- Magniv Core - A Python-decorator based job orchestration platform. Avoid responsibility handoffs by abstracting infra and DevOps. (☆80, updated last year)
- Automated Jupyter notebook testing. (☆41, updated last year)
- (☆31, updated last year)
- Hypergol is a Data Science/Machine Learning productivity toolkit to accelerate any projects into production with autogenerated code, stan… (☆53, updated 2 years ago)
- An argument that Jupyter Notebooks are flawed and the world needs a successor. (☆81, updated 2 years ago)
- Data and tooling to compare the API surfaces of various array libraries. (☆56, updated last month)
- (☆17, updated last year)
- The stupidest database of all time. (☆56, updated last month)
- Dynamic Adversarial Benchmarking platform (☆26, updated 3 years ago)
- Supporting content (slides and exercises) for the Pearson video series covering best practices for developing scalable applications with … (☆52, updated 8 months ago)
- Flenser is a simple, minimal, automated exploratory data analysis tool. (☆78, updated 4 months ago)
- A minimal Python kernel so you can run Python in your Python (☆39, updated 3 years ago)
- real-time data + ML pipeline (☆54, updated last week)
- A utility for labeling clusters of text data. (☆28, updated 4 years ago)
- A Vectorized Python Dict/Set (☆118, updated 2 years ago)