spotify / luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
☆18,528 · Updated 5 months ago
Alternatives and similar repositories for luigi
Users who are interested in luigi are comparing it to the libraries listed below.
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆42,858 · Updated this week
- Parallel computing with task scheduling ☆13,535 · Updated last week
- Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object. ☆27,933 · Updated 3 weeks ago
- Data-Centric Pipelines and Data Versioning ☆6,259 · Updated 8 months ago
- Build, Manage and Deploy AI/ML Systems ☆9,570 · Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ☆27,895 · Updated this week
- A next-generation curated knowledge sharing platform for data scientists and other technical professions. ☆5,528 · Updated last year
- Python Stream Processing ☆6,824 · Updated last year
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ☆20,620 · Updated this week
- Simple job queues for Python ☆10,389 · Updated last week
- An orchestration platform for the development, production, and observation of data assets. ☆14,228 · Updated this week
- Data Apps & Dashboards for Python. No JavaScript Required. ☆24,170 · Updated this week
- Distributed Task Queue (development branch) ☆27,372 · Updated this week
- Utils for streaming large files (S3, HDFS, gzip, bz2...) ☆3,377 · Updated last week
- 📚 Parameterize, execute, and analyze notebooks ☆6,284 · Updated 2 weeks ago
- Apache Superset is a Data Visualization and Data Exploration Platform ☆68,505 · Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,462 · Updated 2 months ago
- Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler. ☆6,898 · Updated last year
- A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C proje… ☆24,184 · Updated last month
- Always know what to expect from your data. ☆10,832 · Updated last week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆16,082 · Updated this week
- the portable Python dataframe library ☆6,172 · Updated this week
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s… ☆8,436 · Updated 3 weeks ago
- Python datetimes made easy ☆6,566 · Updated last month
- Python composable command line interface toolkit ☆16,914 · Updated last week
- 🦉 Data Versioning and ML Experiments ☆14,984 · Updated last week
- Declarative visualization library for Python ☆10,066 · Updated last week
- H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random F… ☆7,321 · Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆5,921 · Updated last week
- Machine Learning Toolkit for Kubernetes ☆15,246 · Updated 2 months ago