Netflix / pygenie
☆72 Updated last year
Related projects:
- A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (in… ☆252 Updated 3 months ago
- ☆54 Updated 5 years ago
- Fork of aio-libs/aiokafka ☆26 Updated 9 months ago
- Deploy dask on YARN clusters ☆69 Updated last month
- Airflow workflow management platform chef cookbook. ☆67 Updated 5 years ago
- transformpy is a Python 2/3 module for doing transforms on "streams" of data ☆29 Updated 7 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 Updated last year
- Vertica dialect for SQLAlchemy using the vertica-python client ☆18 Updated 4 years ago
- A pandas.DataFrame-based ORM. ☆84 Updated 2 years ago
- IP Address dtype and block for pandas ☆105 Updated last year
- Set of iPython and Jupyter extensions to improve user experience ☆50 Updated 4 years ago
- SQLAlchemy dialect for Turbodbc ☆23 Updated 3 months ago
- Code Repository for the EVO-ODAS ☆31 Updated 6 years ago
- Python stream processing for humans ☆184 Updated 4 months ago
- Airflow plugin to transfer arbitrary files between operators ☆78 Updated 5 years ago
- Asynchronous actions for PySpark ☆44 Updated 2 years ago
- Utilities for creating ETL pipelines with mara ☆36 Updated 2 years ago
- A wrapper for libhdfs3 to interact with HDFS from Python ☆136 Updated 3 years ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆104 Updated last week
- Data analysis and reporting tool for quick access to custom charts and tables in Jupyter Notebooks and in the shell. ☆117 Updated 8 months ago
- Snowplow event tracker for Python. Add analytics to your Python and Django apps, webapps and games ☆42 Updated 2 weeks ago
- Read better test failures. ☆116 Updated 2 months ago
- a declarative ETL framework that enforces data engineer best practices ☆39 Updated 7 years ago
- OlaPy, an experimental OLAP engine based on Pandas ☆106 Updated last year
- REST-like API exposing Airflow data and operations ☆61 Updated 5 years ago
- SQL on dataframes - pandas and dask ☆64 Updated 6 years ago
- Convert JSON files to Parquet using PyArrow ☆94 Updated 8 months ago
- A xlsx and html rendering library for rendering data available in Pandas DataFrames. ☆26 Updated 4 months ago
- Dockerized setup for testing code on realistic hadoop clusters ☆27 Updated 4 years ago
- Apache (Py)Spark type annotations (stub files). ☆115 Updated 2 years ago