mozilla / python_mozetl
ETL jobs for Firefox Telemetry
☆28 · Updated 4 months ago
Alternatives and similar repositories for python_mozetl:
Users who are interested in python_mozetl are comparing it to the repositories listed below.
- Aggregator job for Telemetry. ☆8 · Updated last year
- Telemetry Analysis Service ☆36 · Updated 5 years ago
- A library for creating full representations of Mozilla telemetry pings. ☆11 · Updated last week
- Schemas for Mozilla's data ingestion pipeline and data lake outputs ☆47 · Updated this week
- Spark Streaming ETL jobs for Mozilla Telemetry ☆18 · Updated 5 years ago
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 · Updated 4 years ago
- Collection of dockerized ETL jobs managed by data engineering. ☆19 · Updated this week
- Spark bindings for Mozilla Telemetry ☆14 · Updated last year
- Apache Airflow CI pipeline ☆18 · Updated 5 years ago
- Repository for public analyses. ☆5 · Updated 3 years ago
- ☆24 · Updated 4 years ago
- transformpy is a Python 2/3 module for doing transforms on "streams" of data ☆29 · Updated 7 years ago
- Airflow workflow management platform chef cookbook. ☆69 · Updated 5 years ago
- Provides a Pythonic interface for reading and writing Avro schemas ☆27 · Updated 2 years ago
- Set of IPython and Jupyter extensions to improve user experience ☆50 · Updated 5 years ago
- A toolset to streamline running spark python on EMR ☆20 · Updated 8 years ago
- a declarative ETL framework that enforces data engineer best practices ☆39 · Updated 7 years ago
- Airflow configuration for Telemetry ☆185 · Updated this week
- An example PySpark project with pytest ☆17 · Updated 7 years ago
- Helpers & syntactic sugar for PySpark. ☆61 · Updated last year
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Infrastructure for making a pandas release ☆7 · Updated 2 years ago
- A collection of datasets and databases ☆24 · Updated 6 years ago
- ☆10 · Updated 6 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 · Updated 6 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 · Updated 5 years ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 · Updated 5 years ago
- event-triggered plugins for airflow ☆21 · Updated 5 years ago
- a diagnostic tool, in the form of a Python library, for PySpark developers to debug and troubleshoot PySpark applications locally ☆11 · Updated 3 months ago