morsapaes / pyflink-nlp
Self-contained demo using PyFlink with Gensim+spaCy to find topics in the Flink User Mailing List. All you need is Docker! 🐳
☆21 Updated 3 years ago
Alternatives and similar repositories for pyflink-nlp
Users that are interested in pyflink-nlp are comparing it to the libraries listed below
- Airflow training for the crunch conf ☆105 Updated 6 years ago
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow' ☆121 Updated 2 years ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆75 Updated 3 years ago
- Delta Lake examples ☆226 Updated 9 months ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆77 Updated 2 years ago
- Generate and Visualize Data Lineage from query history ☆326 Updated last year
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager) ☆120 Updated 3 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 Updated last year
- Repo for all my code on the articles I post on medium ☆107 Updated 2 years ago
- Grafana dashboards and StatsD exporter config for Airflow monitoring ☆282 Updated last year
- A workshop with several modules to help learn Feast, an open-source feature store ☆92 Updated last month
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines ☆133 Updated 2 years ago
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a… ☆37 Updated last year
- ☆42 Updated 3 years ago
- PySpark data-pipeline testing and CICD ☆28 Updated 4 years ago
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆242 Updated this week
- Great Expectations Airflow operator ☆167 Updated this week
- ☆16 Updated last year
- A Helm chart to install Apache Airflow on Kubernetes ☆284 Updated this week
- Creates simple data models on Snowflake to report dbt source freshness and tests ☆26 Updated 2 years ago
- ☆91 Updated 6 months ago
- Apache Flink (Pyflink) and Related Projects ☆40 Updated 3 months ago
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆174 Updated last month
- Tool to automate data quality checks on data pipelines ☆254 Updated 2 years ago
- ☆266 Updated 8 months ago
- Airflow Unit Tests and Integration Tests ☆260 Updated 2 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 Updated 4 years ago
- Self-contained demo using PyFlink with Gensim+spaCy to find topics in the Flink User Mailing List. All you need is Docker! 🐳 ☆10 Updated 4 years ago
- (project & tutorial) dag pipeline tests + ci/cd setup ☆88 Updated 4 years ago