RWaltersMA / mongo-source-sink
This is an example of using MongoDB as both a source and sink.
☆10 · Updated 5 years ago
Alternatives and similar repositories for mongo-source-sink
Users who are interested in mongo-source-sink are comparing it to the libraries listed below.
- This project provides an end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The spark cluste… ☆11 · Updated last year
- ☆27 · Updated 2 years ago
- ☆14 · Updated 2 years ago
- Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog ☆12 · Updated last year
- ☆17 · Updated last year
- A complete end-to-end MLOps pipeline for Marvel character data. ☆36 · Updated this week
- ☆23 · Updated 3 years ago
- This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) env… ☆784 · Updated 6 months ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 · Updated 2 years ago
- Confluent Kafka questions to practice for CCDAK ☆160 · Updated this week
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA… ☆38 · Updated last year
- This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon … ☆18 · Updated 3 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆41 · Updated last year
- Learn the first step in Retrieval-Augmented Generation (RAG), how to vector encode incoming data to insert and continuously update your v… ☆31 · Updated 6 months ago
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka… ☆268 · Updated 6 months ago
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD… ☆14 · Updated last year
- Analyzing Spotify Data with PySpark and ETL Procedures ☆23 · Updated 10 months ago
- ☆68 · Updated this week
- Notebooks to learn Databricks Lakehouse Platform ☆33 · Updated last month
- ☆21 · Updated last year
- Repo which holds the materials for the EMR Zero To Hero ☆27 · Updated 3 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆181 · Updated 3 years ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow ☆14 · Updated last year
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en… ☆21 · Updated last year
- Docker environment that spins up MongoDB replica set, Spark, and Jupyter Lab. Example code uses PySpark and the MongoDB Spark Connector. ☆40 · Updated 2 years ago
- ☆34 · Updated last year
- ☆94 · Updated 11 months ago
- ☆13 · Updated 5 months ago
- This project demonstrates how to use Apache Airflow to submit jobs to Apache Spark cluster in different programming languages using Python… ☆44 · Updated last year
- Apartments Data Pipeline using Airflow and Spark. ☆21 · Updated 3 years ago