This project implements an ELT (Extract, Load, Transform) data pipeline on the Goodreads dataset, using Dagster (orchestration), Spark (computation), and dbt (transformation).
☆43 · Apr 22, 2023 · Updated 2 years ago
Alternatives and similar repositories for goodreads-elt-pipeline
Users who are interested in goodreads-elt-pipeline are comparing it to the libraries listed below.
- ☆44 · Mar 9, 2025 · Updated last year
- A simple playground for dbt with the sqlite connector ☆12 · May 22, 2022 · Updated 3 years ago
- Add accent for Vietnamese. N-Grams + Beam search, LSTM, Transformer, Evolved Transformer ☆18 · Feb 3, 2021 · Updated 5 years ago
- A dbt adapter for Apache Impala & Cloudera Data Platform ☆24 · Mar 30, 2026 · Updated 3 weeks ago
- Open Data Stack Platform: a collection of projects and pipelines built with open data stack tools for scalable, observable data platform… ☆22 · Mar 29, 2026 · Updated 3 weeks ago
- ScaleDP is an Open-Source extension of Apache Spark for Document Processing ☆18 · Dec 2, 2025 · Updated 4 months ago
- ☆22 · Feb 5, 2024 · Updated 2 years ago
- Repo for learning DBT with Snowflake, featuring projects and models for data transformation and automation ☆26 · Mar 31, 2025 · Updated last year
- API/Data Platform for Ingesting, Storing, and Serving Data through Postgres, and Litestar ☆11 · Jan 18, 2026 · Updated 3 months ago
- ☆49 · Aug 14, 2024 · Updated last year
- Docktor is a Web App that deploys an easy-to-use kit of analysis and scanning tools. ☆13 · Nov 1, 2023 · Updated 2 years ago
- An example of a Dagster project with a possible folder structure to organize the assets, jobs, repositories, schedules, and ops. Also has… ☆101 · Nov 3, 2024 · Updated last year
- Template Dagster repo using poetry and a single Docker container; works well with CICD ☆68 · Apr 1, 2022 · Updated 4 years ago
- Telegram bot using GPT4 API ☆15 · Aug 26, 2024 · Updated last year
- We provide benchmark datasets for evaluating Vietnamese processing models: UIT-ViQuAD, ViNewsQA, UIT-VSFC, UIT-ViIC, UIT-ViNames, UIT-VSM… ☆21 · Jun 19, 2021 · Updated 4 years ago
- End to end data pipeline to extract and analyze submissions from any subreddit using Pushshift, python, dbt and BigQuery. ☆12 · Jul 17, 2023 · Updated 2 years ago
- Demonstrating the capabilities of DuckDB as a transformation engine for data lakes ☆34 · Oct 8, 2024 · Updated last year
- Fivetran's Jira source dbt package ☆14 · Oct 1, 2025 · Updated 6 months ago
- Feature Flags in dbt models ☆35 · Apr 9, 2026 · Updated last week
- ☆16 · Mar 9, 2026 · Updated last month
- ☆18 · Oct 10, 2024 · Updated last year
- A fully serverless, event-driven data pipeline that ingests, enriches, validates, and visualizes real-time news data using AWS services. … ☆25 · Aug 10, 2025 · Updated 8 months ago
- An example dbt project using AutomateDV to create a Data Vault 2.0 Data Warehouse based on the Snowflake TPC-H dataset. ☆58 · Mar 22, 2024 · Updated 2 years ago
- Sample http server and repo set up ☆39 · Apr 12, 2026 · Updated last week
- editable table with angular and ng-zorro (ant design) ☆16 · Dec 8, 2022 · Updated 3 years ago
- ZMK firmware for Urchin and Corne 36 keyboard with nice!nano and nice!view ☆17 · Jan 16, 2026 · Updated 3 months ago
- ☆17 · Oct 15, 2021 · Updated 4 years ago
- "人生苦短, 使用 Python", Presentation materials PyCon KR 2018 ☆21 · Oct 31, 2019 · Updated 6 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆24 · Jul 14, 2022 · Updated 3 years ago
- Do you have a new MacBook, and are you a web developer who prefers working with languages like Python and TypeScript? This guide will ass… ☆19 · Aug 20, 2025 · Updated 7 months ago
- ☆21 · Mar 26, 2023 · Updated 3 years ago
- Autoencoder for multi-label classification using Google's Tensorflow framework and MDMR for feature selection. ☆10 · Aug 31, 2017 · Updated 8 years ago
- Nutshell is an enhanced Unix shell that provides a simplified command language, package management, and AI-powered assistance. ☆24 · Mar 20, 2025 · Updated last year
- ☆29 · Dec 13, 2021 · Updated 4 years ago
- A turnkey MLOps pipeline demonstrating how to go from raw events to real-time predictions at scale. ☆243 · Oct 21, 2025 · Updated 5 months ago
- This application guides you through the development of a language model that classifies clinical documents according to their medical spe… ☆12 · Aug 12, 2024 · Updated last year
- Example end to end data engineering project. ☆1,404 · Dec 8, 2022 · Updated 3 years ago
- source{d} MLonCode foundation - core algorithms and models. ☆13 · Oct 17, 2019 · Updated 6 years ago
- Code for EACL 2023 paper "LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control" ☆20 · Feb 7, 2023 · Updated 3 years ago