A list of publicly available datasets with real-time data, maintained by the team at bytewax.io
☆2,484Apr 13, 2026Updated last month
Alternatives and similar repositories for awesome-public-real-time-datasets
Users who are interested in awesome-public-real-time-datasets are comparing it to the repositories listed below.
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.☆97Jan 21, 2024Updated 2 years ago
- A list of free datasets that provide streaming data☆441Apr 13, 2026Updated last month
- Scrapes Redfin data.☆98Aug 1, 2023Updated 2 years ago
- An ETL pipeline that extracts weather and air quality data from public APIs, transforms the data into a clean, analyzable format, and loa…☆45Sep 21, 2024Updated last year
- Python Stream Processing☆2,000May 20, 2026Updated last week
- An Awesome List of Open-Source Data Engineering Projects☆3,189Oct 4, 2024Updated last year
- This is a repo with links to everything you'd ever want to learn about data engineering☆41,426Apr 2, 2026Updated last month
- A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.☆255Dec 19, 2025Updated 5 months ago
- A topic-centric list of HQ open datasets.☆75,618May 23, 2026Updated last week
- Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Jo…☆41,401May 3, 2026Updated 3 weeks ago
- End-to-End ELT data pipeline with Postgres, Airbyte, dbt, Dagster, Snowflake and Metabase☆11Jul 13, 2023Updated 2 years ago
- Clone of ChatGPT built with Bytewax, Streamlit and NATS☆14Mar 2, 2023Updated 3 years ago
- Sample project to demonstrate data engineering best practices☆219Feb 24, 2024Updated 2 years ago
- Data Engineering Practice Problems☆2,699Jan 8, 2025Updated last year
- Mock streaming data generator☆18May 31, 2024Updated last year
- Repository for Data Engineering Interview Series☆38Oct 17, 2024Updated last year
- The best place to learn data engineering. Built and maintained by the data engineering community.☆1,942May 20, 2026Updated last week
- ☆394Jan 26, 2025Updated last year
- Demo Project for Open Source MDS☆169Aug 27, 2025Updated 9 months ago
- ☆116Jan 15, 2025Updated last year
- rust-for-data☆53Jul 12, 2023Updated 2 years ago
- This is a template you can use for your next data engineering portfolio project.☆190Sep 10, 2021Updated 4 years ago
- Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more☆434Nov 28, 2023Updated 2 years ago
- A curated list of data engineering tools for software developers☆8,669May 14, 2026Updated 2 weeks ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆78Sep 2, 2023Updated 2 years ago
- Practical Data Engineering: A Hands-On Real-Estate Project Guide☆801Mar 10, 2026Updated 2 months ago
- Python Streaming DataFrames for Kafka☆1,553May 21, 2026Updated last week
- An end-to-end batch scoring machine learning system that produces hourly predictions of the number of arrivals and departures that will t…☆26May 21, 2026Updated last week
- dbt docs but Windows 95☆16Jun 7, 2022Updated 3 years ago
- Analyze the Coinbase order book in real time in Python with Bytewax☆11Apr 23, 2024Updated 2 years ago
- A list of all awesome open-source contributions for the Apache Kafka project☆111Jul 10, 2023Updated 2 years ago
- ☆40Mar 13, 2026Updated 2 months ago
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB and Superset☆265Apr 5, 2026Updated last month
- Transaction processing & vis pipeline using PySpark Streaming☆29Jul 18, 2024Updated last year
- dbt-databend adapter plugin☆10May 30, 2024Updated 2 years ago
- Vectorized quantile backtesting library☆15May 25, 2023Updated 3 years ago
- A handpicked collection of resources for Python developers in data engineering, machine learning, and AI. Inside, you'll discover a neatl…☆138Apr 1, 2024Updated 2 years ago
- ☆19Jun 28, 2023Updated 2 years ago
- ☆17Jul 31, 2024Updated last year