A list of free datasets that provide streaming data
☆436 · May 16, 2024 · Updated last year
Alternatives and similar repositories for awesome-public-streaming-datasets
Users who are interested in awesome-public-streaming-datasets are comparing it to the repositories listed below
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆94 · Jan 21, 2024 · Updated 2 years ago
- A list of publicly available datasets with real-time data maintained by the team at bytewax.io ☆2,333 · Dec 21, 2025 · Updated 2 months ago
- Kubernetes deployment of PrestoDB, Hive Metastore, and Minio S3-standard object store ☆17 · Oct 20, 2022 · Updated 3 years ago
- Blazing fast and flexible JSON database. ☆24 · Jan 2, 2017 · Updated 9 years ago
- ☆17 · Sep 13, 2021 · Updated 4 years ago
- Gonudb is an append-only key/value datastore written in Go. ☆20 · Dec 11, 2023 · Updated 2 years ago
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆535 · Jan 27, 2026 · Updated last month
- Markdown auto-formatting, beautification, and cleanup for Atom ☆45 · Mar 4, 2023 · Updated 3 years ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆45 · Dec 11, 2023 · Updated 2 years ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you… ☆11 · Nov 18, 2025 · Updated 3 months ago
- Trying out the Dataframe Polars library with Delta Lake ... feat Python. ☆12 · Jan 29, 2025 · Updated last year
- ☆23 · Apr 2, 2017 · Updated 8 years ago
- Writing a sqlite clone from scratch in Rust (and Python3 for testing). Thanks @cstack ! ☆25 · Mar 17, 2019 · Updated 6 years ago
- A small library for checking whether Go mutexes are locked ☆27 · May 14, 2025 · Updated 9 months ago
- Tensorflow implementation of Neural Arithmetic Logic Unit, Trask et al. ☆29 · Aug 4, 2018 · Updated 7 years ago
- HDFS Automatic Snapshot Service for Linux ☆11 · Oct 17, 2016 · Updated 9 years ago
- Java 8's core concepts are explained by examples. ☆12 · Oct 12, 2018 · Updated 7 years ago
- Prescriptive Applications over Kite and Hadoop ☆12 · Oct 14, 2015 · Updated 10 years ago
- A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP! ☆12 · Jul 6, 2023 · Updated 2 years ago
- A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more! ☆857 · Apr 16, 2022 · Updated 3 years ago
- A book, Let's build a DBMS: StellarSQL -- a minimal SQL DBMS written in Rust ☆27 · Nov 8, 2018 · Updated 7 years ago
- ☆15 · Oct 20, 2024 · Updated last year
- ☆14 · May 5, 2023 · Updated 2 years ago
- Full stack cloud applications that combine infrastructure as code and front end codebases for cohesive end to end applications and exampl… ☆15 · Aug 17, 2020 · Updated 5 years ago
- This is a sample project that demonstrates a data ETL (Extract, Transform, Load) process using Python, Polars and AWS Loca… ☆15 · Sep 25, 2023 · Updated 2 years ago
- resources for career development in data science ☆16 · Jun 24, 2020 · Updated 5 years ago
- Simplify Big Data Analytics with Amazon EMR, published by Packt ☆13 · Jan 18, 2023 · Updated 3 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de… ☆11 · Nov 18, 2023 · Updated 2 years ago
- A crowd sourced curriculum of mandatory material for new front-end devs. ☆48 · Feb 8, 2016 · Updated 10 years ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆38 · Updated this week
- Creating Lambda Functions to Ingest Data from APIs with AWS CDK ☆13 · Dec 1, 2021 · Updated 4 years ago
- A book about Maven in the style of the Pragmatic Guides published by The Pragmatic Bookshelf ☆11 · Dec 12, 2015 · Updated 10 years ago
- Hadoop YARN & MapReduce Memory Calculator ☆13 · Nov 9, 2015 · Updated 10 years ago
- Lifecycle helpers for loading and unmounting css ☆15 · Jun 19, 2025 · Updated 8 months ago
- Source code for the post, 'Getting Started with Data Analysis on AWS, using S3, Glue, Amazon Athena, and QuickSight' ☆29 · Dec 22, 2020 · Updated 5 years ago
- Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Jo… ☆38,886 · Updated this week
- An Awesome List of Open-Source Data Engineering Projects ☆3,037 · Oct 4, 2024 · Updated last year
- Go library for decoding generic map values and native Go structures into Arrow. ☆17 · Jan 30, 2026 · Updated last month
- A walkthrough of setting up a Kinesis Data Analytics for Java Application which ingest streaming JSON data and leverages the Flink Table … ☆16 · Aug 30, 2023 · Updated 2 years ago