t-ivanov / BigDataReadingLinks
List of papers, reports and links of materials on Big Data and related topics.
☆38 Updated 8 years ago
Alternatives and similar repositories for BigDataReading
Users who are interested in BigDataReading are comparing it to the repositories listed below.
- Code snippets from the Streaming Systems book (streamingbook.net). ☆254 Updated 3 years ago
- A curated list of awesome Apache Spark packages and resources. ☆40 Updated 8 years ago
- Apache Spark examples exclusively in Java ☆102 Updated 2 years ago
- Examples To Help You Learn Apache Spark ☆77 Updated 6 years ago
- Apache Flink™ training material website ☆78 Updated 5 years ago
- List of some interesting projects ☆32 Updated 5 years ago
- An extension of Yahoo's Benchmarks ☆108 Updated last year
- Real-world Spark pipelines examples ☆83 Updated 7 years ago
- Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide) ☆229 Updated 2 years ago
- Code samples for the book ☆39 Updated 11 years ago
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite ☆59 Updated 5 years ago
- ☆41 Updated 8 years ago
- An example Apache Beam project. ☆111 Updated 8 years ago
- ☆311 Updated 6 years ago
- an anagram ☆136 Updated 4 years ago
- Basic getting started with Kafka examples ☆47 Updated 6 years ago
- A tutorial on how to get started with Presto. ☆56 Updated 3 years ago
- TPC-DS benchmarks including data generation with Spark and queries with Spark ☆14 Updated 8 years ago
- ☆31 Updated 5 years ago
- Spark Terasort ☆121 Updated 2 years ago
- Quark is a data virtualization engine over analytic databases. ☆100 Updated 8 years ago
- All the things about TPC-DS in Apache Spark ☆107 Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆90 Updated last year
- ☆30 Updated 2 months ago
- Magic to help Spark pipelines upgrade ☆34 Updated 11 months ago
- A tool to get better debug info on spark's memory usage ☆42 Updated 6 years ago
- The source code for this book: Grokking Streaming Systems: Real-time Event Processing (https://www.manning.com/books/grokking-streaming-s… ☆110 Updated last month
- Tools for building, packaging, and OAP public cloud integrations such as AWS EMR, Google Dataproc and K8S. ☆17 Updated last year
- Flowchart for debugging Spark applications ☆107 Updated 11 months ago
- A collection of examples to help show different ways to manage state in Apache Flink ☆27 Updated 6 years ago