t-ivanov / BigDataReading
List of papers, reports and links of materials on Big Data and related topics.
☆37 · Updated 7 years ago
Related projects
Alternatives and complementary repositories for BigDataReading
- List of some interesting projects ☆33 · Updated 4 years ago
- Data sets and Vagrant script to provision a virtual machine for Apache Calcite development ☆28 · Updated last year
- Spark Shuffle Optimization with RDMA+AEP ☆30 · Updated last year
- A description of the processes and techniques required to migrate a relational schema to a Cassandra database using Spark and SparkSQL ☆11 · Updated 6 years ago
- Running TPC-H on Apache Hive ☆41 · Updated 5 years ago
- Demonstration of a Hive Input Format for Iceberg ☆26 · Updated 3 years ago
- A curated list of awesome Apache Spark packages and resources. ☆40 · Updated 7 years ago
- Apache Quickstep Incubator - This project is retired ☆94 · Updated 5 years ago
- Code and setup information for Introduction to Machine Learning with Spark ☆12 · Updated 9 years ago
- A library for Spark DataFrame using MinIO Select API ☆96 · Updated 5 years ago
- All the things about TPC-DS in Apache Spark ☆104 · Updated last year
- Examples of Spark 3.0 ☆47 · Updated 4 years ago
- Distributed systems lecture notes ☆57 · Updated 3 weeks ago
- Real-world Spark pipelines examples ☆83 · Updated 6 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 · Updated 6 months ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 · Updated 3 years ago
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite ☆54 · Updated 4 years ago
- Parquet file generator ☆22 · Updated 6 years ago
- Dione - a Spark and HDFS indexing library ☆50 · Updated 8 months ago
- Port of TPC-DS dsdgen to Java ☆47 · Updated 3 months ago
- Scalable CDC Pattern Implemented using PySpark ☆18 · Updated 5 years ago
- Apache Flink™ training material website ☆79 · Updated 4 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Updated 4 years ago
- Splittable Gzip codec for Hadoop ☆69 · Updated this week
- Foodmart data set in hsqldb format ☆24 · Updated 5 months ago
- Port of TPC-H dbgen to Java ☆46 · Updated last month
- This tutorial provides a quick introduction to using Spark ☆57 · Updated 8 years ago
- Thoughts on things I find interesting. ☆17 · Updated 3 years ago