keypointt / readingLinks
collection of read materials
☆18 Updated 5 years ago
Alternatives and similar repositories for readingLinks
Users who are interested in readingLinks are comparing it to the repositories listed below
- A library for Spark DataFrame using MinIO Select API ☆98 Updated 5 years ago
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆110 Updated 2 years ago
- A high-performance, reliable and extensible logging agent for uploading data to Kafka, Pulsar, etc. ☆182 Updated last week
- DynoYARN is a framework to run simulated YARN clusters and workloads for YARN scale testing. ☆60 Updated 2 years ago
- A tool for scale and performance testing of HDFS with a specific focus on the NameNode. ☆131 Updated last year
- Explore Apache Kafka data pipelines in Kubernetes. ☆46 Updated 3 weeks ago
- Mirus is a cross data-center data replication tool for Apache Kafka ☆205 Updated last month
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆61 Updated 7 months ago
- ☆34 Updated 4 years ago
- Cache File System optimized for columnar formats and object stores ☆183 Updated 2 years ago
- The SpliceSQL Engine ☆169 Updated 2 years ago
- Database Benchmark Tool ☆152 Updated last year
- Avro2TF is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks. ☆128 Updated 5 years ago
- Framework for running macro benchmarks in a clustered environment ☆25 Updated 2 years ago
- A schema store service that tracks and manages all the schemas used in the Data Pipeline ☆87 Updated 4 years ago
- Website for DataSketches. ☆102 Updated last month
- Java event logs collector for Hadoop and frameworks ☆40 Updated 3 months ago
- A tool that helps automate deployment to an Apache Flink cluster ☆149 Updated 5 years ago
- Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization s… ☆54 Updated last year
- Netflix Data Store Benchmark ☆362 Updated last year
- Lightweight proxy to expose the UI of an Apache Spark cluster that is behind a firewall ☆98 Updated 5 years ago
- ☆37 Updated 6 years ago
- Splittable Gzip codec for Hadoop ☆71 Updated 3 weeks ago
- Ansible playbooks for Apache Spark on kube ☆27 Updated 8 years ago
- A curated list of awesome Apache Spark packages and resources. ☆40 Updated 8 years ago
- Extensions, custom & experimental panels ☆53 Updated 9 years ago
- Starburst Enterprise Distribution of Presto ☆45 Updated 3 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆116 Updated 3 years ago
- an anagram ☆136 Updated 3 years ago
- Generic Model Serving Implementation leveraging Flink ☆19 Updated 6 years ago