Parquet file generator
☆22Apr 17, 2018Updated 8 years ago
Alternatives and similar repositories for parquet-generator
Users who are interested in parquet-generator are comparing it to the libraries listed below.
- Albis: High-Performance File Format for Big Data Systems☆21Jul 12, 2018Updated 7 years ago
- Code examples for my blog posts☆22Nov 7, 2018Updated 7 years ago
- High-concurrency TCP server based on multithreading and epoll☆11Aug 4, 2018Updated 7 years ago
- A Python library and command line utility for manipulating and plotting stellar lightcurves.☆10Jun 14, 2016Updated 9 years ago
- Spark* Shuffle plugin for support shuffling through remote persistent memory over fabrics, which leverages the RDMA network and remote pe…☆14Sep 18, 2023Updated 2 years ago
- benchmark-for-spark☆18May 7, 2025Updated 11 months ago
- Text Preprocessing in Python☆19Jan 15, 2017Updated 9 years ago
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream …☆22Feb 6, 2017Updated 9 years ago
- A research group at UCSD CSE focused on Advanced Data Analytics: data management and systems for ML/AI and data science.☆11Feb 27, 2026Updated last month
- Bridging Immutable and Mutable Abstractions for Distributed Data Analytics☆12May 15, 2019Updated 6 years ago
- ☆12Jul 18, 2025Updated 9 months ago
- Presto connector for Apache Kudu☆48Mar 22, 2019Updated 7 years ago
- native Rust implementation of Kafka protocol and api☆14Jun 13, 2023Updated 2 years ago
- Demo code for implementing and showcasing a Fraud Detection Engine with Apache Flink.☆33Oct 20, 2022Updated 3 years ago
- Open-Channel SSD emulator using memory☆22Nov 1, 2017Updated 8 years ago
- A GameBoy Emulator written in Rust, written as a learning project for both☆10Jun 6, 2023Updated 2 years ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange☆131Dec 19, 2024Updated last year
- A set of tools for understanding F2FS usage of ZNS devices, which allow for identifying the on-device locations of files and inodes, mapp…☆20Jan 19, 2025Updated last year
- Large scale query engine benchmark☆99Apr 5, 2016Updated 10 years ago
- Scripts used to setup a Spark cluster on EC2☆21Mar 24, 2016Updated 10 years ago
- A versioned database inspired by Git☆16Dec 16, 2017Updated 8 years ago
- Framework for running macro benchmarks in a clustered environment☆25Aug 29, 2022Updated 3 years ago
- A research and review of techniques to provide a natural language interface to RDMS.☆10Dec 8, 2017Updated 8 years ago
- Python Repository of the Institute of Astronomy @ KU Leuven☆20Nov 5, 2020Updated 5 years ago
- A GUI application for testing GRPC services☆18Nov 20, 2023Updated 2 years ago
- The IBM Hyper Protect iOS SDK for CareKit is an addon for the CareKit framework that consumes IBM Hyper Protect Services for zero-trust p…☆13Sep 2, 2020Updated 5 years ago
- Jpak compression format☆15Mar 12, 2017Updated 9 years ago
- Example of building and running an eBPF program in Rust☆33Sep 27, 2018Updated 7 years ago
- ZNS Append-only based LSM key-value store☆21Sep 22, 2023Updated 2 years ago
- Code repository for Performance Characterization of NVMe Flash Devices with Zoned Namespaces (ZNS) (IEEE Cluster'23)☆22Mar 18, 2024Updated 2 years ago
- Fast, reliable, and scalable channels implementation based on Redis streams.☆11Jun 25, 2024Updated last year
- Presentation materials from the PLCT Lab 2020 Open Day event☆13Dec 29, 2020Updated 5 years ago
- A simple golang job queue☆13Jan 19, 2023Updated 3 years ago
- Apache Hadoop HDFS Data Node Scheduler☆13Jun 4, 2016Updated 9 years ago
- Run TPC-DS against different databases including Hive, Spark SQL and IBM BigSQL☆14Jan 4, 2022Updated 4 years ago
- Exposes Redis stream through the command line☆12Jun 28, 2022Updated 3 years ago
- hacking AlienFX... (under heavy development)☆14Apr 10, 2017Updated 9 years ago
- Linux kernel SGX driver for Graphene☆12Nov 3, 2020Updated 5 years ago
- An exploration of Flink and change-data-capture via flink-cdc-connectors☆11Jul 7, 2021Updated 4 years ago