whitfin / s3-concat
Concatenate Amazon S3 files remotely using flexible patterns
☆38Updated 4 years ago
Alternatives and similar repositories for s3-concat
Users that are interested in s3-concat are comparing it to the libraries listed below
- An HFile-backed Key-Value Server☆42Updated 6 years ago
- Gather metadata about your S3 buckets☆49Updated 4 years ago
- Run templatable playbooks of SQL scripts in series and parallel on Redshift, PostgreSQL, BigQuery and Snowflake☆81Updated 2 months ago
- UNRELEASED. An opinionated framework for analytics-on-write on event streams using key-value storage☆14Updated 9 years ago
- A way to choose things.☆10Updated 3 years ago
- Streaming left joins in Kafka for change data capture☆52Updated last year
- A Directed Acyclic Graph task dependency scheduler designed to simplify complex distributed pipelines☆131Updated 7 years ago
- Serverless query engine☆140Updated 2 years ago
- Parquet Command-line Tools☆19Updated 8 years ago
- Convert JSON files to Parquet using PyArrow☆97Updated last year
- Dynamically parse protobuf messages, then convert to Avro☆25Updated 10 years ago
- Use SQL to transform your avro schema/records☆28Updated 7 years ago
- JSONCDC is now maintained at,☆90Updated 7 years ago
- A system to programmatically run data pipelines☆221Updated 2 months ago
- Locality Sensitive Hashing using Golang and SQL database☆28Updated 9 years ago
- Timberlake is a Job Tracker for Hadoop.☆177Updated 5 years ago
- A Go package for sampling and reporting on random keys on a set of redis instances☆20Updated 10 years ago
- Compare eventual consistency of object stores☆173Updated last year
- A key/value store for serving static batch data☆175Updated 2 years ago
- Cantor provides utilities for estimating the cardinality of large sets.☆83Updated 3 years ago
- An analyzer for getting metrics about the contents of an Apache Kafka topic☆63Updated 4 years ago
- Provides a Pythonic interface for reading and writing Avro schemas☆27Updated 2 years ago
- Convert a CSV to a parquet file.☆64Updated 2 years ago
- ☆19Updated 7 years ago
- Remove bad records from a CSV file and normalize☆57Updated 3 years ago
- Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization s…☆53Updated last year
- ☆20Updated 3 years ago
- An approximate frequency counter Redis module☆46Updated 6 years ago
- HyperMinHash: Bringing intersections to HyperLogLog☆305Updated 7 years ago
- Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline☆75Updated 2 years ago