jwoschitz / avrocount
Count records in Avro files efficiently
☆17 Updated 2 years ago
Alternatives and similar repositories for avrocount:
Users who are interested in avrocount are comparing it to the libraries listed below.
- Parquet Command-line Tools ☆18 Updated 8 years ago
- Ensime integration with Sublime Text 2 for Scala development ☆141 Updated 9 years ago
- Tools for working with parquet, impala, and hive ☆134 Updated 4 years ago
- Scala and Spark library focused on reading OpenStreetMap Pbf files. ☆85 Updated last month
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 Updated 4 years ago
- Use SQL to transform your avro schema/records ☆28 Updated 7 years ago
- Offline Hadoop Elasticsearch Index Building and Tools For Lambda Architectures ☆31 Updated last year
- Run templatable playbooks of Hadoop/Spark/et al jobs on Amazon EMR ☆19 Updated last year
- Splittable Gzip codec for Hadoop ☆70 Updated last week
- ☆17 Updated 8 years ago
- Space-Filling Curves in Scala ☆26 Updated 4 years ago
- Scheduled task execution on top of AWS Data Pipeline ☆43 Updated 10 years ago
- Examples on how to use the command line tools in Avro Tools to read and write Avro files ☆155 Updated 11 months ago
- something to help you spark ☆65 Updated 6 years ago
- Visualizer for Avro Schemas (.avsc) - Try it yourself at: ☆33 Updated 2 years ago
- Autoscaling EMR clusters and Kinesis streams on Amazon Web Services (AWS) ☆47 Updated last year
- cli app to create JSON objects ☆26 Updated 3 years ago
- Apache Flink cluster deployment in Docker containers using Docker-Compose ☆18 Updated 10 years ago
- A Giter8 template for scio ☆31 Updated 2 months ago
- GCS support for avro-tools, parquet-tools and protobuf ☆75 Updated 2 months ago
- Argument parsing in Scala ☆83 Updated 2 years ago
- Originally for monthly table partitions, more info at [imperialwicket.com](http://imperialwicket.com/postgresql-automating-monthly-table-… ☆43 Updated 9 years ago
- Tapalcatl is a "metatile server", which attempts to serve individual tiles extracted from an archive in storage. ☆14 Updated last year
- Scala client for MaxMind Geo-IP ☆86 Updated last year
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support… ☆109 Updated 7 years ago
- It's like the AWS SDK for Java, but more Scala-y ☆73 Updated 7 years ago
- A scala wrapper for the opencensus-java library ☆52 Updated 4 years ago
- Java/Scala library for easily authoring Flyte tasks and workflows ☆44 Updated 2 months ago
- ☕⛵WIP PySpark dependency management ☆22 Updated 6 years ago
- Apache Spark AWS Lambda Executor (SAMBA) ☆44 Updated 6 years ago