jwoschitz / avrocount
Count records in Avro files efficiently
☆18 Updated 2 years ago
Alternatives and similar repositories for avrocount
Users who are interested in avrocount are comparing it to the repositories listed below
- Command line (CLI) tool to inspect Apache Parquet files on the go ☆198 Updated 2 years ago
- Parquet Command-line Tools ☆19 Updated 9 years ago
- Spark SQL magic command for Jupyter notebooks ☆37 Updated 4 years ago
- Airflow declarative DAGs via YAML ☆133 Updated 2 years ago
- Task Metrics Explorer ☆14 Updated 6 years ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆232 Updated 3 weeks ago
- Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster. ☆300 Updated 6 months ago
- python implementation of the parquet columnar file format. ☆358 Updated 4 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆70 Updated 5 months ago
- Nested array transformation helper extensions for Apache Spark ☆37 Updated 2 years ago
- Template for Spark Projects ☆103 Updated last year
- ☕⛵WIP PySpark dependency management ☆22 Updated 7 years ago
- Dissecting data structures ☆342 Updated 2 months ago
- Drop-in replacement for Apache Spark UI ☆401 Updated this week
- Avro SerDe for Apache Spark structured APIs. ☆241 Updated 8 months ago
- easy install parquet-tools ☆184 Updated last year
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 Updated 2 years ago
- Airflow plugin to transfer arbitrary files between operators ☆78 Updated 7 years ago
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 Updated 4 years ago
- A command-line tool for launching Apache Spark clusters. ☆651 Updated last year
- aws_s3 postgres extension to import/export data from/to s3 (compatible with aws_s3 extension on AWS RDS) ☆175 Updated last year
- pytest plugin to run the tests with support of pyspark ☆88 Updated 8 months ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆454 Updated 3 weeks ago
- Spark functions to run popular phonetic and string matching algorithms ☆59 Updated 3 years ago
- A COBOL parser and Mainframe/EBCDIC data source for Apache Spark ☆160 Updated last week
- lakeview is a visibility tool for S3 based data lakes ☆29 Updated 6 months ago
- JSON schema parser for Apache Spark ☆82 Updated 3 years ago
- A giter8 template for Spark SBT projects ☆72 Updated 4 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆186 Updated 3 months ago
- Apache (Py)Spark type annotations (stub files). ☆118 Updated 3 years ago