Eugene-Mark/bigdata-file-viewer

A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.

☆307

Alternatives and similar repositories for bigdata-file-viewer

Users that are interested in bigdata-file-viewer are comparing it to the libraries listed below.

oap-project / sql-ds-cache
Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer.
☆37 · Jan 3, 2023 · Updated 3 years ago

haojinIntel / streaming_benchmark
☆11 · Jun 30, 2026 · Updated 3 weeks ago

oap-project / oap-tools
Tools for building, packaging, and OAP public cloud integrations such as AWS EMR, Google Dataproc and K8S.
☆18 · Mar 27, 2024 · Updated 2 years ago

maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago

pishen / store4s
A Scala library for Firestore in Datastore mode
☆13 · Jun 11, 2024 · Updated 2 years ago

trivago / hive-lambda-sting
A small library of Hive UDFs using macros to process and manipulate complex types
☆15 · Oct 2, 2025 · Updated 9 months ago

yaooqinn / itachi
A library that brings useful functions from various modern database management systems to Apache Spark
☆63 · Sep 4, 2023 · Updated 2 years ago

mostr / scalaval
Dead simple validation micro library (or micro framework) for Scala.
☆17 · Nov 5, 2014 · Updated 11 years ago

brooklyn-data / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆10 · Feb 10, 2023 · Updated 3 years ago

oap-project / oap-mllib
Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.
☆22 · Jul 6, 2026 · Updated 2 weeks ago

linkedin / coral
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
☆907 · Updated this week

apache / parquet-java
Apache Parquet Java
☆3,069 · Updated this week

mkim48 / TensorDB
TensorDB: In-Database Tensor Manipulation with Tensor-Relational Query Plans
☆21 · Jul 25, 2014 · Updated 11 years ago

ebonnal / delta-lake-ui
[student project] UI to run SQL on Delta Lake tables and visualize how the result varies across table versions
☆12 · Apr 21, 2020 · Updated 6 years ago

datafusion-contrib / datafusion-substrait
Experimental support for serializing DataFusion plans using substrait
☆46 · Jan 13, 2023 · Updated 3 years ago

santikka / cfid
cfid: R package for identifying counterfactuals.
☆11 · Dec 11, 2025 · Updated 7 months ago

alibaba / SparkCube
SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.
☆136 · Mar 6, 2023 · Updated 3 years ago

delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,919 · Updated this week

apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week

apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,193 · Updated this week

cloudera / phoenix
phoenix
☆12 · Oct 4, 2022 · Updated 3 years ago

projectnessie / iceberg-catalog-migrator
CLI tool to bulk migrate tables from one catalog to another without a data copy
☆85 · Apr 12, 2025 · Updated last year

lensesio / datagen
A small project to allow publishing data to Apache Kafka, Apache Pulsar or any other target system
☆16 · Sep 21, 2020 · Updated 5 years ago

minio / nifi-minio
A custom ContentRepository implementation for NiFi to persist data to MinIO Object Storage
☆35 · Jul 15, 2022 · Updated 4 years ago

banzaicloud / spark-metrics
Spark metrics related custom classes and sinks (e.g. Prometheus)
☆186 · Aug 2, 2022 · Updated 3 years ago

tosslab / aws_instance_scheduler
A solution that can help reduce AWS operational costs for both development and production environments.
☆11 · Oct 1, 2017 · Updated 8 years ago

leesf / hudi-demos
A collection of Apache Hudi demos to help beginners get started with Apache Hudi quickly
☆74 · Sep 13, 2020 · Updated 5 years ago

theophilec / google-sheets-to-sqlite
Google Sheets to SQLite CLI tool.
☆13 · Aug 15, 2023 · Updated 2 years ago

aperepel / nifi-workshop
A complete custom processor project, for your reference.
☆17 · Sep 29, 2015 · Updated 10 years ago

SharpRay / spark-druid-connector
A library for querying Druid data sources with Apache Spark
☆23 · Oct 28, 2020 · Updated 5 years ago

cldellow / csv2parquet
Convert a CSV to a parquet file.
☆65 · Dec 8, 2022 · Updated 3 years ago

quantiply / grafana-druid-wikipedia
Example using Grafana with Druid
☆11 · Mar 27, 2015 · Updated 11 years ago

apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,149 · Updated this week

rymurr / flight-spark-source
☆109 · Jul 5, 2023 · Updated 3 years ago

arisath / Benchmarking-Sorting-Algorithms
Benchmarking the performance of different sorting algorithms implemented in Java
☆17 · Apr 20, 2020 · Updated 6 years ago

apache / auron
The Auron accelerator for distributed computing framework (e.g., Spark) leverages native vectorized execution to accelerate query process…
☆1,778 · Updated this week

exelban / AOSDownloader
Apple OpenSource download tool
☆13 · Apr 17, 2020 · Updated 6 years ago

PRQL / prql-query
Query and transform data with PRQL
☆136 · Sep 23, 2023 · Updated 2 years ago

apache / flink-connector-elasticsearch
Apache Flink connector for ElasticSearch
☆95 · Jul 10, 2026 · Updated last week